* [RFC PATCH v2 00/12] rv: Add monitors to validate task switch
@ 2025-05-14 8:43 Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 01/12] tools/rv: Do not skip idle in trace Gabriele Monaco
` (12 more replies)
0 siblings, 13 replies; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Nam Cao
Cc: Gabriele Monaco, Tomas Glozar, Juri Lelli
I'm keeping this as RFC as I'm planning to make it ready after merging
the LTL series [1]
This series adds three monitors to the sched collection, extends and
replaces previously existing monitors:
tss => sts:
Not only prove that switches occur in scheduling context but also that
each call to the scheduler implies a switch (also a vain one) and
disables interrupts doing so.
sco:
The sched_set_state tracepoint can now be called from scheduling context
in a particular condition, that is when there is a pending signal. Adapt
the monitor to cover this scenario.
snroc:
Include sched_switch_vain as transition only possible in own context,
this model is also required to keep the following two models simple.
nrp (NEW):
* preemption requires need resched which is cleared by any switch
(includes a non optimal workaround for /nested/ preemptions)
sssw (NEW):
* suspension requires setting the task to sleepable and, after the
switch occurs, the task requires a wakeup to come back to runnable
opid (NEW):
* waking and need-resched operations occur with interrupts and
preemption disabled or in IRQ without explicitly disabling preemption
We also include some minor cleanup patches (1-5) tracepoints (6) and
preparatory fixes (10) covering some corner cases:
The series is currently based on linux-next as it requires "sched: Fix
trace_sched_switch(.prev_state)" [2].
Patch 1 fixes the behaviour of the rv tool with -s and idle tasks.
Patch 2 allows the rv tool to gracefully terminate with SIGTERM
Patch 3 adds da_handle_start_run_event_ also to per-task monitors
Patch 4 removes a trailing whitespace from the rv tracepoint string
Patch 5 returns the registration error in all DA monitor instead of 0
Patch 6 adds the need_resched and switch_vain tracepoints and adds a
parameter to the sched_set_state tracepoint
Patch 7 adapts the sco monitor to the new version of sched_set_state
Patch 8 extends the snroc model and adapts it to the new sched_set_state
Patch 9 adds the sts monitor to replace tss
Patch 10 detects race conditions when rv monitors run concurrently and
retries applying the events
Patch 11 adds the nrp and sssw monitors
Patch 12 adds the opid monitor
NOTES
The nrp and sssw monitors include workarounds for racy conditions:
* A sleeping task requires to set the state to sleepable, but in case of
a task sleeping on an rtlock, the set sleepable and wakeup events race
and we don't always see the right order:
5d..2. 107.488369: event: 639: sleepable x set_sleepable -> sleepable
4d..5. 107.488369: event: 639: sleepable x wakeup -> running (final)
5d..3. 107.488385: error: 639: switch_suspend not expected in the state running
wakeup() set_state()
state=RUNNING
trace_set_state()
trace_wakeup()
state=SLEEPING
I added a special event (switch_block) but there may be a better way.
Taking a pi_lock in rtlock_slowlock_locked when setting the state to
TASK_RTLOCK_WAIT avoids this race, although this is likely not
something we want to do.
* I consider preemption any scheduling with preempt==true and assume
this can happen only if need resched is set.
In practice, however, we may see a preemption where the flag
is not set. This can happen in one specific condition:
need_resched
preempt_schedule()
preempt_schedule_irq()
__schedule()
!need_resched
__schedule()
We start a standard preemption (e.g. from preempt_enable when the flag
is set), an interrupts occurs before we schedule and, on its exit path,
it schedules, which clears the need_resched flag.
When the preempted task runs again, we continue the standard
preemption started earlier, although the flag is no longer set.
I added a workaround to allow the model not to fail in this condition,
but it makes the model weaker. In fact, we are not proving that:
1. we don't miss any event setting need_resched
2. we don't preempt when not required
Future versions of the monitor may weaken assumptions only when we
register an interrupt.
Changes since RFC [3]:
* Remove wakeup tracepoint in try_to_block_task and use a different
flavour of sched_set_state
* Split the large srs monitor in two separate monitors for preemption
and sleep. These no longer have a concept of running task, they just
enforce the requirements for the different types of sched out.
* Restore the snroc monitor to describe the relationship between
generic sched out and sched in.
* Add opid monitor.
* Fix some build errors and cleanup.
[1] - https://lore.kernel.org/lkml/cover.1747046848.git.namcao@linutronix.de
[2] - https://lore.kernel.org/lkml/174413914101.31282.6876925851363406689.tip-bot2@tip-bot2
[3] - https://lore.kernel.org/lkml/20250404084512.98552-11-gmonaco@redhat.com
To: Ingo Molnar <mingo@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
To: Steven Rostedt <rostedt@goodmis.org>
To: Nam Cao <namcao@linutronix.de>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: Juri Lelli <jlelli@redhat.com>
Gabriele Monaco (12):
tools/rv: Do not skip idle in trace
tools/rv: Stop gracefully also on SIGTERM
rv: Add da_handle_start_run_event_ to per-task monitors
rv: Remove trailing whitespace from tracepoint string
rv: Return init error when registering monitors
sched: Adapt sched tracepoints for RV task model
rv: Adapt the sco monitor to the new set_state
rv: Extend and adapt snroc model
rv: Replace tss monitor with more complete sts
rv: Retry when da monitor detects race conditions
rv: Add nrp and sssw per-task monitors
rv: Add opid per-cpu monitor
Documentation/trace/rv/monitor_sched.rst | 296 +++++++++++++++---
include/linux/rv.h | 5 +-
include/linux/sched.h | 7 +-
include/rv/da_monitor.h | 75 ++++-
include/trace/events/sched.h | 17 +-
kernel/sched/core.c | 10 +-
kernel/trace/rv/Kconfig | 5 +-
kernel/trace/rv/Makefile | 5 +-
kernel/trace/rv/monitors/{tss => nrp}/Kconfig | 8 +-
kernel/trace/rv/monitors/nrp/nrp.c | 135 ++++++++
kernel/trace/rv/monitors/nrp/nrp.h | 51 +++
kernel/trace/rv/monitors/nrp/nrp_trace.h | 15 +
kernel/trace/rv/monitors/opid/Kconfig | 17 +
kernel/trace/rv/monitors/opid/opid.c | 151 +++++++++
kernel/trace/rv/monitors/opid/opid.h | 64 ++++
kernel/trace/rv/monitors/opid/opid_trace.h | 15 +
kernel/trace/rv/monitors/sched/sched.c | 3 +-
kernel/trace/rv/monitors/sco/sco.c | 11 +-
kernel/trace/rv/monitors/sco/sco.h | 6 +-
kernel/trace/rv/monitors/scpd/scpd.c | 3 +-
kernel/trace/rv/monitors/sncid/sncid.c | 3 +-
kernel/trace/rv/monitors/snep/snep.c | 3 +-
kernel/trace/rv/monitors/snroc/snroc.c | 15 +-
kernel/trace/rv/monitors/snroc/snroc.h | 8 +-
kernel/trace/rv/monitors/sssw/Kconfig | 15 +
kernel/trace/rv/monitors/sssw/sssw.c | 115 +++++++
kernel/trace/rv/monitors/sssw/sssw.h | 64 ++++
kernel/trace/rv/monitors/sssw/sssw_trace.h | 15 +
kernel/trace/rv/monitors/sts/Kconfig | 18 ++
kernel/trace/rv/monitors/sts/sts.c | 117 +++++++
kernel/trace/rv/monitors/sts/sts.h | 62 ++++
.../{tss/tss_trace.h => sts/sts_trace.h} | 8 +-
kernel/trace/rv/monitors/tss/tss.c | 91 ------
kernel/trace/rv/monitors/tss/tss.h | 47 ---
kernel/trace/rv/monitors/wip/wip.c | 3 +-
kernel/trace/rv/monitors/wwnr/wwnr.c | 3 +-
kernel/trace/rv/rv_trace.h | 13 +-
tools/verification/models/sched/nrp.dot | 19 ++
tools/verification/models/sched/opid.dot | 35 +++
tools/verification/models/sched/sco.dot | 1 +
tools/verification/models/sched/snroc.dot | 2 +-
tools/verification/models/sched/sssw.dot | 24 ++
tools/verification/models/sched/sts.dot | 29 ++
tools/verification/models/sched/tss.dot | 18 --
tools/verification/rv/src/in_kernel.c | 2 +-
tools/verification/rv/src/rv.c | 1 +
46 files changed, 1366 insertions(+), 264 deletions(-)
rename kernel/trace/rv/monitors/{tss => nrp}/Kconfig (63%)
create mode 100644 kernel/trace/rv/monitors/nrp/nrp.c
create mode 100644 kernel/trace/rv/monitors/nrp/nrp.h
create mode 100644 kernel/trace/rv/monitors/nrp/nrp_trace.h
create mode 100644 kernel/trace/rv/monitors/opid/Kconfig
create mode 100644 kernel/trace/rv/monitors/opid/opid.c
create mode 100644 kernel/trace/rv/monitors/opid/opid.h
create mode 100644 kernel/trace/rv/monitors/opid/opid_trace.h
create mode 100644 kernel/trace/rv/monitors/sssw/Kconfig
create mode 100644 kernel/trace/rv/monitors/sssw/sssw.c
create mode 100644 kernel/trace/rv/monitors/sssw/sssw.h
create mode 100644 kernel/trace/rv/monitors/sssw/sssw_trace.h
create mode 100644 kernel/trace/rv/monitors/sts/Kconfig
create mode 100644 kernel/trace/rv/monitors/sts/sts.c
create mode 100644 kernel/trace/rv/monitors/sts/sts.h
rename kernel/trace/rv/monitors/{tss/tss_trace.h => sts/sts_trace.h} (67%)
delete mode 100644 kernel/trace/rv/monitors/tss/tss.c
delete mode 100644 kernel/trace/rv/monitors/tss/tss.h
create mode 100644 tools/verification/models/sched/nrp.dot
create mode 100644 tools/verification/models/sched/opid.dot
create mode 100644 tools/verification/models/sched/sssw.dot
create mode 100644 tools/verification/models/sched/sts.dot
delete mode 100644 tools/verification/models/sched/tss.dot
base-commit: aa94665adc28f3fdc3de2979ac1e98bae961d6ca
--
2.49.0
^ permalink raw reply [flat|nested] 38+ messages in thread
* [RFC PATCH v2 01/12] tools/rv: Do not skip idle in trace
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 02/12] tools/rv: Stop gracefully also on SIGTERM Gabriele Monaco
` (11 subsequent siblings)
12 siblings, 0 replies; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, linux-trace-kernel
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
Currently, the userspace RV tool skips trace events triggered by the RV
tool itself, this can be changed by passing the parameter -s, which sets
the variable config_my_pid to 0 (instead of the tool's PID).
The current condition for per-task monitors (config_has_id) does not
check that config_my_pid isn't 0 to skip. In case we pass -s, we show
events triggered by RV but don't show those triggered by idle (PID 0).
Fix the condition to account this scenario.
Fixes: 6d60f89691fc ("tools/rv: Add in-kernel monitor interface")
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
tools/verification/rv/src/in_kernel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/verification/rv/src/in_kernel.c b/tools/verification/rv/src/in_kernel.c
index c0dcee795c0d..72b03bae021c 100644
--- a/tools/verification/rv/src/in_kernel.c
+++ b/tools/verification/rv/src/in_kernel.c
@@ -429,7 +429,7 @@ ikm_event_handler(struct trace_seq *s, struct tep_record *record,
tep_get_common_field_val(s, trace_event, "common_pid", record, &pid, 1);
- if (config_has_id && (config_my_pid == id))
+ if (config_my_pid && config_has_id && (config_my_pid == id))
return 0;
else if (config_my_pid && (config_my_pid == pid))
return 0;
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 02/12] tools/rv: Stop gracefully also on SIGTERM
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 01/12] tools/rv: Do not skip idle in trace Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 03/12] rv: Add da_handle_start_run_event_ to per-task monitors Gabriele Monaco
` (10 subsequent siblings)
12 siblings, 0 replies; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, linux-trace-kernel
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
Currently the userspace RV tool starts a monitor and waits for the user
to press Ctrl-C (SIGINT) to terminate and stop the monitor.
This doesn't account for a scenario where a user starts RV in background
and simply kills it (SIGTERM unless the user specifies differently).
E.g.:
# rv mon wip &
# kill %
Would terminate RV without stopping the monitor and next RV executions
won't start correctly.
Register the signal handler used for SIGINT also to SIGTERM.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
tools/verification/rv/src/rv.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/verification/rv/src/rv.c b/tools/verification/rv/src/rv.c
index 239de054d1e0..b8fe24a87d97 100644
--- a/tools/verification/rv/src/rv.c
+++ b/tools/verification/rv/src/rv.c
@@ -191,6 +191,7 @@ int main(int argc, char **argv)
* and exit.
*/
signal(SIGINT, stop_rv);
+ signal(SIGTERM, stop_rv);
rv_mon(argc - 1, &argv[1]);
}
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 03/12] rv: Add da_handle_start_run_event_ to per-task monitors
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 01/12] tools/rv: Do not skip idle in trace Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 02/12] tools/rv: Stop gracefully also on SIGTERM Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-19 8:02 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 04/12] rv: Remove trailing whitespace from tracepoint string Gabriele Monaco
` (9 subsequent siblings)
12 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, linux-trace-kernel
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
The RV da_monitor API allows to start monitors in two ways:
da_handle_start_event_NAME and da_handle_start_run_event_NAME.
The former is used when the event is followed by the initial state of
the module, so we ignore the event but we know the monitor is in the
initial state and can start monitoring, the latter can be used if the
event can only occur in the initial state, so we do handle the event as
if the monitor was in the initial state.
This latter API is defined for implicit monitors but not per-task ones.
Define da_handle_start_run_event_NAME macro also for per-task monitors.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
include/rv/da_monitor.h | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/include/rv/da_monitor.h b/include/rv/da_monitor.h
index 510c88bfabd4..215c3eb770cc 100644
--- a/include/rv/da_monitor.h
+++ b/include/rv/da_monitor.h
@@ -512,6 +512,30 @@ da_handle_start_event_##name(struct task_struct *tsk, enum events_##name event)
__da_handle_event_##name(da_mon, tsk, event); \
\
return 1; \
+} \
+ \
+/* \
+ * da_handle_start_run_event_##name - start monitoring and handle event \
+ * \
+ * This function is used to notify the monitor that the system is in the \
+ * initial state, so the monitor can start monitoring and handling event. \
+ */ \
+static inline bool \
+da_handle_start_run_event_##name(struct task_struct *tsk, enum events_##name event) \
+{ \
+ struct da_monitor *da_mon; \
+ \
+ if (!da_monitor_enabled_##name()) \
+ return 0; \
+ \
+ da_mon = da_get_monitor_##name(tsk); \
+ \
+ if (unlikely(!da_monitoring_##name(da_mon))) \
+ da_monitor_start_##name(da_mon); \
+ \
+ __da_handle_event_##name(da_mon, tsk, event); \
+ \
+ return 1; \
}
/*
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 04/12] rv: Remove trailing whitespace from tracepoint string
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (2 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 03/12] rv: Add da_handle_start_run_event_ to per-task monitors Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-19 8:05 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 05/12] rv: Return init error when registering monitors Gabriele Monaco
` (8 subsequent siblings)
12 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, Masami Hiramatsu,
linux-trace-kernel
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
RV event tracepoints print a line with the format:
"event_xyz: S0 x event -> S1 "
"event_xyz: S1 x event -> S0 (final)"
While printing an event leading to a non-final state, the line
has a trailing white space (visible above before the closing ").
Adapt the format string not to print the trailing whitespace if we are
not printing "(final)".
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
kernel/trace/rv/rv_trace.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/rv/rv_trace.h b/kernel/trace/rv/rv_trace.h
index 422b75f58891..18fa0e358a30 100644
--- a/kernel/trace/rv/rv_trace.h
+++ b/kernel/trace/rv/rv_trace.h
@@ -29,11 +29,11 @@ DECLARE_EVENT_CLASS(event_da_monitor,
__entry->final_state = final_state;
),
- TP_printk("%s x %s -> %s %s",
+ TP_printk("%s x %s -> %s%s",
__entry->state,
__entry->event,
__entry->next_state,
- __entry->final_state ? "(final)" : "")
+ __entry->final_state ? " (final)" : "")
);
DECLARE_EVENT_CLASS(error_da_monitor,
@@ -90,12 +90,12 @@ DECLARE_EVENT_CLASS(event_da_monitor_id,
__entry->final_state = final_state;
),
- TP_printk("%d: %s x %s -> %s %s",
+ TP_printk("%d: %s x %s -> %s%s",
__entry->id,
__entry->state,
__entry->event,
__entry->next_state,
- __entry->final_state ? "(final)" : "")
+ __entry->final_state ? " (final)" : "")
);
DECLARE_EVENT_CLASS(error_da_monitor_id,
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 05/12] rv: Return init error when registering monitors
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (3 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 04/12] rv: Remove trailing whitespace from tracepoint string Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-19 8:06 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 06/12] sched: Adapt sched tracepoints for RV task model Gabriele Monaco
` (7 subsequent siblings)
12 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, Masami Hiramatsu,
linux-trace-kernel
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
Monitors generated with dot2k have their registration function (the one
called during monitor initialisation) return always 0, even if the
registration failed on RV side.
This can hide potential errors.
Return the value returned by the RV register function.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
kernel/trace/rv/monitors/sched/sched.c | 3 +--
kernel/trace/rv/monitors/sco/sco.c | 3 +--
kernel/trace/rv/monitors/scpd/scpd.c | 3 +--
kernel/trace/rv/monitors/sncid/sncid.c | 3 +--
kernel/trace/rv/monitors/snep/snep.c | 3 +--
kernel/trace/rv/monitors/snroc/snroc.c | 3 +--
kernel/trace/rv/monitors/tss/tss.c | 3 +--
kernel/trace/rv/monitors/wip/wip.c | 3 +--
kernel/trace/rv/monitors/wwnr/wwnr.c | 3 +--
9 files changed, 9 insertions(+), 18 deletions(-)
diff --git a/kernel/trace/rv/monitors/sched/sched.c b/kernel/trace/rv/monitors/sched/sched.c
index 905e03c3c934..d04db4b543f9 100644
--- a/kernel/trace/rv/monitors/sched/sched.c
+++ b/kernel/trace/rv/monitors/sched/sched.c
@@ -21,8 +21,7 @@ struct rv_monitor rv_sched = {
static int __init register_sched(void)
{
- rv_register_monitor(&rv_sched, NULL);
- return 0;
+ return rv_register_monitor(&rv_sched, NULL);
}
static void __exit unregister_sched(void)
diff --git a/kernel/trace/rv/monitors/sco/sco.c b/kernel/trace/rv/monitors/sco/sco.c
index 4cff59220bfc..66f4639d46ac 100644
--- a/kernel/trace/rv/monitors/sco/sco.c
+++ b/kernel/trace/rv/monitors/sco/sco.c
@@ -71,8 +71,7 @@ static struct rv_monitor rv_sco = {
static int __init register_sco(void)
{
- rv_register_monitor(&rv_sco, &rv_sched);
- return 0;
+ return rv_register_monitor(&rv_sco, &rv_sched);
}
static void __exit unregister_sco(void)
diff --git a/kernel/trace/rv/monitors/scpd/scpd.c b/kernel/trace/rv/monitors/scpd/scpd.c
index cbdd6a5f8d7f..299703cd72b0 100644
--- a/kernel/trace/rv/monitors/scpd/scpd.c
+++ b/kernel/trace/rv/monitors/scpd/scpd.c
@@ -79,8 +79,7 @@ static struct rv_monitor rv_scpd = {
static int __init register_scpd(void)
{
- rv_register_monitor(&rv_scpd, &rv_sched);
- return 0;
+ return rv_register_monitor(&rv_scpd, &rv_sched);
}
static void __exit unregister_scpd(void)
diff --git a/kernel/trace/rv/monitors/sncid/sncid.c b/kernel/trace/rv/monitors/sncid/sncid.c
index f5037cd6214c..3e1ee715a0fb 100644
--- a/kernel/trace/rv/monitors/sncid/sncid.c
+++ b/kernel/trace/rv/monitors/sncid/sncid.c
@@ -79,8 +79,7 @@ static struct rv_monitor rv_sncid = {
static int __init register_sncid(void)
{
- rv_register_monitor(&rv_sncid, &rv_sched);
- return 0;
+ return rv_register_monitor(&rv_sncid, &rv_sched);
}
static void __exit unregister_sncid(void)
diff --git a/kernel/trace/rv/monitors/snep/snep.c b/kernel/trace/rv/monitors/snep/snep.c
index 0076ba6d7ea4..2adc3108d60c 100644
--- a/kernel/trace/rv/monitors/snep/snep.c
+++ b/kernel/trace/rv/monitors/snep/snep.c
@@ -79,8 +79,7 @@ static struct rv_monitor rv_snep = {
static int __init register_snep(void)
{
- rv_register_monitor(&rv_snep, &rv_sched);
- return 0;
+ return rv_register_monitor(&rv_snep, &rv_sched);
}
static void __exit unregister_snep(void)
diff --git a/kernel/trace/rv/monitors/snroc/snroc.c b/kernel/trace/rv/monitors/snroc/snroc.c
index bb1f60d55296..540e686e699f 100644
--- a/kernel/trace/rv/monitors/snroc/snroc.c
+++ b/kernel/trace/rv/monitors/snroc/snroc.c
@@ -68,8 +68,7 @@ static struct rv_monitor rv_snroc = {
static int __init register_snroc(void)
{
- rv_register_monitor(&rv_snroc, &rv_sched);
- return 0;
+ return rv_register_monitor(&rv_snroc, &rv_sched);
}
static void __exit unregister_snroc(void)
diff --git a/kernel/trace/rv/monitors/tss/tss.c b/kernel/trace/rv/monitors/tss/tss.c
index 542787e6524f..0452fcd9edcf 100644
--- a/kernel/trace/rv/monitors/tss/tss.c
+++ b/kernel/trace/rv/monitors/tss/tss.c
@@ -74,8 +74,7 @@ static struct rv_monitor rv_tss = {
static int __init register_tss(void)
{
- rv_register_monitor(&rv_tss, &rv_sched);
- return 0;
+ return rv_register_monitor(&rv_tss, &rv_sched);
}
static void __exit unregister_tss(void)
diff --git a/kernel/trace/rv/monitors/wip/wip.c b/kernel/trace/rv/monitors/wip/wip.c
index ed758fec8608..4b4e99615a11 100644
--- a/kernel/trace/rv/monitors/wip/wip.c
+++ b/kernel/trace/rv/monitors/wip/wip.c
@@ -71,8 +71,7 @@ static struct rv_monitor rv_wip = {
static int __init register_wip(void)
{
- rv_register_monitor(&rv_wip, NULL);
- return 0;
+ return rv_register_monitor(&rv_wip, NULL);
}
static void __exit unregister_wip(void)
diff --git a/kernel/trace/rv/monitors/wwnr/wwnr.c b/kernel/trace/rv/monitors/wwnr/wwnr.c
index 172f31c4b0f3..4145bea2729e 100644
--- a/kernel/trace/rv/monitors/wwnr/wwnr.c
+++ b/kernel/trace/rv/monitors/wwnr/wwnr.c
@@ -70,8 +70,7 @@ static struct rv_monitor rv_wwnr = {
static int __init register_wwnr(void)
{
- rv_register_monitor(&rv_wwnr, NULL);
- return 0;
+ return rv_register_monitor(&rv_wwnr, NULL);
}
static void __exit unregister_wwnr(void)
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 06/12] sched: Adapt sched tracepoints for RV task model
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (4 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 05/12] rv: Return init error when registering monitors Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-19 8:29 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 07/12] rv: Adapt the sco monitor to the new set_state Gabriele Monaco
` (6 subsequent siblings)
12 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Masami Hiramatsu, linux-trace-kernel
Cc: Gabriele Monaco, Nam Cao, Tomas Glozar, Juri Lelli
Add the following tracepoints:
* sched_set_need_resched(tsk, cpu, tif)
Called when a task is set the need resched [lazy] flag
* sched_switch_vain(preempt, tsk, tsk_state)
Called when a task is selected again during __schedule
i.e. prev == next == tsk : no real context switch
Add new parameter to sched_set_state to identify whether the state
change was due to an explicit call or a signal pending while scheduling.
We now also trace from try_to_block_task in case a signal was pending
and the task is set to runnable.
These tracepoints are useful to describe the Linux task model and are
adapted from the patches by Daniel Bristot de Oliveira
(https://bristot.me/linux-task-model/).
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
include/linux/sched.h | 7 ++++++-
include/trace/events/sched.h | 17 +++++++++++++++--
kernel/sched/core.c | 10 +++++++++-
3 files changed, 30 insertions(+), 4 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 04f808ab8888..4d9da32330bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -340,9 +340,11 @@ extern void io_schedule_finish(int token);
extern long io_schedule_timeout(long timeout);
extern void io_schedule(void);
-/* wrapper function to trace from this header file */
+/* wrapper functions to trace from this header file */
DECLARE_TRACEPOINT(sched_set_state_tp);
extern void __trace_set_current_state(int state_value);
+DECLARE_TRACEPOINT(sched_set_need_resched_tp);
+extern void __trace_set_need_resched(struct task_struct *curr, int tif);
/**
* struct prev_cputime - snapshot of system and user cputime
@@ -2065,6 +2067,9 @@ static inline int test_tsk_thread_flag(struct task_struct *tsk, int flag)
static inline void set_tsk_need_resched(struct task_struct *tsk)
{
+ if (tracepoint_enabled(sched_set_need_resched_tp) &&
+ !test_tsk_thread_flag(tsk, TIF_NEED_RESCHED))
+ __trace_set_need_resched(tsk, TIF_NEED_RESCHED);
set_tsk_thread_flag(tsk,TIF_NEED_RESCHED);
}
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 2390818b139b..158b9c504fab 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -889,11 +889,24 @@ DECLARE_TRACE(sched_exit_tp,
TP_PROTO(bool is_switch, unsigned long ip),
TP_ARGS(is_switch, ip));
+/*
+ * Tracepoint called when setting the state of a task;
+ * this tracepoint is guaranteed to be called from the waking context of the
+ * task setting the state.
+ */
DECLARE_TRACE_CONDITION(sched_set_state_tp,
- TP_PROTO(struct task_struct *tsk, int state),
- TP_ARGS(tsk, state),
+ TP_PROTO(struct task_struct *tsk, int state, bool from_signal),
+ TP_ARGS(tsk, state, from_signal),
TP_CONDITION(!!(tsk->__state) != !!state));
+DECLARE_TRACE(sched_set_need_resched_tp,
+ TP_PROTO(struct task_struct *tsk, int cpu, int tif),
+ TP_ARGS(tsk, cpu, tif));
+
+DECLARE_TRACE(sched_switch_vain_tp,
+ TP_PROTO(bool preempt, struct task_struct *tsk, unsigned int prev_state),
+ TP_ARGS(preempt, tsk, prev_state));
+
#endif /* _TRACE_SCHED_H */
/* This part must be outside protection */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5f844bae1a14..89e81fc7f393 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -494,7 +494,7 @@ EXPORT_TRACEPOINT_SYMBOL(sched_set_state_tp);
/* Call via the helper macro trace_set_current_state. */
void __trace_set_current_state(int state_value)
{
- trace_sched_set_state_tp(current, state_value);
+ trace_sched_set_state_tp(current, state_value, false);
}
EXPORT_SYMBOL(__trace_set_current_state);
@@ -1109,6 +1109,7 @@ static void __resched_curr(struct rq *rq, int tif)
cpu = cpu_of(rq);
+ trace_sched_set_need_resched_tp(curr, cpu, tif);
if (cpu == smp_processor_id()) {
set_ti_thread_flag(cti, tif);
if (tif == TIF_NEED_RESCHED)
@@ -1124,6 +1125,11 @@ static void __resched_curr(struct rq *rq, int tif)
}
}
+void __trace_set_need_resched(struct task_struct *curr, int tif)
+{
+ trace_sched_set_need_resched_tp(curr, smp_processor_id(), tif);
+}
+
void resched_curr(struct rq *rq)
{
__resched_curr(rq, TIF_NEED_RESCHED);
@@ -6587,6 +6593,7 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
int flags = DEQUEUE_NOCLOCK;
if (signal_pending_state(task_state, p)) {
+ trace_sched_set_state_tp(p, TASK_RUNNING, true);
WRITE_ONCE(p->__state, TASK_RUNNING);
*task_state_p = TASK_RUNNING;
return false;
@@ -6779,6 +6786,7 @@ static void __sched notrace __schedule(int sched_mode)
rq = context_switch(rq, prev, next, &rf);
} else {
rq_unpin_lock(rq, &rf);
+ trace_sched_switch_vain_tp(preempt, prev, prev_state);
__balance_callbacks(rq);
raw_spin_rq_unlock_irq(rq);
}
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 07/12] rv: Adapt the sco monitor to the new set_state
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (5 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 06/12] sched: Adapt sched tracepoints for RV task model Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-19 8:42 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 08/12] rv: Extend and adapt snroc model Gabriele Monaco
` (5 subsequent siblings)
12 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
The sched_set_state tracepoint changed prototype adding a new argument,
this argument can differentiate between an explicit set_state called by
a task and a set state to runnable by the scheduler due to a pending
signal.
Adapt the handlers prototypes for the sco monitor.
Expand the model to handle the new set_state flavour, the monitor was
making sure set state happens only outside of the scheduler, if the
event occurs with the new argument (from_signal) set to true, we instead
expect it to be inside the scheduler.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
Documentation/trace/rv/monitor_sched.rst | 35 +++++++++++++-----------
kernel/trace/rv/monitors/sco/sco.c | 8 ++++--
kernel/trace/rv/monitors/sco/sco.h | 6 ++--
tools/verification/models/sched/sco.dot | 1 +
4 files changed, 30 insertions(+), 20 deletions(-)
diff --git a/Documentation/trace/rv/monitor_sched.rst b/Documentation/trace/rv/monitor_sched.rst
index 24b2c62a3bc2..6f76bba94d9f 100644
--- a/Documentation/trace/rv/monitor_sched.rst
+++ b/Documentation/trace/rv/monitor_sched.rst
@@ -64,22 +64,25 @@ Monitor sco
~~~~~~~~~~~
The scheduling context operations (sco) monitor ensures changes in a task state
-happen only in thread context::
-
-
- |
- |
- v
- sched_set_state +------------------+
- +------------------ | |
- | | thread_context |
- +-----------------> | | <+
- +------------------+ |
- | |
- | schedule_entry | schedule_exit
- v |
- |
- scheduling_context -+
+happen only in thread context, the only exception is a special kind of set
+state that occurs if a task about to sleep has a pending signal. This set state
+is not called by the thread but by the scheduler itself::
+
+ |
+ |
+ v
+ sched_set_state +------------------+
+ +---------------------------------- | |
+ | | thread_context |
+ +---------------------------------> | | <+
+ +------------------+ |
+ | |
+ | schedule_entry | schedule_exit
+ v |
+ sched_set_state_runnable_signal |
+ +---------------------------------- |
+ | scheduling_context |
+ +---------------------------------> -+
Monitor snroc
~~~~~~~~~~~~~
diff --git a/kernel/trace/rv/monitors/sco/sco.c b/kernel/trace/rv/monitors/sco/sco.c
index 66f4639d46ac..6457ff2469d0 100644
--- a/kernel/trace/rv/monitors/sco/sco.c
+++ b/kernel/trace/rv/monitors/sco/sco.c
@@ -19,9 +19,13 @@
static struct rv_monitor rv_sco;
DECLARE_DA_MON_PER_CPU(sco, unsigned char);
-static void handle_sched_set_state(void *data, struct task_struct *tsk, int state)
+static void handle_sched_set_state(void *data, struct task_struct *tsk,
+ int state, bool from_signal)
{
- da_handle_start_event_sco(sched_set_state_sco);
+ if (from_signal)
+ da_handle_event_sco(sched_set_state_runnable_signal_sco);
+ else
+ da_handle_start_event_sco(sched_set_state_sco);
}
static void handle_schedule_entry(void *data, bool preempt, unsigned long ip)
diff --git a/kernel/trace/rv/monitors/sco/sco.h b/kernel/trace/rv/monitors/sco/sco.h
index 7a4c1f2d5ca1..302750687f9c 100644
--- a/kernel/trace/rv/monitors/sco/sco.h
+++ b/kernel/trace/rv/monitors/sco/sco.h
@@ -15,6 +15,7 @@ enum states_sco {
enum events_sco {
sched_set_state_sco = 0,
+ sched_set_state_runnable_signal_sco,
schedule_entry_sco,
schedule_exit_sco,
event_max_sco
@@ -35,12 +36,13 @@ static const struct automaton_sco automaton_sco = {
},
.event_names = {
"sched_set_state",
+ "sched_set_state_runnable_signal",
"schedule_entry",
"schedule_exit"
},
.function = {
- { thread_context_sco, scheduling_context_sco, INVALID_STATE },
- { INVALID_STATE, INVALID_STATE, thread_context_sco },
+ { thread_context_sco, INVALID_STATE, scheduling_context_sco, INVALID_STATE },
+ { INVALID_STATE, scheduling_context_sco, INVALID_STATE, thread_context_sco },
},
.initial_state = thread_context_sco,
.final_states = { 1, 0 },
diff --git a/tools/verification/models/sched/sco.dot b/tools/verification/models/sched/sco.dot
index 20b0e3b449a6..4e44ed58c62a 100644
--- a/tools/verification/models/sched/sco.dot
+++ b/tools/verification/models/sched/sco.dot
@@ -7,6 +7,7 @@ digraph state_automaton {
{node [shape = plaintext] "thread_context"};
"__init_thread_context" -> "thread_context";
"scheduling_context" [label = "scheduling_context"];
+ "scheduling_context" -> "scheduling_context" [ label = "sched_set_state_runnable_signal" ];
"scheduling_context" -> "thread_context" [ label = "schedule_exit" ];
"thread_context" [label = "thread_context", color = green3];
"thread_context" -> "scheduling_context" [ label = "schedule_entry" ];
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 08/12] rv: Extend and adapt snroc model
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (6 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 07/12] rv: Adapt the sco monitor to the new set_state Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts Gabriele Monaco
` (4 subsequent siblings)
12 siblings, 0 replies; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
The snroc monitor ensures changes in a task state happens only in the
respective task's context. This is the only monitor enforcing a task
is to be switched in after being switched out, and vice-versa.
Although this is a trivial claim, it is useful to complete other claims
on when scheduling is possible while keeping models simple.
Add the sched_switch_vain event to the model, enforcing that a vain
switch doesn't change the context but can only occur while a task is
running (e.g. a call to schedule that re-selects the current task).
Also adapt the set_state handler prototypes to the tracepoint change.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
Documentation/trace/rv/monitor_sched.rst | 32 ++++++++++++-----------
kernel/trace/rv/monitors/snroc/snroc.c | 12 ++++++++-
kernel/trace/rv/monitors/snroc/snroc.h | 8 +++---
tools/verification/models/sched/snroc.dot | 2 +-
4 files changed, 34 insertions(+), 20 deletions(-)
diff --git a/Documentation/trace/rv/monitor_sched.rst b/Documentation/trace/rv/monitor_sched.rst
index 6f76bba94d9f..5cb58ac441d8 100644
--- a/Documentation/trace/rv/monitor_sched.rst
+++ b/Documentation/trace/rv/monitor_sched.rst
@@ -89,21 +89,23 @@ Monitor snroc
The set non runnable on its own context (snroc) monitor ensures changes in a
task state happens only in the respective task's context. This is a per-task
-monitor::
-
- |
- |
- v
- +------------------+
- | other_context | <+
- +------------------+ |
- | |
- | sched_switch_in | sched_switch_out
- v |
- sched_set_state |
- +------------------ |
- | own_context |
- +-----------------> -+
+monitor. A task is in its own context after switching in and leaves the context
+when switched out, a vain switch maintains the context::
+
+ |
+ |
+ v
+ +------------------+
+ | other_context | <+
+ +------------------+ |
+ | |
+ | sched_switch_in | sched_switch_out
+ v |
+ sched_set_state |
+ sched_switch_vain |
+ +-------------------- own_context |
+ | |
+ +-------------------> -+
Monitor scpd
~~~~~~~~~~~~
diff --git a/kernel/trace/rv/monitors/snroc/snroc.c b/kernel/trace/rv/monitors/snroc/snroc.c
index 540e686e699f..11a56b58665e 100644
--- a/kernel/trace/rv/monitors/snroc/snroc.c
+++ b/kernel/trace/rv/monitors/snroc/snroc.c
@@ -19,7 +19,8 @@
static struct rv_monitor rv_snroc;
DECLARE_DA_MON_PER_TASK(snroc, unsigned char);
-static void handle_sched_set_state(void *data, struct task_struct *tsk, int state)
+static void handle_sched_set_state(void *data, struct task_struct *tsk,
+ int state, bool from_signal)
{
da_handle_event_snroc(tsk, sched_set_state_snroc);
}
@@ -33,6 +34,13 @@ static void handle_sched_switch(void *data, bool preempt,
da_handle_event_snroc(next, sched_switch_in_snroc);
}
+static void handle_sched_switch_vain(void *data, bool preempt,
+ struct task_struct *tsk,
+ unsigned int tsk_state)
+{
+ da_handle_event_snroc(tsk, sched_switch_vain_snroc);
+}
+
static int enable_snroc(void)
{
int retval;
@@ -43,6 +51,7 @@ static int enable_snroc(void)
rv_attach_trace_probe("snroc", sched_set_state_tp, handle_sched_set_state);
rv_attach_trace_probe("snroc", sched_switch, handle_sched_switch);
+ rv_attach_trace_probe("snroc", sched_switch_vain_tp, handle_sched_switch_vain);
return 0;
}
@@ -53,6 +62,7 @@ static void disable_snroc(void)
rv_detach_trace_probe("snroc", sched_set_state_tp, handle_sched_set_state);
rv_detach_trace_probe("snroc", sched_switch, handle_sched_switch);
+ rv_detach_trace_probe("snroc", sched_switch_vain_tp, handle_sched_switch_vain);
da_monitor_destroy_snroc();
}
diff --git a/kernel/trace/rv/monitors/snroc/snroc.h b/kernel/trace/rv/monitors/snroc/snroc.h
index c3650a2b1b10..d6de40e15ae1 100644
--- a/kernel/trace/rv/monitors/snroc/snroc.h
+++ b/kernel/trace/rv/monitors/snroc/snroc.h
@@ -17,6 +17,7 @@ enum events_snroc {
sched_set_state_snroc = 0,
sched_switch_in_snroc,
sched_switch_out_snroc,
+ sched_switch_vain_snroc,
event_max_snroc
};
@@ -36,11 +37,12 @@ static const struct automaton_snroc automaton_snroc = {
.event_names = {
"sched_set_state",
"sched_switch_in",
- "sched_switch_out"
+ "sched_switch_out",
+ "sched_switch_vain"
},
.function = {
- { INVALID_STATE, own_context_snroc, INVALID_STATE },
- { own_context_snroc, INVALID_STATE, other_context_snroc },
+ { INVALID_STATE, own_context_snroc, INVALID_STATE, INVALID_STATE },
+ { own_context_snroc, INVALID_STATE, other_context_snroc, own_context_snroc },
},
.initial_state = other_context_snroc,
.final_states = { 1, 0 },
diff --git a/tools/verification/models/sched/snroc.dot b/tools/verification/models/sched/snroc.dot
index 8b71c32d4dca..5b1a787d0350 100644
--- a/tools/verification/models/sched/snroc.dot
+++ b/tools/verification/models/sched/snroc.dot
@@ -10,7 +10,7 @@ digraph state_automaton {
"other_context" -> "own_context" [ label = "sched_switch_in" ];
"own_context" [label = "own_context"];
"own_context" -> "other_context" [ label = "sched_switch_out" ];
- "own_context" -> "own_context" [ label = "sched_set_state" ];
+ "own_context" -> "own_context" [ label = "sched_set_state\nsched_switch_vain" ];
{ rank = min ;
"__init_other_context";
"other_context";
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (7 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 08/12] rv: Extend and adapt snroc model Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-06-24 7:36 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions Gabriele Monaco
` (3 subsequent siblings)
12 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
The tss monitor currently guarantees context switches can happen only
while scheduling, but it doesn't enforce that each scheduling call
implies a task switch. This can be implied only if we rely on the newly
introduced sched_switch_vain tracepoint, which represents a
scheduler call where the previously running task is the same that is
picked to run next, in fact no context is switched.
Replace the monitor with a more comprehensive specification which
implies tss but also ensures that:
* each scheduler call switches context (or has a vain switch)
* each context switch happens with interrupts disabled
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
Documentation/trace/rv/monitor_sched.rst | 68 +++++++---
kernel/trace/rv/Kconfig | 2 +-
kernel/trace/rv/Makefile | 2 +-
kernel/trace/rv/monitors/sts/Kconfig | 18 +++
kernel/trace/rv/monitors/sts/sts.c | 117 ++++++++++++++++++
kernel/trace/rv/monitors/sts/sts.h | 62 ++++++++++
.../{tss/tss_trace.h => sts/sts_trace.h} | 8 +-
kernel/trace/rv/monitors/tss/Kconfig | 14 ---
kernel/trace/rv/monitors/tss/tss.c | 90 --------------
kernel/trace/rv/monitors/tss/tss.h | 47 -------
kernel/trace/rv/rv_trace.h | 2 +-
tools/verification/models/sched/sts.dot | 29 +++++
tools/verification/models/sched/tss.dot | 18 ---
13 files changed, 281 insertions(+), 196 deletions(-)
create mode 100644 kernel/trace/rv/monitors/sts/Kconfig
create mode 100644 kernel/trace/rv/monitors/sts/sts.c
create mode 100644 kernel/trace/rv/monitors/sts/sts.h
rename kernel/trace/rv/monitors/{tss/tss_trace.h => sts/sts_trace.h} (67%)
delete mode 100644 kernel/trace/rv/monitors/tss/Kconfig
delete mode 100644 kernel/trace/rv/monitors/tss/tss.c
delete mode 100644 kernel/trace/rv/monitors/tss/tss.h
create mode 100644 tools/verification/models/sched/sts.dot
delete mode 100644 tools/verification/models/sched/tss.dot
diff --git a/Documentation/trace/rv/monitor_sched.rst b/Documentation/trace/rv/monitor_sched.rst
index 5cb58ac441d8..e4171aadef1c 100644
--- a/Documentation/trace/rv/monitor_sched.rst
+++ b/Documentation/trace/rv/monitor_sched.rst
@@ -40,26 +40,6 @@ defined in by Daniel Bristot in [1].
Currently we included the following:
-Monitor tss
-~~~~~~~~~~~
-
-The task switch while scheduling (tss) monitor ensures a task switch happens
-only in scheduling context, that is inside a call to `__schedule`::
-
- |
- |
- v
- +-----------------+
- | thread | <+
- +-----------------+ |
- | |
- | schedule_entry | schedule_exit
- v |
- sched_switch |
- +--------------- |
- | sched |
- +--------------> -+
-
Monitor sco
~~~~~~~~~~~
@@ -170,6 +150,54 @@ schedule is not called with interrupt disabled::
|
cant_sched -+
+Monitor sts
+~~~~~~~~~~~
+
+The schedule implies task switch (sts) monitor ensures a task switch happens in
+every scheduling context, that is inside a call to ``__schedule``, as well as no
+task switch can happen without scheduling and before interrupts are disabled.
+This require the special type of switch called vain, which occurs when the next
+task picked for execution is the same as the previously running one, in fact no
+real task switch occurs::
+
+ |
+ |
+ v
+ #====================# irq_disable
+ H H irq_enable
+ H thread H --------------+
+ H H |
+ +-------------> H H <-------------+
+ | #====================#
+ | |
+ | | schedule_entry
+ | v
+ | +--------------------+
+ | | scheduling | <+
+ | +--------------------+ |
+ | | |
+ | | irq_disable | irq_enable
+ | v |
+ | +--------------------+ |
+ | | disable_to_switch | -+
+ | schedule_exit +--------------------+
+ | |
+ | | sched_switch
+ | | sched_switch_vain
+ | v
+ | +--------------------+
+ | | switching |
+ | +--------------------+
+ | |
+ | | irq_enable
+ | v
+ | +--------------------+ irq_disable
+ | | | irq_enable
+ | | enable_to_exit | --------------+
+ | | | |
+ +-------------- | | <-------------+
+ +--------------------+
+
References
----------
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index b39f36013ef2..a53a3eca9616 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -28,12 +28,12 @@ menuconfig RV
source "kernel/trace/rv/monitors/wip/Kconfig"
source "kernel/trace/rv/monitors/wwnr/Kconfig"
source "kernel/trace/rv/monitors/sched/Kconfig"
-source "kernel/trace/rv/monitors/tss/Kconfig"
source "kernel/trace/rv/monitors/sco/Kconfig"
source "kernel/trace/rv/monitors/snroc/Kconfig"
source "kernel/trace/rv/monitors/scpd/Kconfig"
source "kernel/trace/rv/monitors/snep/Kconfig"
source "kernel/trace/rv/monitors/sncid/Kconfig"
+source "kernel/trace/rv/monitors/sts/Kconfig"
# Add new monitors here
config RV_REACTORS
diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
index f9b2cd0483c3..c609b72275cb 100644
--- a/kernel/trace/rv/Makefile
+++ b/kernel/trace/rv/Makefile
@@ -6,12 +6,12 @@ obj-$(CONFIG_RV) += rv.o
obj-$(CONFIG_RV_MON_WIP) += monitors/wip/wip.o
obj-$(CONFIG_RV_MON_WWNR) += monitors/wwnr/wwnr.o
obj-$(CONFIG_RV_MON_SCHED) += monitors/sched/sched.o
-obj-$(CONFIG_RV_MON_TSS) += monitors/tss/tss.o
obj-$(CONFIG_RV_MON_SCO) += monitors/sco/sco.o
obj-$(CONFIG_RV_MON_SNROC) += monitors/snroc/snroc.o
obj-$(CONFIG_RV_MON_SCPD) += monitors/scpd/scpd.o
obj-$(CONFIG_RV_MON_SNEP) += monitors/snep/snep.o
obj-$(CONFIG_RV_MON_SNCID) += monitors/sncid/sncid.o
+obj-$(CONFIG_RV_MON_STS) += monitors/sts/sts.o
# Add new monitors here
obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
diff --git a/kernel/trace/rv/monitors/sts/Kconfig b/kernel/trace/rv/monitors/sts/Kconfig
new file mode 100644
index 000000000000..5b486dac3f10
--- /dev/null
+++ b/kernel/trace/rv/monitors/sts/Kconfig
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_STS
+ depends on RV
+ depends on IRQSOFF_TRACER
+ depends on RV_MON_SCHED
+ default y
+ select DA_MON_EVENTS_IMPLICIT
+ bool "sts monitor"
+ help
+ Monitor to ensure relationships between scheduler and switches
+ * each call to the scheduler implies a switch
+ * switches only happen inside the scheduler
+ * switches happen with interrupt disabled
+ This monitor is part of the sched monitors collection.
+
+ For further information, see:
+ Documentation/trace/rv/monitor_sched.rst
diff --git a/kernel/trace/rv/monitors/sts/sts.c b/kernel/trace/rv/monitors/sts/sts.c
new file mode 100644
index 000000000000..deb18a9d48f3
--- /dev/null
+++ b/kernel/trace/rv/monitors/sts/sts.c
@@ -0,0 +1,117 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+#include <rv/da_monitor.h>
+
+#define MODULE_NAME "sts"
+
+#include <trace/events/sched.h>
+#include <trace/events/preemptirq.h>
+#include <rv_trace.h>
+#include <monitors/sched/sched.h>
+
+#include "sts.h"
+
+static struct rv_monitor rv_sts;
+DECLARE_DA_MON_PER_CPU(sts, unsigned char);
+
+static void handle_irq_disable(void *data, unsigned long ip, unsigned long parent_ip)
+{
+ da_handle_event_sts(irq_disable_sts);
+}
+
+static void handle_irq_enable(void *data, unsigned long ip, unsigned long parent_ip)
+{
+ da_handle_event_sts(irq_enable_sts);
+}
+
+static void handle_sched_switch(void *data, bool preempt,
+ struct task_struct *prev,
+ struct task_struct *next,
+ unsigned int prev_state)
+{
+ da_handle_event_sts(sched_switch_sts);
+}
+
+static void handle_sched_switch_vain(void *data, bool preempt,
+ struct task_struct *tsk,
+ unsigned int tsk_state)
+{
+ da_handle_event_sts(sched_switch_vain_sts);
+}
+
+static void handle_schedule_entry(void *data, bool preempt, unsigned long ip)
+{
+ da_handle_event_sts(schedule_entry_sts);
+}
+
+static void handle_schedule_exit(void *data, bool is_switch, unsigned long ip)
+{
+ da_handle_start_event_sts(schedule_exit_sts);
+}
+
+static int enable_sts(void)
+{
+ int retval;
+
+ retval = da_monitor_init_sts();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("sts", irq_disable, handle_irq_disable);
+ rv_attach_trace_probe("sts", irq_enable, handle_irq_enable);
+ rv_attach_trace_probe("sts", sched_switch, handle_sched_switch);
+ rv_attach_trace_probe("sts", sched_switch_vain_tp, handle_sched_switch_vain);
+ rv_attach_trace_probe("sts", sched_entry_tp, handle_schedule_entry);
+ rv_attach_trace_probe("sts", sched_exit_tp, handle_schedule_exit);
+
+ return 0;
+}
+
+static void disable_sts(void)
+{
+ rv_sts.enabled = 0;
+
+ rv_detach_trace_probe("sts", irq_disable, handle_irq_disable);
+ rv_detach_trace_probe("sts", irq_enable, handle_irq_enable);
+ rv_detach_trace_probe("sts", sched_switch, handle_sched_switch);
+ rv_detach_trace_probe("sts", sched_switch_vain_tp, handle_sched_switch_vain);
+ rv_detach_trace_probe("sts", sched_entry_tp, handle_schedule_entry);
+ rv_detach_trace_probe("sts", sched_exit_tp, handle_schedule_exit);
+
+ da_monitor_destroy_sts();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_sts = {
+ .name = "sts",
+ .description = "schedule implies task switch.",
+ .enable = enable_sts,
+ .disable = disable_sts,
+ .reset = da_monitor_reset_all_sts,
+ .enabled = 0,
+};
+
+static int __init register_sts(void)
+{
+ return rv_register_monitor(&rv_sts, &rv_sched);
+}
+
+static void __exit unregister_sts(void)
+{
+ rv_unregister_monitor(&rv_sts);
+}
+
+module_init(register_sts);
+module_exit(unregister_sts);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
+MODULE_DESCRIPTION("sts: schedule implies task switch.");
diff --git a/kernel/trace/rv/monitors/sts/sts.h b/kernel/trace/rv/monitors/sts/sts.h
new file mode 100644
index 000000000000..6921c0042293
--- /dev/null
+++ b/kernel/trace/rv/monitors/sts/sts.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of sts automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+enum states_sts {
+ thread_sts = 0,
+ disable_to_switch_sts,
+ enable_to_exit_sts,
+ scheduling_sts,
+ switching_sts,
+ state_max_sts
+};
+
+#define INVALID_STATE state_max_sts
+
+enum events_sts {
+ irq_disable_sts = 0,
+ irq_enable_sts,
+ sched_switch_sts,
+ sched_switch_vain_sts,
+ schedule_entry_sts,
+ schedule_exit_sts,
+ event_max_sts
+};
+
+struct automaton_sts {
+ char *state_names[state_max_sts];
+ char *event_names[event_max_sts];
+ unsigned char function[state_max_sts][event_max_sts];
+ unsigned char initial_state;
+ bool final_states[state_max_sts];
+};
+
+static const struct automaton_sts automaton_sts = {
+ .state_names = {
+ "thread",
+ "disable_to_switch",
+ "enable_to_exit",
+ "scheduling",
+ "switching"
+ },
+ .event_names = {
+ "irq_disable",
+ "irq_enable",
+ "sched_switch",
+ "sched_switch_vain",
+ "schedule_entry",
+ "schedule_exit"
+ },
+ .function = {
+ { thread_sts, thread_sts, INVALID_STATE, INVALID_STATE, scheduling_sts, INVALID_STATE },
+ { INVALID_STATE, scheduling_sts, switching_sts, switching_sts, INVALID_STATE, INVALID_STATE },
+ { enable_to_exit_sts, enable_to_exit_sts, INVALID_STATE, INVALID_STATE, INVALID_STATE, thread_sts },
+ { disable_to_switch_sts, INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE },
+ { INVALID_STATE, enable_to_exit_sts, INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE },
+ },
+ .initial_state = thread_sts,
+ .final_states = { 1, 0, 0, 0, 0 },
+};
diff --git a/kernel/trace/rv/monitors/tss/tss_trace.h b/kernel/trace/rv/monitors/sts/sts_trace.h
similarity index 67%
rename from kernel/trace/rv/monitors/tss/tss_trace.h
rename to kernel/trace/rv/monitors/sts/sts_trace.h
index 4619dbb50cc0..d78beb58d5b3 100644
--- a/kernel/trace/rv/monitors/tss/tss_trace.h
+++ b/kernel/trace/rv/monitors/sts/sts_trace.h
@@ -4,12 +4,12 @@
* Snippet to be included in rv_trace.h
*/
-#ifdef CONFIG_RV_MON_TSS
-DEFINE_EVENT(event_da_monitor, event_tss,
+#ifdef CONFIG_RV_MON_STS
+DEFINE_EVENT(event_da_monitor, event_sts,
TP_PROTO(char *state, char *event, char *next_state, bool final_state),
TP_ARGS(state, event, next_state, final_state));
-DEFINE_EVENT(error_da_monitor, error_tss,
+DEFINE_EVENT(error_da_monitor, error_sts,
TP_PROTO(char *state, char *event),
TP_ARGS(state, event));
-#endif /* CONFIG_RV_MON_TSS */
+#endif /* CONFIG_RV_MON_STS */
diff --git a/kernel/trace/rv/monitors/tss/Kconfig b/kernel/trace/rv/monitors/tss/Kconfig
deleted file mode 100644
index 479f86f52e60..000000000000
--- a/kernel/trace/rv/monitors/tss/Kconfig
+++ /dev/null
@@ -1,14 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-#
-config RV_MON_TSS
- depends on RV
- depends on RV_MON_SCHED
- default y
- select DA_MON_EVENTS_IMPLICIT
- bool "tss monitor"
- help
- Monitor to ensure sched_switch happens only in scheduling context.
- This monitor is part of the sched monitors collection.
-
- For further information, see:
- Documentation/trace/rv/monitor_sched.rst
diff --git a/kernel/trace/rv/monitors/tss/tss.c b/kernel/trace/rv/monitors/tss/tss.c
deleted file mode 100644
index 0452fcd9edcf..000000000000
--- a/kernel/trace/rv/monitors/tss/tss.c
+++ /dev/null
@@ -1,90 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <linux/ftrace.h>
-#include <linux/tracepoint.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/rv.h>
-#include <rv/instrumentation.h>
-#include <rv/da_monitor.h>
-
-#define MODULE_NAME "tss"
-
-#include <trace/events/sched.h>
-#include <rv_trace.h>
-#include <monitors/sched/sched.h>
-
-#include "tss.h"
-
-static struct rv_monitor rv_tss;
-DECLARE_DA_MON_PER_CPU(tss, unsigned char);
-
-static void handle_sched_switch(void *data, bool preempt,
- struct task_struct *prev,
- struct task_struct *next,
- unsigned int prev_state)
-{
- da_handle_event_tss(sched_switch_tss);
-}
-
-static void handle_schedule_entry(void *data, bool preempt, unsigned long ip)
-{
- da_handle_event_tss(schedule_entry_tss);
-}
-
-static void handle_schedule_exit(void *data, bool is_switch, unsigned long ip)
-{
- da_handle_start_event_tss(schedule_exit_tss);
-}
-
-static int enable_tss(void)
-{
- int retval;
-
- retval = da_monitor_init_tss();
- if (retval)
- return retval;
-
- rv_attach_trace_probe("tss", sched_switch, handle_sched_switch);
- rv_attach_trace_probe("tss", sched_entry_tp, handle_schedule_entry);
- rv_attach_trace_probe("tss", sched_exit_tp, handle_schedule_exit);
-
- return 0;
-}
-
-static void disable_tss(void)
-{
- rv_tss.enabled = 0;
-
- rv_detach_trace_probe("tss", sched_switch, handle_sched_switch);
- rv_detach_trace_probe("tss", sched_entry_tp, handle_schedule_entry);
- rv_detach_trace_probe("tss", sched_exit_tp, handle_schedule_exit);
-
- da_monitor_destroy_tss();
-}
-
-static struct rv_monitor rv_tss = {
- .name = "tss",
- .description = "task switch while scheduling.",
- .enable = enable_tss,
- .disable = disable_tss,
- .reset = da_monitor_reset_all_tss,
- .enabled = 0,
-};
-
-static int __init register_tss(void)
-{
- return rv_register_monitor(&rv_tss, &rv_sched);
-}
-
-static void __exit unregister_tss(void)
-{
- rv_unregister_monitor(&rv_tss);
-}
-
-module_init(register_tss);
-module_exit(unregister_tss);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
-MODULE_DESCRIPTION("tss: task switch while scheduling.");
diff --git a/kernel/trace/rv/monitors/tss/tss.h b/kernel/trace/rv/monitors/tss/tss.h
deleted file mode 100644
index f0a36fda1b87..000000000000
--- a/kernel/trace/rv/monitors/tss/tss.h
+++ /dev/null
@@ -1,47 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Automatically generated C representation of tss automaton
- * For further information about this format, see kernel documentation:
- * Documentation/trace/rv/deterministic_automata.rst
- */
-
-enum states_tss {
- thread_tss = 0,
- sched_tss,
- state_max_tss
-};
-
-#define INVALID_STATE state_max_tss
-
-enum events_tss {
- sched_switch_tss = 0,
- schedule_entry_tss,
- schedule_exit_tss,
- event_max_tss
-};
-
-struct automaton_tss {
- char *state_names[state_max_tss];
- char *event_names[event_max_tss];
- unsigned char function[state_max_tss][event_max_tss];
- unsigned char initial_state;
- bool final_states[state_max_tss];
-};
-
-static const struct automaton_tss automaton_tss = {
- .state_names = {
- "thread",
- "sched"
- },
- .event_names = {
- "sched_switch",
- "schedule_entry",
- "schedule_exit"
- },
- .function = {
- { INVALID_STATE, sched_tss, INVALID_STATE },
- { sched_tss, INVALID_STATE, thread_tss },
- },
- .initial_state = thread_tss,
- .final_states = { 1, 0 },
-};
diff --git a/kernel/trace/rv/rv_trace.h b/kernel/trace/rv/rv_trace.h
index 18fa0e358a30..a5e1c52e2992 100644
--- a/kernel/trace/rv/rv_trace.h
+++ b/kernel/trace/rv/rv_trace.h
@@ -58,11 +58,11 @@ DECLARE_EVENT_CLASS(error_da_monitor,
);
#include <monitors/wip/wip_trace.h>
-#include <monitors/tss/tss_trace.h>
#include <monitors/sco/sco_trace.h>
#include <monitors/scpd/scpd_trace.h>
#include <monitors/snep/snep_trace.h>
#include <monitors/sncid/sncid_trace.h>
+#include <monitors/sts/sts_trace.h>
// Add new monitors based on CONFIG_DA_MON_EVENTS_IMPLICIT here
#endif /* CONFIG_DA_MON_EVENTS_IMPLICIT */
diff --git a/tools/verification/models/sched/sts.dot b/tools/verification/models/sched/sts.dot
new file mode 100644
index 000000000000..8152675b4f52
--- /dev/null
+++ b/tools/verification/models/sched/sts.dot
@@ -0,0 +1,29 @@
+digraph state_automaton {
+ center = true;
+ size = "7,11";
+ {node [shape = circle] "disable_to_switch"};
+ {node [shape = circle] "enable_to_exit"};
+ {node [shape = circle] "scheduling"};
+ {node [shape = circle] "switching"};
+ {node [shape = plaintext, style=invis, label=""] "__init_thread"};
+ {node [shape = doublecircle] "thread"};
+ {node [shape = circle] "thread"};
+ "__init_thread" -> "thread";
+ "disable_to_switch" [label = "disable_to_switch"];
+ "disable_to_switch" -> "scheduling" [ label = "irq_enable" ];
+ "disable_to_switch" -> "switching" [ label = "sched_switch\nsched_switch_vain" ];
+ "enable_to_exit" [label = "enable_to_exit"];
+ "enable_to_exit" -> "enable_to_exit" [ label = "irq_disable\nirq_enable" ];
+ "enable_to_exit" -> "thread" [ label = "schedule_exit" ];
+ "scheduling" [label = "scheduling"];
+ "scheduling" -> "disable_to_switch" [ label = "irq_disable" ];
+ "switching" [label = "switching"];
+ "switching" -> "enable_to_exit" [ label = "irq_enable" ];
+ "thread" [label = "thread", color = green3];
+ "thread" -> "scheduling" [ label = "schedule_entry" ];
+ "thread" -> "thread" [ label = "irq_disable\nirq_enable" ];
+ { rank = min ;
+ "__init_thread";
+ "thread";
+ }
+}
diff --git a/tools/verification/models/sched/tss.dot b/tools/verification/models/sched/tss.dot
deleted file mode 100644
index 7dfa1d9121bb..000000000000
--- a/tools/verification/models/sched/tss.dot
+++ /dev/null
@@ -1,18 +0,0 @@
-digraph state_automaton {
- center = true;
- size = "7,11";
- {node [shape = plaintext] "sched"};
- {node [shape = plaintext, style=invis, label=""] "__init_thread"};
- {node [shape = ellipse] "thread"};
- {node [shape = plaintext] "thread"};
- "__init_thread" -> "thread";
- "sched" [label = "sched"];
- "sched" -> "sched" [ label = "sched_switch" ];
- "sched" -> "thread" [ label = "schedule_exit" ];
- "thread" [label = "thread", color = green3];
- "thread" -> "sched" [ label = "schedule_entry" ];
- { rank = min ;
- "__init_thread";
- "thread";
- }
-}
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (8 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-19 9:06 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 11/12] rv: Add nrp and sssw per-task monitors Gabriele Monaco
` (2 subsequent siblings)
12 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, linux-trace-kernel
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
DA monitor can be accessed from multiple cores simultaneously, this is
likely, for instance when dealing with per-task monitors reacting on
events that do not always occur on the CPU where the task is running.
This can cause race conditions where two events change the next state
and we see inconsistent values. E.g.:
[62] event_srs: 27: sleepable x sched_wakeup -> running (final)
[63] event_srs: 27: sleepable x sched_set_state_sleepable -> sleepable
[63] error_srs: 27: event sched_switch_suspend not expected in the state running
In this case the monitor fails because the event on CPU 62 wins against
the one on CPU 63, although the correct state should have been
sleepable, since the task get suspended.
Detect if the current state was modified by using try_cmpxchg while
storing the next value. If it was, try again reading the current state.
After a maximum number of failed retries, react as if it was an error
with invalid current state (we cannot determine it).
Monitors where this type of condition can occur must be able to account
for racing events in any possible order, as we cannot know the winner.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
include/linux/rv.h | 3 ++-
include/rv/da_monitor.h | 53 +++++++++++++++++++++++++++++++----------
2 files changed, 42 insertions(+), 14 deletions(-)
diff --git a/include/linux/rv.h b/include/linux/rv.h
index 3452b5e4b29e..a83a81ac6e46 100644
--- a/include/linux/rv.h
+++ b/include/linux/rv.h
@@ -7,7 +7,8 @@
#ifndef _LINUX_RV_H
#define _LINUX_RV_H
-#define MAX_DA_NAME_LEN 32
+#define MAX_DA_NAME_LEN 32
+#define MAX_DA_RETRY_RACING_EVENTS 3
#ifdef CONFIG_RV
/*
diff --git a/include/rv/da_monitor.h b/include/rv/da_monitor.h
index 215c3eb770cc..8b714e3085a5 100644
--- a/include/rv/da_monitor.h
+++ b/include/rv/da_monitor.h
@@ -82,16 +82,19 @@ static inline void da_monitor_reset_##name(struct da_monitor *da_mon) \
*/ \
static inline type da_monitor_curr_state_##name(struct da_monitor *da_mon) \
{ \
- return da_mon->curr_state; \
+ return READ_ONCE(da_mon->curr_state); \
} \
\
/* \
* da_monitor_set_state_##name - set the new current state \
+ * \
+ * return false without the change in case the state was modified elsewhere \
*/ \
-static inline void \
-da_monitor_set_state_##name(struct da_monitor *da_mon, enum states_##name state) \
+static inline bool \
+da_monitor_set_state_##name(struct da_monitor *da_mon, enum states_##name prev_state, \
+ enum states_##name state) \
{ \
- da_mon->curr_state = state; \
+ return try_cmpxchg(&da_mon->curr_state, &prev_state, state); \
} \
\
/* \
@@ -150,17 +153,29 @@ static inline bool da_monitor_handling_event_##name(struct da_monitor *da_mon)
* Event handler for implicit monitors. Implicit monitor is the one which the
* handler does not need to specify which da_monitor to manipulate. Examples
* of implicit monitor are the per_cpu or the global ones.
+ *
+ * Retry, in case there is a race while getting and setting the next state
+ * return an invalid current state if we run out of retries. The monitor should
+ * be able to handle various orders.
*/
#define DECLARE_DA_MON_MODEL_HANDLER_IMPLICIT(name, type) \
\
static inline bool \
da_event_##name(struct da_monitor *da_mon, enum events_##name event) \
{ \
- type curr_state = da_monitor_curr_state_##name(da_mon); \
- type next_state = model_get_next_state_##name(curr_state, event); \
+ bool changed; \
+ type curr_state, next_state; \
\
- if (next_state != INVALID_STATE) { \
- da_monitor_set_state_##name(da_mon, next_state); \
+ for (int i = 0; i < MAX_DA_RETRY_RACING_EVENTS; i++) { \
+ curr_state = da_monitor_curr_state_##name(da_mon); \
+ next_state = model_get_next_state_##name(curr_state, event); \
+ if (next_state == INVALID_STATE) \
+ break; \
+ changed = da_monitor_set_state_##name(da_mon, curr_state, next_state); \
+ if (unlikely(!changed)) { \
+ curr_state = -1; \
+ continue; \
+ } \
\
trace_event_##name(model_get_state_name_##name(curr_state), \
model_get_event_name_##name(event), \
@@ -181,17 +196,29 @@ da_event_##name(struct da_monitor *da_mon, enum events_##name event) \
/*
* Event handler for per_task monitors.
+ *
+ * Retry, in case there is a race while getting and setting the next state
+ * return an invalid current state if we run out of retries. The monitor should
+ * be able to handle various orders.
*/
#define DECLARE_DA_MON_MODEL_HANDLER_PER_TASK(name, type) \
\
static inline bool da_event_##name(struct da_monitor *da_mon, struct task_struct *tsk, \
enum events_##name event) \
{ \
- type curr_state = da_monitor_curr_state_##name(da_mon); \
- type next_state = model_get_next_state_##name(curr_state, event); \
- \
- if (next_state != INVALID_STATE) { \
- da_monitor_set_state_##name(da_mon, next_state); \
+ bool changed; \
+ type curr_state, next_state; \
+ \
+ for (int i = 0; i < MAX_DA_RETRY_RACING_EVENTS; i++) { \
+ curr_state = da_monitor_curr_state_##name(da_mon); \
+ next_state = model_get_next_state_##name(curr_state, event); \
+ if (next_state == INVALID_STATE) \
+ break; \
+ changed = da_monitor_set_state_##name(da_mon, curr_state, next_state); \
+ if (unlikely(!changed)) { \
+ curr_state = -1; \
+ continue; \
+ } \
\
trace_event_##name(tsk->pid, \
model_get_state_name_##name(curr_state), \
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 11/12] rv: Add nrp and sssw per-task monitors
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (9 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor Gabriele Monaco
2025-05-21 7:15 ` [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Nam Cao
12 siblings, 0 replies; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
Add 2 per-task monitors as part of the sched model:
* nrp: need-resched preempts
Monitor to ensure preemption requires need resched.
* sssw: set state sleep and wakeup
Monitor to ensure sched_set_state to sleepable leads to sleeping and
sleeping tasks require wakeup.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
Documentation/trace/rv/monitor_sched.rst | 117 ++++++++++++++++++
include/linux/rv.h | 2 +-
kernel/trace/rv/Kconfig | 2 +
kernel/trace/rv/Makefile | 2 +
kernel/trace/rv/monitors/nrp/Kconfig | 14 +++
kernel/trace/rv/monitors/nrp/nrp.c | 135 +++++++++++++++++++++
kernel/trace/rv/monitors/nrp/nrp.h | 51 ++++++++
kernel/trace/rv/monitors/nrp/nrp_trace.h | 15 +++
kernel/trace/rv/monitors/sssw/Kconfig | 15 +++
kernel/trace/rv/monitors/sssw/sssw.c | 115 ++++++++++++++++++
kernel/trace/rv/monitors/sssw/sssw.h | 64 ++++++++++
kernel/trace/rv/monitors/sssw/sssw_trace.h | 15 +++
kernel/trace/rv/rv_trace.h | 2 +
tools/verification/models/sched/nrp.dot | 19 +++
tools/verification/models/sched/sssw.dot | 24 ++++
15 files changed, 591 insertions(+), 1 deletion(-)
create mode 100644 kernel/trace/rv/monitors/nrp/Kconfig
create mode 100644 kernel/trace/rv/monitors/nrp/nrp.c
create mode 100644 kernel/trace/rv/monitors/nrp/nrp.h
create mode 100644 kernel/trace/rv/monitors/nrp/nrp_trace.h
create mode 100644 kernel/trace/rv/monitors/sssw/Kconfig
create mode 100644 kernel/trace/rv/monitors/sssw/sssw.c
create mode 100644 kernel/trace/rv/monitors/sssw/sssw.h
create mode 100644 kernel/trace/rv/monitors/sssw/sssw_trace.h
create mode 100644 tools/verification/models/sched/nrp.dot
create mode 100644 tools/verification/models/sched/sssw.dot
diff --git a/Documentation/trace/rv/monitor_sched.rst b/Documentation/trace/rv/monitor_sched.rst
index e4171aadef1c..97f0f1a10f43 100644
--- a/Documentation/trace/rv/monitor_sched.rst
+++ b/Documentation/trace/rv/monitor_sched.rst
@@ -198,6 +198,123 @@ real task switch occurs::
+-------------- | | <-------------+
+--------------------+
+Monitor nrp
+-----------
+
+The need resched preempts (nrp) monitor ensures preemption requires need
+resched. A preemption is any of the following types of ``sched_switch``:
+
+* ``sched_switch_preempt``:
+ a task is ``preempted``, this can happen after the need for ``rescheduling``
+ has been set, also in its ``lazy`` flavour (which doesn't make a different in
+ this monitor). ``need_resched`` can be set as a flag to the task or in the
+ per-core preemption count, either of them can trigger a preemption.
+* ``sched_switch_vain_preempt``:
+ a task goes through the scheduler from a preemption context but it is picked
+ as the next task to run, hence no real task switch occurs. Since we run the
+ scheduler, this clears the need to reschedule.
+
+This monitor ignores when the task is switched in, as this complicates things
+when different types of ``sched_switch`` occur (e.g. sleeping or yielding, here
+marked as ``sched_switch_other`` or ``sched_switch_vain``). The snroc monitor
+ensures a task is switched in before it can be switched out again.
+For this reason, the ``any_thread_running`` state does not imply that the
+monitored task is not running, simply it is not set for rescheduling::
+
+ |
+ |
+ v
+ sched_switch_other #=====================#
+ sched_switch_vain H H
+ +--------------------- H any_thread_running H
+ | H H
+ +--------------------> H H <+
+ #=====================# |
+ | |
+ | | sched_switch_preempt
+ | | sched_switch_vain_preempt
+ | sched_need_resched | sched_switch_other
+ | | sched_switch_vain
+ v |
+ sched_need_resched +---------------------+ |
+ +--------------------- | | |
+ | | rescheduling | |
+ +--------------------> | | -+
+ +---------------------+
+
+Monitor sssw
+------------
+
+The set state sleep and wakeup (sssw) monitor ensures ``sched_set_state`` to
+sleepable leads to sleeping and sleeping tasks require wakeup.
+It includes the following types of ``sched_switch``:
+
+* ``switch_suspend``:
+ a task puts itself to sleep, this can happen only after explicitly setting
+ the task to ``sleepable``. After a task is suspended, it needs to be woken up
+ (``waking`` state) before being switched in again.
+ Setting the task's state to ``sleepable`` can be reverted before switching if it
+ is woken up or set to ``runnable``.
+* ``switch_blocked``:
+ a special case of a ``switch_suspend`` where the task is waiting on a
+ sleeping RT lock (``PREEMPT_RT`` only), it is common to see wakeup and set
+ state events racing with each other and this leads the model to perceive this
+ type of switch when the task is not set to sleepable. This is a limitation of
+ the model in SMP system and workarounds may slow down the system.
+* ``switch_yield``:
+ a task explicitly calls the scheduler, this looks like a preemption as the
+ task is still runnable but the ``need_resched`` flag is not set. It can
+ happen after a ``yield`` system call or from the idle task. By definition,
+ a task cannot yield while ``sleepable`` as that would be a suspension.
+* ``switch_vain``:
+ a task explicitly calls the scheduler but it is picked as the next task to run,
+ hence no real task switch occurs. This can occur as a yield, which is not
+ valid when the task is sleepable. A special case of a yield is when a task in
+ ``TASK_INTERRUPTIBLE`` calls the scheduler while a signal is pending. The
+ task doesn't go through the usual blocking/waking and is set back to
+ runnable, the resulting switch looks like a yield.
+
+As for the nrp monitor, this monitor doesn't include a running state,
+``sleepable`` and ``runnable`` are only referring to the task's desired
+state, which could be scheduled out (e.g. due to preemption). However, it does
+include the event ``sched_switch_in`` to represent when a task is allowed to
+become running. This can be triggered also by preemption, but cannot occur
+after the task got to ``sleeping`` until a ``wakeup``::
+
+ sched_set_state_runnable
+ sched_wakeup
+ sched_switch_vain_preempt |
+ sched_switch_preempt |
+ sched_switch_yield v
+ sched_switch_vain #=============================================#
+ +-----------------> H H
+ | H H
+ +------------------ H runnable H
+ H H
+ +-----------------> H H
+ | #=============================================#
+ sched_set_state_runnable | | ^
+ sched_wakeup | sched_switch_blocking |
+ | sched_set_state_sleepable | |
+ | v | |
+ | +------------------------+ | sched_wakeup
+ +------------------ | | | |
+ | | | |
+ +-----------------> | sleepable | | |
+ | | | | |
+ +------------------ | | | |
+ sched_switch_in +------------------------+ | |
+ sched_switch_preempt | | |
+ sched_switch_vain_preempt sched_switch_suspend | |
+ sched_set_state_sleepable sched_switch_blocking | |
+ | | |
+ v | |
+ +------------------------+ | |
+ | sleeping | <-+ |
+ +------------------------+ |
+ | |
+ +----------------------------+
+
References
----------
diff --git a/include/linux/rv.h b/include/linux/rv.h
index a83a81ac6e46..20c89f5b9a5b 100644
--- a/include/linux/rv.h
+++ b/include/linux/rv.h
@@ -25,7 +25,7 @@ struct da_monitor {
* adding more or developing a dynamic method. So far, none of
* these are justified.
*/
-#define RV_PER_TASK_MONITORS 1
+#define RV_PER_TASK_MONITORS 3
#define RV_PER_TASK_MONITOR_INIT (RV_PER_TASK_MONITORS)
/*
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index a53a3eca9616..f106cf7b2fd3 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -34,6 +34,8 @@ source "kernel/trace/rv/monitors/scpd/Kconfig"
source "kernel/trace/rv/monitors/snep/Kconfig"
source "kernel/trace/rv/monitors/sncid/Kconfig"
source "kernel/trace/rv/monitors/sts/Kconfig"
+source "kernel/trace/rv/monitors/nrp/Kconfig"
+source "kernel/trace/rv/monitors/sssw/Kconfig"
# Add new monitors here
config RV_REACTORS
diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
index c609b72275cb..c076cf48af18 100644
--- a/kernel/trace/rv/Makefile
+++ b/kernel/trace/rv/Makefile
@@ -12,6 +12,8 @@ obj-$(CONFIG_RV_MON_SCPD) += monitors/scpd/scpd.o
obj-$(CONFIG_RV_MON_SNEP) += monitors/snep/snep.o
obj-$(CONFIG_RV_MON_SNCID) += monitors/sncid/sncid.o
obj-$(CONFIG_RV_MON_STS) += monitors/sts/sts.o
+obj-$(CONFIG_RV_MON_NRP) += monitors/nrp/nrp.o
+obj-$(CONFIG_RV_MON_SSSW) += monitors/sssw/sssw.o
# Add new monitors here
obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
diff --git a/kernel/trace/rv/monitors/nrp/Kconfig b/kernel/trace/rv/monitors/nrp/Kconfig
new file mode 100644
index 000000000000..726f34259a6d
--- /dev/null
+++ b/kernel/trace/rv/monitors/nrp/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_NRP
+ depends on RV
+ depends on RV_MON_SCHED
+ default y
+ select DA_MON_EVENTS_ID
+ bool "nrp monitor"
+ help
+ Monitor to ensure preemption requires need resched.
+ This monitor is part of the sched monitors collection.
+
+ For further information, see:
+ Documentation/trace/rv/monitor_sched.rst
diff --git a/kernel/trace/rv/monitors/nrp/nrp.c b/kernel/trace/rv/monitors/nrp/nrp.c
new file mode 100644
index 000000000000..9349026bb60b
--- /dev/null
+++ b/kernel/trace/rv/monitors/nrp/nrp.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+#include <rv/da_monitor.h>
+
+#define MODULE_NAME "nrp"
+
+#include <trace/events/sched.h>
+#include <rv_trace.h>
+#include <monitors/sched/sched.h>
+
+#include "nrp.h"
+
+static struct rv_monitor rv_nrp;
+DECLARE_DA_MON_PER_TASK(nrp, unsigned char);
+
+static void handle_sched_need_resched(void *data, struct task_struct *tsk,
+ int cpu, int tif)
+{
+ da_handle_event_nrp(tsk, sched_need_resched_nrp);
+}
+
+static void handle_schedule_entry(void *data, bool preempt, unsigned long ip)
+{
+ /*
+ * In theory, a preemption can only occur after the need_resched flag
+ * is set. In practice, however, we may see a preemption where the flag
+ * is not set. This can happen in one specific condition:
+ *
+ * need_resched
+ * preempt_schedule()
+ * preempt_schedule_irq()
+ * __schedule()
+ * !need_resched
+ * __schedule()
+ *
+ * In the situation above, we start a standard preemption (e.g. from
+ * preempt_enable when the flag is set), an interrupts occurs before we
+ * schedule and, on its exit path, it schedules, which clears the
+ * need_resched flag.
+ * When the preempted task runs again, we continue the standard
+ * preemption started earlier, although the flag is no longer set.
+ *
+ * The following workaround allows the model not to fail in this
+ * condition, but makes it weaker. In fact, we are not proving that:
+ * 1. we don't miss any event setting need_resched
+ * 2. we don't preempt when not required
+ *
+ * Ideally, we should find a way to narrow down the condition, however
+ * that's rather tricky without adding several tracepoints in
+ * undesired locations.
+ */
+ if (preempt && unlikely(!tif_need_resched()))
+ da_handle_event_nrp(current, sched_need_resched_nrp);
+}
+
+static void handle_sched_switch(void *data, bool preempt,
+ struct task_struct *prev,
+ struct task_struct *next,
+ unsigned int prev_state)
+{
+ if (preempt)
+ da_handle_start_event_nrp(prev, sched_switch_preempt_nrp);
+ else
+ da_handle_start_event_nrp(prev, sched_switch_other_nrp);
+}
+
+static void handle_sched_switch_vain(void *data, bool preempt,
+ struct task_struct *tsk,
+ unsigned int tsk_state)
+{
+ if (preempt)
+ da_handle_start_event_nrp(tsk, sched_switch_vain_preempt_nrp);
+ else
+ da_handle_start_event_nrp(tsk, sched_switch_vain_nrp);
+}
+
+static int enable_nrp(void)
+{
+ int retval;
+
+ retval = da_monitor_init_nrp();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("snep", sched_entry_tp, handle_schedule_entry);
+ rv_attach_trace_probe("nrp", sched_set_need_resched_tp, handle_sched_need_resched);
+ rv_attach_trace_probe("nrp", sched_switch, handle_sched_switch);
+ rv_attach_trace_probe("nrp", sched_switch_vain_tp, handle_sched_switch_vain);
+
+ return 0;
+}
+
+static void disable_nrp(void)
+{
+ rv_nrp.enabled = 0;
+
+ rv_detach_trace_probe("snep", sched_entry_tp, handle_schedule_entry);
+ rv_detach_trace_probe("nrp", sched_set_need_resched_tp, handle_sched_need_resched);
+ rv_detach_trace_probe("nrp", sched_switch, handle_sched_switch);
+ rv_detach_trace_probe("nrp", sched_switch_vain_tp, handle_sched_switch_vain);
+
+ da_monitor_destroy_nrp();
+}
+
+static struct rv_monitor rv_nrp = {
+ .name = "nrp",
+ .description = "need resched preempts.",
+ .enable = enable_nrp,
+ .disable = disable_nrp,
+ .reset = da_monitor_reset_all_nrp,
+ .enabled = 0,
+};
+
+static int __init register_nrp(void)
+{
+ return rv_register_monitor(&rv_nrp, &rv_sched);
+}
+
+static void __exit unregister_nrp(void)
+{
+ rv_unregister_monitor(&rv_nrp);
+}
+
+module_init(register_nrp);
+module_exit(unregister_nrp);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
+MODULE_DESCRIPTION("nrp: need resched preempts.");
diff --git a/kernel/trace/rv/monitors/nrp/nrp.h b/kernel/trace/rv/monitors/nrp/nrp.h
new file mode 100644
index 000000000000..51c0e23da72a
--- /dev/null
+++ b/kernel/trace/rv/monitors/nrp/nrp.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of nrp automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+enum states_nrp {
+ any_thread_running_nrp = 0,
+ rescheduling_nrp,
+ state_max_nrp
+};
+
+#define INVALID_STATE state_max_nrp
+
+enum events_nrp {
+ sched_need_resched_nrp = 0,
+ sched_switch_other_nrp,
+ sched_switch_preempt_nrp,
+ sched_switch_vain_nrp,
+ sched_switch_vain_preempt_nrp,
+ event_max_nrp
+};
+
+struct automaton_nrp {
+ char *state_names[state_max_nrp];
+ char *event_names[event_max_nrp];
+ unsigned char function[state_max_nrp][event_max_nrp];
+ unsigned char initial_state;
+ bool final_states[state_max_nrp];
+};
+
+static const struct automaton_nrp automaton_nrp = {
+ .state_names = {
+ "any_thread_running",
+ "rescheduling"
+ },
+ .event_names = {
+ "sched_need_resched",
+ "sched_switch_other",
+ "sched_switch_preempt",
+ "sched_switch_vain",
+ "sched_switch_vain_preempt"
+ },
+ .function = {
+ { rescheduling_nrp, any_thread_running_nrp, INVALID_STATE, any_thread_running_nrp, INVALID_STATE },
+ { rescheduling_nrp, any_thread_running_nrp, any_thread_running_nrp, any_thread_running_nrp, any_thread_running_nrp },
+ },
+ .initial_state = any_thread_running_nrp,
+ .final_states = { 1, 0 },
+};
diff --git a/kernel/trace/rv/monitors/nrp/nrp_trace.h b/kernel/trace/rv/monitors/nrp/nrp_trace.h
new file mode 100644
index 000000000000..2e13497de3b6
--- /dev/null
+++ b/kernel/trace/rv/monitors/nrp/nrp_trace.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_NRP
+DEFINE_EVENT(event_da_monitor_id, event_nrp,
+ TP_PROTO(int id, char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(id, state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor_id, error_nrp,
+ TP_PROTO(int id, char *state, char *event),
+ TP_ARGS(id, state, event));
+#endif /* CONFIG_RV_MON_NRP */
diff --git a/kernel/trace/rv/monitors/sssw/Kconfig b/kernel/trace/rv/monitors/sssw/Kconfig
new file mode 100644
index 000000000000..23b7eeb38bbf
--- /dev/null
+++ b/kernel/trace/rv/monitors/sssw/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_SSSW
+ depends on RV
+ depends on RV_MON_SCHED
+ default y
+ select DA_MON_EVENTS_ID
+ bool "sssw monitor"
+ help
+ Monitor to ensure sched_set_state to sleepable leads to sleeping and
+ sleeping tasks require wakeup.
+ This monitor is part of the sched monitors collection.
+
+ For further information, see:
+ Documentation/trace/rv/monitor_sched.rst
diff --git a/kernel/trace/rv/monitors/sssw/sssw.c b/kernel/trace/rv/monitors/sssw/sssw.c
new file mode 100644
index 000000000000..efa695dead36
--- /dev/null
+++ b/kernel/trace/rv/monitors/sssw/sssw.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+#include <rv/da_monitor.h>
+
+#define MODULE_NAME "sssw"
+
+#include <trace/events/sched.h>
+#include <rv_trace.h>
+#include <monitors/sched/sched.h>
+
+#include "sssw.h"
+
+static struct rv_monitor rv_sssw;
+DECLARE_DA_MON_PER_TASK(sssw, unsigned char);
+
+static void handle_sched_set_state(void *data, struct task_struct *tsk,
+ int state, bool from_signal)
+{
+ if (state == TASK_RUNNING)
+ da_handle_start_event_sssw(tsk, sched_set_state_runnable_sssw);
+ else
+ da_handle_event_sssw(tsk, sched_set_state_sleepable_sssw);
+}
+
+static void handle_sched_switch(void *data, bool preempt,
+ struct task_struct *prev,
+ struct task_struct *next,
+ unsigned int prev_state)
+{
+ if (preempt)
+ da_handle_event_sssw(prev, sched_switch_preempt_sssw);
+ else if (prev_state == TASK_RUNNING)
+ da_handle_start_event_sssw(prev, sched_switch_yield_sssw);
+ else if (prev_state == TASK_RTLOCK_WAIT)
+ /* special case of sleeping task with racy conditions */
+ da_handle_event_sssw(prev, sched_switch_blocking_sssw);
+ else
+ da_handle_event_sssw(prev, sched_switch_suspend_sssw);
+ da_handle_event_sssw(next, sched_switch_in_sssw);
+}
+
+static void handle_sched_switch_vain(void *data, bool preempt,
+ struct task_struct *tsk,
+ unsigned int tsk_state)
+{
+ if (preempt)
+ da_handle_event_sssw(tsk, sched_switch_vain_preempt_sssw);
+ else
+ da_handle_start_event_sssw(tsk, sched_switch_vain_sssw);
+}
+
+static void handle_sched_wakeup(void *data, struct task_struct *p)
+{
+ da_handle_start_event_sssw(p, sched_wakeup_sssw);
+}
+
+static int enable_sssw(void)
+{
+ int retval;
+
+ retval = da_monitor_init_sssw();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("sssw", sched_set_state_tp, handle_sched_set_state);
+ rv_attach_trace_probe("sssw", sched_switch, handle_sched_switch);
+ rv_attach_trace_probe("sssw", sched_switch_vain_tp, handle_sched_switch_vain);
+ rv_attach_trace_probe("sssw", sched_wakeup, handle_sched_wakeup);
+
+ return 0;
+}
+
+static void disable_sssw(void)
+{
+ rv_sssw.enabled = 0;
+
+ rv_detach_trace_probe("sssw", sched_set_state_tp, handle_sched_set_state);
+ rv_detach_trace_probe("sssw", sched_switch, handle_sched_switch);
+ rv_detach_trace_probe("sssw", sched_switch_vain_tp, handle_sched_switch_vain);
+ rv_detach_trace_probe("sssw", sched_wakeup, handle_sched_wakeup);
+
+ da_monitor_destroy_sssw();
+}
+
+static struct rv_monitor rv_sssw = {
+ .name = "sssw",
+ .description = "set state sleep and wakeup.",
+ .enable = enable_sssw,
+ .disable = disable_sssw,
+ .reset = da_monitor_reset_all_sssw,
+ .enabled = 0,
+};
+
+static int __init register_sssw(void)
+{
+ return rv_register_monitor(&rv_sssw, &rv_sched);
+}
+
+static void __exit unregister_sssw(void)
+{
+ rv_unregister_monitor(&rv_sssw);
+}
+
+module_init(register_sssw);
+module_exit(unregister_sssw);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
+MODULE_DESCRIPTION("sssw: set state sleep and wakeup.");
diff --git a/kernel/trace/rv/monitors/sssw/sssw.h b/kernel/trace/rv/monitors/sssw/sssw.h
new file mode 100644
index 000000000000..ae95a03bf2f3
--- /dev/null
+++ b/kernel/trace/rv/monitors/sssw/sssw.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of sssw automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+enum states_sssw {
+ runnanble_sssw = 0,
+ sleepable_sssw,
+ sleeping_sssw,
+ state_max_sssw
+};
+
+#define INVALID_STATE state_max_sssw
+
+enum events_sssw {
+ sched_set_state_runnable_sssw = 0,
+ sched_set_state_sleepable_sssw,
+ sched_switch_blocking_sssw,
+ sched_switch_in_sssw,
+ sched_switch_preempt_sssw,
+ sched_switch_suspend_sssw,
+ sched_switch_vain_sssw,
+ sched_switch_vain_preempt_sssw,
+ sched_switch_yield_sssw,
+ sched_wakeup_sssw,
+ event_max_sssw
+};
+
+struct automaton_sssw {
+ char *state_names[state_max_sssw];
+ char *event_names[event_max_sssw];
+ unsigned char function[state_max_sssw][event_max_sssw];
+ unsigned char initial_state;
+ bool final_states[state_max_sssw];
+};
+
+static const struct automaton_sssw automaton_sssw = {
+ .state_names = {
+ "runnanble",
+ "sleepable",
+ "sleeping"
+ },
+ .event_names = {
+ "sched_set_state_runnable",
+ "sched_set_state_sleepable",
+ "sched_switch_blocking",
+ "sched_switch_in",
+ "sched_switch_preempt",
+ "sched_switch_suspend",
+ "sched_switch_vain",
+ "sched_switch_vain_preempt",
+ "sched_switch_yield",
+ "sched_wakeup"
+ },
+ .function = {
+ { runnanble_sssw, sleepable_sssw, sleeping_sssw, runnanble_sssw, runnanble_sssw, INVALID_STATE, runnanble_sssw, runnanble_sssw, runnanble_sssw, runnanble_sssw },
+ { runnanble_sssw, sleepable_sssw, sleeping_sssw, sleepable_sssw, sleepable_sssw, sleeping_sssw, INVALID_STATE, sleepable_sssw, INVALID_STATE, runnanble_sssw },
+ { INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE, INVALID_STATE, runnanble_sssw },
+ },
+ .initial_state = runnanble_sssw,
+ .final_states = { 1, 0, 0 },
+};
diff --git a/kernel/trace/rv/monitors/sssw/sssw_trace.h b/kernel/trace/rv/monitors/sssw/sssw_trace.h
new file mode 100644
index 000000000000..6c03cfc6960b
--- /dev/null
+++ b/kernel/trace/rv/monitors/sssw/sssw_trace.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_SSSW
+DEFINE_EVENT(event_da_monitor_id, event_sssw,
+ TP_PROTO(int id, char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(id, state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor_id, error_sssw,
+ TP_PROTO(int id, char *state, char *event),
+ TP_ARGS(id, state, event));
+#endif /* CONFIG_RV_MON_SSSW */
diff --git a/kernel/trace/rv/rv_trace.h b/kernel/trace/rv/rv_trace.h
index a5e1c52e2992..d12ab74dcabc 100644
--- a/kernel/trace/rv/rv_trace.h
+++ b/kernel/trace/rv/rv_trace.h
@@ -124,6 +124,8 @@ DECLARE_EVENT_CLASS(error_da_monitor_id,
#include <monitors/wwnr/wwnr_trace.h>
#include <monitors/snroc/snroc_trace.h>
+#include <monitors/nrp/nrp_trace.h>
+#include <monitors/sssw/sssw_trace.h>
// Add new monitors based on CONFIG_DA_MON_EVENTS_ID here
#endif /* CONFIG_DA_MON_EVENTS_ID */
diff --git a/tools/verification/models/sched/nrp.dot b/tools/verification/models/sched/nrp.dot
new file mode 100644
index 000000000000..6aa7604e98fc
--- /dev/null
+++ b/tools/verification/models/sched/nrp.dot
@@ -0,0 +1,19 @@
+digraph state_automaton {
+ center = true;
+ size = "7,11";
+ {node [shape = plaintext, style=invis, label=""] "__init_any_thread_running"};
+ {node [shape = doublecircle] "any_thread_running"};
+ {node [shape = circle] "any_thread_running"};
+ {node [shape = circle] "rescheduling"};
+ "__init_any_thread_running" -> "any_thread_running";
+ "any_thread_running" [label = "any_thread_running", color = green3];
+ "any_thread_running" -> "any_thread_running" [ label = "sched_switch_other\nsched_switch_vain" ];
+ "any_thread_running" -> "rescheduling" [ label = "sched_need_resched" ];
+ "rescheduling" [label = "rescheduling"];
+ "rescheduling" -> "any_thread_running" [ label = "sched_switch_preempt\nsched_switch_vain_preempt\nsched_switch_other\nsched_switch_vain" ];
+ "rescheduling" -> "rescheduling" [ label = "sched_need_resched" ];
+ { rank = min ;
+ "__init_any_thread_running";
+ "any_thread_running";
+ }
+}
diff --git a/tools/verification/models/sched/sssw.dot b/tools/verification/models/sched/sssw.dot
new file mode 100644
index 000000000000..362e783f2bd5
--- /dev/null
+++ b/tools/verification/models/sched/sssw.dot
@@ -0,0 +1,24 @@
+digraph state_automaton {
+ center = true;
+ size = "7,11";
+ {node [shape = plaintext, style=invis, label=""] "__init_runnanble"};
+ {node [shape = doublecircle] "runnanble"};
+ {node [shape = circle] "runnanble"};
+ {node [shape = circle] "sleepable"};
+ {node [shape = circle] "sleeping"};
+ "__init_runnanble" -> "runnanble";
+ "runnanble" [label = "runnanble", color = green3];
+ "runnanble" -> "runnanble" [ label = "sched_set_state_runnable\nsched_wakeup\nsched_switch_in\nsched_switch_vain\nsched_switch_yield\nsched_switch_preempt\nsched_switch_vain_preempt" ];
+ "runnanble" -> "sleepable" [ label = "sched_set_state_sleepable" ];
+ "runnanble" -> "sleeping" [ label = "sched_switch_blocking" ];
+ "sleepable" [label = "sleepable"];
+ "sleepable" -> "runnanble" [ label = "sched_set_state_runnable\nsched_wakeup" ];
+ "sleepable" -> "sleepable" [ label = "sched_set_state_sleepable\nsched_switch_in\nsched_switch_preempt\nsched_switch_vain_preempt" ];
+ "sleepable" -> "sleeping" [ label = "sched_switch_suspend\nsched_switch_blocking" ];
+ "sleeping" [label = "sleeping"];
+ "sleeping" -> "runnanble" [ label = "sched_wakeup" ];
+ { rank = min ;
+ "__init_runnanble";
+ "runnanble";
+ }
+}
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (10 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 11/12] rv: Add nrp and sssw per-task monitors Gabriele Monaco
@ 2025-05-14 8:43 ` Gabriele Monaco
2025-05-27 13:37 ` Nam Cao
2025-05-21 7:15 ` [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Nam Cao
12 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-14 8:43 UTC (permalink / raw)
To: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc
Cc: Gabriele Monaco, Ingo Molnar, Peter Zijlstra, Nam Cao,
Tomas Glozar, Juri Lelli
Add a per-cpu monitor as part of the sched model:
* opid: operations with preemption and irq disabled
Monitor to ensure wakeup and need_resched occur with irq and
preemption disabled or in irq handlers.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
Documentation/trace/rv/monitor_sched.rst | 48 +++++++
kernel/trace/rv/Kconfig | 1 +
kernel/trace/rv/Makefile | 1 +
kernel/trace/rv/monitors/opid/Kconfig | 17 +++
kernel/trace/rv/monitors/opid/opid.c | 151 +++++++++++++++++++++
kernel/trace/rv/monitors/opid/opid.h | 64 +++++++++
kernel/trace/rv/monitors/opid/opid_trace.h | 15 ++
kernel/trace/rv/rv_trace.h | 1 +
tools/verification/models/sched/opid.dot | 35 +++++
9 files changed, 333 insertions(+)
create mode 100644 kernel/trace/rv/monitors/opid/Kconfig
create mode 100644 kernel/trace/rv/monitors/opid/opid.c
create mode 100644 kernel/trace/rv/monitors/opid/opid.h
create mode 100644 kernel/trace/rv/monitors/opid/opid_trace.h
create mode 100644 tools/verification/models/sched/opid.dot
diff --git a/Documentation/trace/rv/monitor_sched.rst b/Documentation/trace/rv/monitor_sched.rst
index 97f0f1a10f43..f044bca7ac31 100644
--- a/Documentation/trace/rv/monitor_sched.rst
+++ b/Documentation/trace/rv/monitor_sched.rst
@@ -315,6 +315,54 @@ after the task got to ``sleeping`` until a ``wakeup``::
| |
+----------------------------+
+Monitor opid
+------------
+
+The operations with preemption and irq disabled (opid) monitor ensures
+operations like ``wakeup`` and ``need_resched`` occur with interrupts and
+preemption disabled or during IRQs, in such case preemption may not be disabled
+explicitly.
+``need_resched`` can be set by some RCU internals functions, in which case it
+doesn't match a task wakeup and might occur with only interrupts disabled::
+
+ | sched_need_resched
+ | sched_waking
+ | irq_entry
+ | +--------------------+
+ v v |
+ +------------------------------------------------------+
+ +----------- | disabled | <+
+ | +------------------------------------------------------+ |
+ | | ^ |
+ | | preempt_disable sched_need_resched |
+ | preempt_enable | +--------------------+ |
+ | v | v | |
+ | +------------------------------------------------------+ |
+ | | irq_disabled | |
+ | +------------------------------------------------------+ |
+ | | | ^ |
+ | irq_entry | | |
+ | sched_need_resched v | irq_disable |
+ | sched_waking +--------------+ | | |
+ | +----- | | irq_enable | |
+ | | | in_irq | | | |
+ | +----> | | | | |
+ | +--------------+ | | irq_disable
+ | | | | |
+ | irq_enable | irq_enable | | |
+ | v v | |
+ | #======================================================# |
+ | H enabled H |
+ | #======================================================# |
+ | | ^ ^ preempt_enable | |
+ | preempt_disable preempt_enable +--------------------+ |
+ | v | |
+ | +------------------+ | |
+ +----------> | preempt_disabled | -+ |
+ +------------------+ |
+ | |
+ +-------------------------------------------------------+
+
References
----------
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index f106cf7b2fd3..9ebb80931a9f 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -36,6 +36,7 @@ source "kernel/trace/rv/monitors/sncid/Kconfig"
source "kernel/trace/rv/monitors/sts/Kconfig"
source "kernel/trace/rv/monitors/nrp/Kconfig"
source "kernel/trace/rv/monitors/sssw/Kconfig"
+source "kernel/trace/rv/monitors/opid/Kconfig"
# Add new monitors here
config RV_REACTORS
diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
index c076cf48af18..0eca5e77d0d2 100644
--- a/kernel/trace/rv/Makefile
+++ b/kernel/trace/rv/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_RV_MON_SNCID) += monitors/sncid/sncid.o
obj-$(CONFIG_RV_MON_STS) += monitors/sts/sts.o
obj-$(CONFIG_RV_MON_NRP) += monitors/nrp/nrp.o
obj-$(CONFIG_RV_MON_SSSW) += monitors/sssw/sssw.o
+obj-$(CONFIG_RV_MON_OPID) += monitors/opid/opid.o
# Add new monitors here
obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
diff --git a/kernel/trace/rv/monitors/opid/Kconfig b/kernel/trace/rv/monitors/opid/Kconfig
new file mode 100644
index 000000000000..c59d51654cd1
--- /dev/null
+++ b/kernel/trace/rv/monitors/opid/Kconfig
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_OPID
+ depends on RV
+ depends on IRQSOFF_TRACER
+ depends on PREEMPT_TRACER
+ depends on RV_MON_SCHED
+ default y
+ select DA_MON_EVENTS_IMPLICIT
+ bool "opid monitor"
+ help
+ Monitor to ensure operations like wakeup and need resched occur with
+ interrupts and preemption disabled or during IRQs, where preemption
+ may not be disabled explicitly.
+
+ For further information, see:
+ Documentation/trace/rv/monitor_sched.rst
diff --git a/kernel/trace/rv/monitors/opid/opid.c b/kernel/trace/rv/monitors/opid/opid.c
new file mode 100644
index 000000000000..d8732d681753
--- /dev/null
+++ b/kernel/trace/rv/monitors/opid/opid.c
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+#include <rv/da_monitor.h>
+
+#define MODULE_NAME "opid"
+
+#include <trace/events/sched.h>
+#include <trace/events/irq.h>
+#include <trace/events/preemptirq.h>
+#include <rv_trace.h>
+#include <monitors/sched/sched.h>
+
+#include "opid.h"
+
+static struct rv_monitor rv_opid;
+DECLARE_DA_MON_PER_CPU(opid, unsigned char);
+
+#ifdef CONFIG_X86_LOCAL_APIC
+#include <asm/trace/irq_vectors.h>
+
+static void handle_vector_irq_entry(void *data, int vector)
+{
+ da_handle_event_opid(irq_entry_opid);
+}
+
+static void attach_vector_irq(void) {
+ rv_attach_trace_probe("opid", local_timer_entry, handle_vector_irq_entry);
+}
+
+static void detach_vector_irq(void) {
+ rv_detach_trace_probe("opid", local_timer_entry, handle_vector_irq_entry);
+}
+
+#else
+/* We assume irq_entry tracepoints are sufficient on other architectures */
+static void attach_vector_irq() { }
+static void detach_vector_irq() { }
+#endif
+
+static void handle_irq_disable(void *data, unsigned long ip, unsigned long parent_ip)
+{
+ da_handle_event_opid(irq_disable_opid);
+}
+
+static void handle_irq_enable(void *data, unsigned long ip, unsigned long parent_ip)
+{
+ da_handle_event_opid(irq_enable_opid);
+}
+
+static void handle_irq_entry(void *data, int irq, struct irqaction *action)
+{
+ da_handle_event_opid(irq_entry_opid);
+}
+
+static void handle_preempt_disable(void *data, unsigned long ip, unsigned long parent_ip)
+{
+ da_handle_event_opid(preempt_disable_opid);
+}
+
+static void handle_preempt_enable(void *data, unsigned long ip, unsigned long parent_ip)
+{
+ da_handle_event_opid(preempt_enable_opid);
+}
+
+static void handle_sched_need_resched(void *data, struct task_struct *tsk, int cpu, int tif)
+{
+ if(in_irq())
+ da_handle_event_opid(sched_need_resched_opid);
+ else
+ da_handle_start_event_opid(sched_need_resched_opid);
+}
+
+static void handle_sched_waking(void *data, struct task_struct *p)
+{
+ if(in_irq())
+ da_handle_event_opid(sched_waking_opid);
+ else
+ da_handle_start_event_opid(sched_waking_opid);
+}
+
+static int enable_opid(void)
+{
+ int retval;
+
+ retval = da_monitor_init_opid();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("opid", irq_disable, handle_irq_disable);
+ rv_attach_trace_probe("opid", irq_enable, handle_irq_enable);
+ rv_attach_trace_probe("opid", irq_handler_entry, handle_irq_entry);
+ rv_attach_trace_probe("opid", preempt_disable, handle_preempt_disable);
+ rv_attach_trace_probe("opid", preempt_enable, handle_preempt_enable);
+ rv_attach_trace_probe("opid", sched_set_need_resched_tp, handle_sched_need_resched);
+ rv_attach_trace_probe("opid", sched_waking, handle_sched_waking);
+ attach_vector_irq();
+
+ return 0;
+}
+
+static void disable_opid(void)
+{
+ rv_opid.enabled = 0;
+
+ rv_detach_trace_probe("opid", irq_disable, handle_irq_disable);
+ rv_detach_trace_probe("opid", irq_enable, handle_irq_enable);
+ rv_detach_trace_probe("opid", irq_handler_entry, handle_irq_entry);
+ rv_detach_trace_probe("opid", preempt_disable, handle_preempt_disable);
+ rv_detach_trace_probe("opid", preempt_enable, handle_preempt_enable);
+ rv_detach_trace_probe("opid", sched_set_need_resched_tp, handle_sched_need_resched);
+ rv_detach_trace_probe("opid", sched_waking, handle_sched_waking);
+ detach_vector_irq();
+
+ da_monitor_destroy_opid();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_opid = {
+ .name = "opid",
+ .description = "operations with preemption and irq disabled.",
+ .enable = enable_opid,
+ .disable = disable_opid,
+ .reset = da_monitor_reset_all_opid,
+ .enabled = 0,
+};
+
+static int __init register_opid(void)
+{
+ rv_register_monitor(&rv_opid, &rv_sched);
+ return 0;
+}
+
+static void __exit unregister_opid(void)
+{
+ rv_unregister_monitor(&rv_opid);
+}
+
+module_init(register_opid);
+module_exit(unregister_opid);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
+MODULE_DESCRIPTION("opid: operations with preemption and irq disabled.");
diff --git a/kernel/trace/rv/monitors/opid/opid.h b/kernel/trace/rv/monitors/opid/opid.h
new file mode 100644
index 000000000000..4c6d4a3964c5
--- /dev/null
+++ b/kernel/trace/rv/monitors/opid/opid.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of opid automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+enum states_opid {
+ disabled_opid = 0,
+ enabled_opid,
+ in_irq_opid,
+ irq_disabled_opid,
+ preempt_disabled_opid,
+ state_max_opid
+};
+
+#define INVALID_STATE state_max_opid
+
+enum events_opid {
+ irq_disable_opid = 0,
+ irq_enable_opid,
+ irq_entry_opid,
+ preempt_disable_opid,
+ preempt_enable_opid,
+ sched_need_resched_opid,
+ sched_waking_opid,
+ event_max_opid
+};
+
+struct automaton_opid {
+ char *state_names[state_max_opid];
+ char *event_names[event_max_opid];
+ unsigned char function[state_max_opid][event_max_opid];
+ unsigned char initial_state;
+ bool final_states[state_max_opid];
+};
+
+static const struct automaton_opid automaton_opid = {
+ .state_names = {
+ "disabled",
+ "enabled",
+ "in_irq",
+ "irq_disabled",
+ "preempt_disabled"
+ },
+ .event_names = {
+ "irq_disable",
+ "irq_enable",
+ "irq_entry",
+ "preempt_disable",
+ "preempt_enable",
+ "sched_need_resched",
+ "sched_waking"
+ },
+ .function = {
+ { INVALID_STATE, preempt_disabled_opid, disabled_opid, INVALID_STATE, irq_disabled_opid, disabled_opid, disabled_opid },
+ { irq_disabled_opid, INVALID_STATE, INVALID_STATE, preempt_disabled_opid, enabled_opid, INVALID_STATE, INVALID_STATE },
+ { INVALID_STATE, enabled_opid, INVALID_STATE, INVALID_STATE, INVALID_STATE, in_irq_opid, in_irq_opid },
+ { INVALID_STATE, enabled_opid, in_irq_opid, disabled_opid, INVALID_STATE, irq_disabled_opid, INVALID_STATE },
+ { disabled_opid, INVALID_STATE, INVALID_STATE, INVALID_STATE, enabled_opid, INVALID_STATE, INVALID_STATE },
+ },
+ .initial_state = disabled_opid,
+ .final_states = { 0, 1, 0, 0, 0 },
+};
diff --git a/kernel/trace/rv/monitors/opid/opid_trace.h b/kernel/trace/rv/monitors/opid/opid_trace.h
new file mode 100644
index 000000000000..3df6ff955c30
--- /dev/null
+++ b/kernel/trace/rv/monitors/opid/opid_trace.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_OPID
+DEFINE_EVENT(event_da_monitor, event_opid,
+ TP_PROTO(char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor, error_opid,
+ TP_PROTO(char *state, char *event),
+ TP_ARGS(state, event));
+#endif /* CONFIG_RV_MON_OPID */
diff --git a/kernel/trace/rv/rv_trace.h b/kernel/trace/rv/rv_trace.h
index d12ab74dcabc..5d3c5c3f7545 100644
--- a/kernel/trace/rv/rv_trace.h
+++ b/kernel/trace/rv/rv_trace.h
@@ -63,6 +63,7 @@ DECLARE_EVENT_CLASS(error_da_monitor,
#include <monitors/snep/snep_trace.h>
#include <monitors/sncid/sncid_trace.h>
#include <monitors/sts/sts_trace.h>
+#include <monitors/opid/opid_trace.h>
// Add new monitors based on CONFIG_DA_MON_EVENTS_IMPLICIT here
#endif /* CONFIG_DA_MON_EVENTS_IMPLICIT */
diff --git a/tools/verification/models/sched/opid.dot b/tools/verification/models/sched/opid.dot
new file mode 100644
index 000000000000..2d5e1df3405f
--- /dev/null
+++ b/tools/verification/models/sched/opid.dot
@@ -0,0 +1,35 @@
+digraph state_automaton {
+ center = true;
+ size = "7,11";
+ {node [shape = plaintext, style=invis, label=""] "__init_disabled"};
+ {node [shape = circle] "disabled"};
+ {node [shape = doublecircle] "enabled"};
+ {node [shape = circle] "enabled"};
+ {node [shape = circle] "in_irq"};
+ {node [shape = circle] "irq_disabled"};
+ {node [shape = circle] "preempt_disabled"};
+ "__init_disabled" -> "disabled";
+ "disabled" [label = "disabled"];
+ "disabled" -> "disabled" [ label = "sched_need_resched\nsched_waking\nirq_entry" ];
+ "disabled" -> "irq_disabled" [ label = "preempt_enable" ];
+ "disabled" -> "preempt_disabled" [ label = "irq_enable" ];
+ "enabled" [label = "enabled", color = green3];
+ "enabled" -> "enabled" [ label = "preempt_enable" ];
+ "enabled" -> "irq_disabled" [ label = "irq_disable" ];
+ "enabled" -> "preempt_disabled" [ label = "preempt_disable" ];
+ "in_irq" [label = "in_irq"];
+ "in_irq" -> "enabled" [ label = "irq_enable" ];
+ "in_irq" -> "in_irq" [ label = "sched_need_resched\nsched_waking" ];
+ "irq_disabled" [label = "irq_disabled"];
+ "irq_disabled" -> "disabled" [ label = "preempt_disable" ];
+ "irq_disabled" -> "enabled" [ label = "irq_enable" ];
+ "irq_disabled" -> "in_irq" [ label = "irq_entry" ];
+ "irq_disabled" -> "irq_disabled" [ label = "sched_need_resched" ];
+ "preempt_disabled" [label = "preempt_disabled"];
+ "preempt_disabled" -> "disabled" [ label = "irq_disable" ];
+ "preempt_disabled" -> "enabled" [ label = "preempt_enable" ];
+ { rank = min ;
+ "__init_disabled";
+ "disabled";
+ }
+}
--
2.49.0
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 03/12] rv: Add da_handle_start_run_event_ to per-task monitors
2025-05-14 8:43 ` [RFC PATCH v2 03/12] rv: Add da_handle_start_run_event_ to per-task monitors Gabriele Monaco
@ 2025-05-19 8:02 ` Nam Cao
0 siblings, 0 replies; 38+ messages in thread
From: Nam Cao @ 2025-05-19 8:02 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, Ingo Molnar,
Peter Zijlstra, Tomas Glozar, Juri Lelli
On Wed, May 14, 2025 at 10:43:05AM +0200, Gabriele Monaco wrote:
> The RV da_monitor API allows to start monitors in two ways:
> da_handle_start_event_NAME and da_handle_start_run_event_NAME.
> The former is used when the event is followed by the initial state of
> the module, so we ignore the event but we know the monitor is in the
> initial state and can start monitoring, the latter can be used if the
> event can only occur in the initial state, so we do handle the event as
> if the monitor was in the initial state.
> This latter API is defined for implicit monitors but not per-task ones.
>
> Define da_handle_start_run_event_NAME macro also for per-task monitors.
>
> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 04/12] rv: Remove trailing whitespace from tracepoint string
2025-05-14 8:43 ` [RFC PATCH v2 04/12] rv: Remove trailing whitespace from tracepoint string Gabriele Monaco
@ 2025-05-19 8:05 ` Nam Cao
0 siblings, 0 replies; 38+ messages in thread
From: Nam Cao @ 2025-05-19 8:05 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, Masami Hiramatsu,
linux-trace-kernel, Ingo Molnar, Peter Zijlstra, Tomas Glozar,
Juri Lelli
On Wed, May 14, 2025 at 10:43:06AM +0200, Gabriele Monaco wrote:
> RV event tracepoints print a line with the format:
> "event_xyz: S0 x event -> S1 "
> "event_xyz: S1 x event -> S0 (final)"
>
> While printing an event leading to a non-final state, the line
> has a trailing white space (visible above before the closing ").
>
> Adapt the format string not to print the trailing whitespace if we are
> not printing "(final)".
>
> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 05/12] rv: Return init error when registering monitors
2025-05-14 8:43 ` [RFC PATCH v2 05/12] rv: Return init error when registering monitors Gabriele Monaco
@ 2025-05-19 8:06 ` Nam Cao
0 siblings, 0 replies; 38+ messages in thread
From: Nam Cao @ 2025-05-19 8:06 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, Masami Hiramatsu,
linux-trace-kernel, Ingo Molnar, Peter Zijlstra, Tomas Glozar,
Juri Lelli
On Wed, May 14, 2025 at 10:43:07AM +0200, Gabriele Monaco wrote:
> Monitors generated with dot2k have their registration function (the one
> called during monitor initialisation) return always 0, even if the
> registration failed on RV side.
> This can hide potential errors.
>
> Return the value returned by the RV register function.
>
> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Reviewed-by: Nam Cao <namcao@linutronix.de>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 06/12] sched: Adapt sched tracepoints for RV task model
2025-05-14 8:43 ` [RFC PATCH v2 06/12] sched: Adapt sched tracepoints for RV task model Gabriele Monaco
@ 2025-05-19 8:29 ` Nam Cao
2025-05-19 8:41 ` Gabriele Monaco
0 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-05-19 8:29 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Masami Hiramatsu, linux-trace-kernel, Tomas Glozar, Juri Lelli
On Wed, May 14, 2025 at 10:43:08AM +0200, Gabriele Monaco wrote:
> DECLARE_TRACE_CONDITION(sched_set_state_tp,
> - TP_PROTO(struct task_struct *tsk, int state),
> - TP_ARGS(tsk, state),
> + TP_PROTO(struct task_struct *tsk, int state, bool from_signal),
> + TP_ARGS(tsk, state, from_signal),
> TP_CONDITION(!!(tsk->__state) != !!state));
Doesn't this break the build? Because the monitors still use the old
signatures?
I understand you adapt the monitor to this new signature in a follow-up
patch. But every commits in the series should be buildable, otherwise you
break "git bisect".
Best regards,
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 06/12] sched: Adapt sched tracepoints for RV task model
2025-05-19 8:29 ` Nam Cao
@ 2025-05-19 8:41 ` Gabriele Monaco
2025-05-19 8:43 ` Nam Cao
0 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-19 8:41 UTC (permalink / raw)
To: Nam Cao
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Masami Hiramatsu, linux-trace-kernel, Tomas Glozar, Juri Lelli
On Mon, 2025-05-19 at 10:29 +0200, Nam Cao wrote:
> On Wed, May 14, 2025 at 10:43:08AM +0200, Gabriele Monaco wrote:
> > DECLARE_TRACE_CONDITION(sched_set_state_tp,
> > - TP_PROTO(struct task_struct *tsk, int state),
> > - TP_ARGS(tsk, state),
> > + TP_PROTO(struct task_struct *tsk, int state, bool
> > from_signal),
> > + TP_ARGS(tsk, state, from_signal),
> > TP_CONDITION(!!(tsk->__state) != !!state));
>
> Doesn't this break the build? Because the monitors still use the old
> signatures?
>
> I understand you adapt the monitor to this new signature in a follow-
> up
> patch. But every commits in the series should be buildable, otherwise
> you
> break "git bisect".
>
Yeah good point, do you suggest at least fixing signatures in monitors
inside this commit?
I can keep the other commits to actually fix/adapt monitors but at
least allow building from here.
Thanks,
Gabriele
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 07/12] rv: Adapt the sco monitor to the new set_state
2025-05-14 8:43 ` [RFC PATCH v2 07/12] rv: Adapt the sco monitor to the new set_state Gabriele Monaco
@ 2025-05-19 8:42 ` Nam Cao
2025-05-19 9:04 ` Gabriele Monaco
0 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-05-19 8:42 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc, Ingo Molnar, Peter Zijlstra,
Tomas Glozar, Juri Lelli
On Wed, May 14, 2025 at 10:43:09AM +0200, Gabriele Monaco wrote:
> .function = {
> - { thread_context_sco, scheduling_context_sco, INVALID_STATE },
> - { INVALID_STATE, INVALID_STATE, thread_context_sco },
> + { thread_context_sco, INVALID_STATE, scheduling_context_sco, INVALID_STATE },
> + { INVALID_STATE, scheduling_context_sco, INVALID_STATE, thread_context_sco },
This is over the 100 column limit.
I know it is not your fault, this is generated. Back when I was playing
with DA monitor, I made a patch to fix this. Maybe you could include it in
your series?
From b4fb648398a29a9c0d8e08bd12394978d3948a5e Mon Sep 17 00:00:00 2001
From: Nam Cao <namcao@linutronix.de>
Date: Fri, 15 Nov 2024 14:56:33 +0100
Subject: [PATCH] tools/rv/dot2c: Fix generated files going over 100 column
limit
The dot2c.py script generates all states in a single line. This breaks the
100 column limit when the state machines are non-trivial.
Change dot2c.py to generate the states in separate lines.
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
tools/verification/rvgen/rvgen/dot2c.py | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/tools/verification/rvgen/rvgen/dot2c.py b/tools/verification/rvgen/rvgen/dot2c.py
index 6009caf568d9..abc0ee569b34 100644
--- a/tools/verification/rvgen/rvgen/dot2c.py
+++ b/tools/verification/rvgen/rvgen/dot2c.py
@@ -152,29 +152,22 @@ class Dot2c(Automata):
max_state_name = max(self.states, key = len).__len__()
return max(max_state_name, self.invalid_state_str.__len__())
- def __get_state_string_length(self):
- maxlen = self.__get_max_strlen_of_states() + self.enum_suffix.__len__()
- return "%" + str(maxlen) + "s"
-
def get_aut_init_function(self):
nr_states = self.states.__len__()
nr_events = self.events.__len__()
buff = []
- strformat = self.__get_state_string_length()
-
for x in range(nr_states):
- line = "\t\t{ "
+ buff.append("\t\t{")
for y in range(nr_events):
next_state = self.function[x][y]
if next_state != self.invalid_state_str:
next_state = self.function[x][y] + self.enum_suffix
if y != nr_events-1:
- line = line + strformat % next_state + ", "
+ buff.append(''.join(("\t\t\t", next_state, ",")))
else:
- line = line + strformat % next_state + " },"
- buff.append(line)
+ buff.append(''.join(("\t\t\t", next_state, "\n\t\t},")))
return self.__buff_to_string(buff)
--
2.39.5
Best regards,
Nam
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 06/12] sched: Adapt sched tracepoints for RV task model
2025-05-19 8:41 ` Gabriele Monaco
@ 2025-05-19 8:43 ` Nam Cao
0 siblings, 0 replies; 38+ messages in thread
From: Nam Cao @ 2025-05-19 8:43 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Masami Hiramatsu, linux-trace-kernel, Tomas Glozar, Juri Lelli
On Mon, May 19, 2025 at 10:41:47AM +0200, Gabriele Monaco wrote:
> On Mon, 2025-05-19 at 10:29 +0200, Nam Cao wrote:
> > On Wed, May 14, 2025 at 10:43:08AM +0200, Gabriele Monaco wrote:
> > > DECLARE_TRACE_CONDITION(sched_set_state_tp,
> > > - TP_PROTO(struct task_struct *tsk, int state),
> > > - TP_ARGS(tsk, state),
> > > + TP_PROTO(struct task_struct *tsk, int state, bool
> > > from_signal),
> > > + TP_ARGS(tsk, state, from_signal),
> > > TP_CONDITION(!!(tsk->__state) != !!state));
> >
> > Doesn't this break the build? Because the monitors still use the old
> > signatures?
> >
> > I understand you adapt the monitor to this new signature in a follow-
> > up
> > patch. But every commits in the series should be buildable, otherwise
> > you
> > break "git bisect".
> >
>
> Yeah good point, do you suggest at least fixing signatures in monitors
> inside this commit?
Yes.
> I can keep the other commits to actually fix/adapt monitors but at
> least allow building from here.
Also yes, I would only change the signature in this commit.
Best regards,
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 07/12] rv: Adapt the sco monitor to the new set_state
2025-05-19 8:42 ` Nam Cao
@ 2025-05-19 9:04 ` Gabriele Monaco
0 siblings, 0 replies; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-19 9:04 UTC (permalink / raw)
To: Nam Cao
Cc: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc, Ingo Molnar, Peter Zijlstra,
Tomas Glozar, Juri Lelli
On Mon, 2025-05-19 at 10:42 +0200, Nam Cao wrote:
> On Wed, May 14, 2025 at 10:43:09AM +0200, Gabriele Monaco wrote:
> > .function = {
> > - { thread_context_sco,
> > scheduling_context_sco, INVALID_STATE },
> > - { INVALID_STATE,
> > INVALID_STATE, thread_context_sco },
> > + { thread_context_sco, INVALID_STATE,
> > scheduling_context_sco, INVALID_STATE },
> > + { INVALID_STATE,
> > scheduling_context_sco, INVALID_STATE,
> > thread_context_sco },
>
> This is over the 100 column limit.
>
> I know it is not your fault, this is generated. Back when I was
> playing
> with DA monitor, I made a patch to fix this. Maybe you could include
> it in
> your series?
>
> From b4fb648398a29a9c0d8e08bd12394978d3948a5e Mon Sep 17 00:00:00
> 2001
> From: Nam Cao <namcao@linutronix.de>
> Date: Fri, 15 Nov 2024 14:56:33 +0100
> Subject: [PATCH] tools/rv/dot2c: Fix generated files going over 100
> column
> limit
>
> The dot2c.py script generates all states in a single line. This
> breaks the
> 100 column limit when the state machines are non-trivial.
>
> Change dot2c.py to generate the states in separate lines.
>
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
> tools/verification/rvgen/rvgen/dot2c.py | 13 +++----------
> 1 file changed, 3 insertions(+), 10 deletions(-)
>
> diff --git a/tools/verification/rvgen/rvgen/dot2c.py
> b/tools/verification/rvgen/rvgen/dot2c.py
> index 6009caf568d9..abc0ee569b34 100644
> --- a/tools/verification/rvgen/rvgen/dot2c.py
> +++ b/tools/verification/rvgen/rvgen/dot2c.py
> @@ -152,29 +152,22 @@ class Dot2c(Automata):
> max_state_name = max(self.states, key = len).__len__()
> return max(max_state_name, self.invalid_state_str.__len__())
>
> - def __get_state_string_length(self):
> - maxlen = self.__get_max_strlen_of_states() +
> self.enum_suffix.__len__()
> - return "%" + str(maxlen) + "s"
> -
> def get_aut_init_function(self):
> nr_states = self.states.__len__()
> nr_events = self.events.__len__()
> buff = []
>
> - strformat = self.__get_state_string_length()
> -
> for x in range(nr_states):
> - line = "\t\t{ "
> + buff.append("\t\t{")
> for y in range(nr_events):
> next_state = self.function[x][y]
> if next_state != self.invalid_state_str:
> next_state = self.function[x][y] +
> self.enum_suffix
>
> if y != nr_events-1:
> - line = line + strformat % next_state + ", "
> + buff.append(''.join(("\t\t\t", next_state,
> ",")))
> else:
> - line = line + strformat % next_state + " },"
> - buff.append(line)
> + buff.append(''.join(("\t\t\t", next_state,
> "\n\t\t},")))
>
> return self.__buff_to_string(buff)
>
Thanks for bringing this up, I'm a bit undecided on this one..
The nice thing of the current representation is that it shows a matrix
and it's relatively easy to see what each event does.
On the other hand it's true larger models do exceed quite a bit the
size limits and considering you aren't really supposed to touch this
file directly (as the script does it for you), perhaps cleaner C code
should be the priority.
I'll play with your patch and see if it negatively affects the workflow
in any way. If not, I'd include it and adapt the monitors (perhaps only
those with long lines, not really need to change all).
Thanks,
Gabriele
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions
2025-05-14 8:43 ` [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions Gabriele Monaco
@ 2025-05-19 9:06 ` Nam Cao
2025-05-19 10:28 ` Gabriele Monaco
0 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-05-19 9:06 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, Ingo Molnar,
Peter Zijlstra, Tomas Glozar, Juri Lelli
On Wed, May 14, 2025 at 10:43:12AM +0200, Gabriele Monaco wrote:
> -static inline void \
> -da_monitor_set_state_##name(struct da_monitor *da_mon, enum states_##name state) \
> +static inline bool \
> +da_monitor_set_state_##name(struct da_monitor *da_mon, enum states_##name prev_state, \
> + enum states_##name state) \
> { \
> - da_mon->curr_state = state; \
> + return try_cmpxchg(&da_mon->curr_state, &prev_state, state); \
> } \
This is a very thin wrapper. Should we just call try_cmpxchg() directly?
> static inline bool \
> da_event_##name(struct da_monitor *da_mon, enum events_##name event) \
> { \
> - type curr_state = da_monitor_curr_state_##name(da_mon); \
> - type next_state = model_get_next_state_##name(curr_state, event); \
> + bool changed; \
> + type curr_state, next_state; \
> \
> - if (next_state != INVALID_STATE) { \
> - da_monitor_set_state_##name(da_mon, next_state); \
> + for (int i = 0; i < MAX_DA_RETRY_RACING_EVENTS; i++) { \
> + curr_state = da_monitor_curr_state_##name(da_mon); \
For the follow-up iterations (i > 0), it is not necessary to read
curr_state again here, we already have its value from try_cmpxchg() below.
Also, thinking about memory barrier hurts my main, but I'm not entirely
sure if reading curr_state without memory barrier here is okay.
How about something like below?
curr_state = da_monitor_curr_state_##name(da_mon);
for (int i = 0; i < MAX_DA_RETRY_RACING_EVENTS; i++) {
next_state = model_get_next_state_##name(curr_state, event);
if (next_state == INVALID_STATE)
break;
if (try_cmpxchg(&da_mon->curr_state, &curr_state, next_state))
break;
}
Furthermore, it is possible to replace for(...) with while (1)? I don't
think we can have a live lock, because if we fail to do try_cmpxchg(),
the "other guy" surely succeed.
Best regards,
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions
2025-05-19 9:06 ` Nam Cao
@ 2025-05-19 10:28 ` Gabriele Monaco
2025-05-19 10:38 ` Nam Cao
0 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-19 10:28 UTC (permalink / raw)
To: Nam Cao
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, Ingo Molnar,
Peter Zijlstra, Tomas Glozar, Juri Lelli
On Mon, 2025-05-19 at 11:06 +0200, Nam Cao wrote:
> On Wed, May 14, 2025 at 10:43:12AM +0200, Gabriele Monaco wrote:
> > -static inline
> > void \
> > -da_monitor_set_state_##name(struct da_monitor *da_mon, enum
> > states_##name state) \
> > +static inline
> > bool \
> > +da_monitor_set_state_##name(struct da_monitor *da_mon, enum
> > states_##name prev_state, \
> > + enum states_##name
> > state) \
> > {
> > \
> > - da_mon->curr_state =
> > state; \
> > + return try_cmpxchg(&da_mon->curr_state, &prev_state,
> > state); \
> > }
> > \
>
> This is a very thin wrapper. Should we just call try_cmpxchg()
> directly?
Mmh, right, at this point the wrapper does nothing but making the code
more obscure, will do.
>
> > static inline
> > bool \
> > da_event_##name(struct da_monitor *da_mon, enum events_##name
> > event) \
> > {
> > \
> > - type curr_state =
> > da_monitor_curr_state_##name(da_mon); \
> > - type next_state = model_get_next_state_##name(curr_state,
> > event); \
> > + bool
> > changed; \
> > + type curr_state,
> > next_state; \
> >
> > \
> > - if (next_state != INVALID_STATE)
> > { \
> > - da_monitor_set_state_##name(da_mon,
> > next_state); \
> > + for (int i = 0; i < MAX_DA_RETRY_RACING_EVENTS; i++)
> > { \
> > + curr_state =
> > da_monitor_curr_state_##name(da_mon); \
>
> For the follow-up iterations (i > 0), it is not necessary to read
> curr_state again here, we already have its value from try_cmpxchg()
> below.
Yeah good point.
>
> Also, thinking about memory barrier hurts my main, but I'm not
> entirely
> sure if reading curr_state without memory barrier here is okay.
>
I guess we are on the same boat here. I couldn't really understand how
much of a barrier is the try_cmpxchg imposing (if any), but didn't see
any noticeable difference adding an explicit smp write barrier to pair
with the READ_ONCE in da_monitor_curr_state, so straight assumed we can
do without it.
But I definitely appreciate opinions on this.
> How about something like below?
>
> curr_state = da_monitor_curr_state_##name(da_mon);
> for (int i = 0; i < MAX_DA_RETRY_RACING_EVENTS; i++) {
> next_state = model_get_next_state_##name(curr_state, event);
> if (next_state == INVALID_STATE)
> break;
> if (try_cmpxchg(&da_mon->curr_state, &curr_state,
> next_state))
> break;
> }
>
Yeah, that's neater.
> Furthermore, it is possible to replace for(...) with while (1)? I
> don't
> think we can have a live lock, because if we fail to do
> try_cmpxchg(),
> the "other guy" surely succeed.
>
Mmh, although definitely unlikely, I'm thinking of a case in which the
event starts on one CPU and at the same time we see events in IRQ and
on another CPU, let's say continuously. Nothing forbids that between
any two consecutive try_cmpxchg another CPU/context changes the next
state (making the local try_cmpxchg fail).
In practice I've never seen it going on the second iteration, as the
critical section is really tiny, but I'm not sure we can guarantee this
never happens.
Or am I missing something?
Thanks,
Gabriele
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions
2025-05-19 10:28 ` Gabriele Monaco
@ 2025-05-19 10:38 ` Nam Cao
2025-05-19 11:13 ` Gabriele Monaco
0 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-05-19 10:38 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, Ingo Molnar,
Peter Zijlstra, Tomas Glozar, Juri Lelli
On Mon, May 19, 2025 at 12:28:12PM +0200, Gabriele Monaco wrote:
> Mmh, although definitely unlikely, I'm thinking of a case in which the
> event starts on one CPU and at the same time we see events in IRQ and
> on another CPU, let's say continuously. Nothing forbids that between
> any two consecutive try_cmpxchg another CPU/context changes the next
> state (making the local try_cmpxchg fail).
> In practice I've never seen it going on the second iteration, as the
> critical section is really tiny, but I'm not sure we can guarantee this
> never happens.
> Or am I missing something?
I have a feeling that you missed my point. I agree that the retrying is
needed, because we may race with another.
What I am proposing is that we drop the MAX_DA_RETRY_RACING_EVENTS, and
just keep retrying until we succeed.
And that's safe to do, because the maximum number of retries is the number
of tasks contending with us to set the monitor's state. So we know we won't
be retrying for long.
Best regards,
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions
2025-05-19 10:38 ` Nam Cao
@ 2025-05-19 11:13 ` Gabriele Monaco
2025-05-21 6:58 ` Nam Cao
0 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-19 11:13 UTC (permalink / raw)
To: Nam Cao
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, Ingo Molnar,
Peter Zijlstra, Tomas Glozar, Juri Lelli
On Mon, 2025-05-19 at 12:38 +0200, Nam Cao wrote:
> On Mon, May 19, 2025 at 12:28:12PM +0200, Gabriele Monaco wrote:
> > Mmh, although definitely unlikely, I'm thinking of a case in which
> > the
> > event starts on one CPU and at the same time we see events in IRQ
> > and
> > on another CPU, let's say continuously. Nothing forbids that
> > between
> > any two consecutive try_cmpxchg another CPU/context changes the
> > next
> > state (making the local try_cmpxchg fail).
> > In practice I've never seen it going on the second iteration, as
> > the
> > critical section is really tiny, but I'm not sure we can guarantee
> > this
> > never happens.
> > Or am I missing something?
>
> I have a feeling that you missed my point. I agree that the retrying
> is
> needed, because we may race with another.
>
> What I am proposing is that we drop the MAX_DA_RETRY_RACING_EVENTS,
> and
> just keep retrying until we succeed.
>
> And that's safe to do, because the maximum number of retries is the
> number
> of tasks contending with us to set the monitor's state. So we know we
> won't
> be retrying for long.
I get this point, what I mean is: can we really guarantee the number of
contending tasks (or contexts) is finite?
In other words, the try_cmpxchg guarantees 1 and only 1 actor wins
every time, but cannot guarantee all actors will eventually win, an
actor /could/ be hanging there forever.
This handler is running for each event in the monitor and tracepoint
handlers can be interrupted as well as run in interrupt context (where
of course they cannot be interrupted). I don't think the number of
actors is bounded by the number of CPUs.
I see this situation is extremely unlikely, but in an exotic scenario
where a CPU is sufficiently slower than others (e.g. in a VM) I believe
we can see this critical section large enough for this to potentially
happen.
I'm not quite afraid of infinite loops, but rather RV introducing
unbounded latency very hard to track and without any reporting.
Chances are, since tracepoints and actual traced events are not atomic,
that by the time this delayed context /wins/ the RV event is no longer
current, so we may see an error already.
Does it make sense to you or am I making it more complex than it should
be?
Thanks,
Gabriele
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions
2025-05-19 11:13 ` Gabriele Monaco
@ 2025-05-21 6:58 ` Nam Cao
0 siblings, 0 replies; 38+ messages in thread
From: Nam Cao @ 2025-05-21 6:58 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, Ingo Molnar,
Peter Zijlstra, Tomas Glozar, Juri Lelli
On Mon, May 19, 2025 at 01:13:01PM +0200, Gabriele Monaco wrote:
>
>
> On Mon, 2025-05-19 at 12:38 +0200, Nam Cao wrote:
> > On Mon, May 19, 2025 at 12:28:12PM +0200, Gabriele Monaco wrote:
> > > Mmh, although definitely unlikely, I'm thinking of a case in which
> > > the
> > > event starts on one CPU and at the same time we see events in IRQ
> > > and
> > > on another CPU, let's say continuously. Nothing forbids that
> > > between
> > > any two consecutive try_cmpxchg another CPU/context changes the
> > > next
> > > state (making the local try_cmpxchg fail).
> > > In practice I've never seen it going on the second iteration, as
> > > the
> > > critical section is really tiny, but I'm not sure we can guarantee
> > > this
> > > never happens.
> > > Or am I missing something?
> >
> > I have a feeling that you missed my point. I agree that the retrying
> > is
> > needed, because we may race with another.
> >
> > What I am proposing is that we drop the MAX_DA_RETRY_RACING_EVENTS,
> > and
> > just keep retrying until we succeed.
> >
> > And that's safe to do, because the maximum number of retries is the
> > number
> > of tasks contending with us to set the monitor's state. So we know we
> > won't
> > be retrying for long.
>
> I get this point, what I mean is: can we really guarantee the number of
> contending tasks (or contexts) is finite?
> In other words, the try_cmpxchg guarantees 1 and only 1 actor wins
> every time, but cannot guarantee all actors will eventually win, an
> actor /could/ be hanging there forever.
>
> This handler is running for each event in the monitor and tracepoint
> handlers can be interrupted as well as run in interrupt context (where
> of course they cannot be interrupted). I don't think the number of
> actors is bounded by the number of CPUs.
> I see this situation is extremely unlikely, but in an exotic scenario
> where a CPU is sufficiently slower than others (e.g. in a VM) I believe
> we can see this critical section large enough for this to potentially
> happen.
>
> I'm not quite afraid of infinite loops, but rather RV introducing
> unbounded latency very hard to track and without any reporting.
> Chances are, since tracepoints and actual traced events are not atomic,
> that by the time this delayed context /wins/ the RV event is no longer
> current, so we may see an error already.
>
> Does it make sense to you or am I making it more complex than it should
> be?
Right, I can see that being a problem. But I don't know enough about it to
comment further, so do as you think best, maybe someone else can help.
Best regards,
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 00/12] rv: Add monitors to validate task switch
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
` (11 preceding siblings ...)
2025-05-14 8:43 ` [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor Gabriele Monaco
@ 2025-05-21 7:15 ` Nam Cao
2025-05-21 7:31 ` Gabriele Monaco
12 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-05-21 7:15 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Tomas Glozar, Juri Lelli
On Wed, May 14, 2025 at 10:43:02AM +0200, Gabriele Monaco wrote:
> I'm keeping this as RFC as I'm planning to make it ready after merging
> the LTL series [1]
>
> This series adds three monitors to the sched collection, extends and
> replaces previously existing monitors:
What is your base? I cannot apply the series (maybe because I'm illiterate
on git...)
$ b4 am 20250514084314.57976-1-gmonaco@redhat.com
...
$ git am -3 ./v2_20250514_gmonaco_rv_add_monitors_to_validate_task_switch.mbx
Applying: tools/rv: Do not skip idle in trace
Applying: tools/rv: Stop gracefully also on SIGTERM
Applying: rv: Add da_handle_start_run_event_ to per-task monitors
Applying: rv: Remove trailing whitespace from tracepoint string
Applying: rv: Return init error when registering monitors
Applying: sched: Adapt sched tracepoints for RV task model
error: sha1 information is lacking or useless (include/linux/sched.h).
error: could not build fake ancestor
Patch failed at 0006 sched: Adapt sched tracepoints for RV task model
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 00/12] rv: Add monitors to validate task switch
2025-05-21 7:15 ` [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Nam Cao
@ 2025-05-21 7:31 ` Gabriele Monaco
2025-05-27 13:51 ` Nam Cao
0 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-21 7:31 UTC (permalink / raw)
To: Nam Cao
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Tomas Glozar, Juri Lelli
On Wed, 2025-05-21 at 09:15 +0200, Nam Cao wrote:
> On Wed, May 14, 2025 at 10:43:02AM +0200, Gabriele Monaco wrote:
> > I'm keeping this as RFC as I'm planning to make it ready after
> > merging
> > the LTL series [1]
> >
> > This series adds three monitors to the sched collection, extends
> > and
> > replaces previously existing monitors:
>
> What is your base? I cannot apply the series (maybe because I'm
> illiterate
> on git...)
>
> $ b4 am 20250514084314.57976-1-gmonaco@redhat.com
> ...
> $ git am -3
> ./v2_20250514_gmonaco_rv_add_monitors_to_validate_task_switch.mbx
> Applying: tools/rv: Do not skip idle in trace
> Applying: tools/rv: Stop gracefully also on SIGTERM
> Applying: rv: Add da_handle_start_run_event_ to per-task monitors
> Applying: rv: Remove trailing whitespace from tracepoint string
> Applying: rv: Return init error when registering monitors
> Applying: sched: Adapt sched tracepoints for RV task model
> error: sha1 information is lacking or useless
> (include/linux/sched.h).
> error: could not build fake ancestor
> Patch failed at 0006 sched: Adapt sched tracepoints for RV task model
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> hint: When you have resolved this problem, run "git am --continue".
> hint: If you prefer to skip this patch, run "git am --skip" instead.
> hint: To restore the original branch and stop patching, run "git am -
> -abort".
> hint: Disable this message with "git config set advice.mergeConflict
> false"
Mmh, it /was/ linux-next, I'm not sure if that gets rebased and breaks
stuff, but it shouldn't.
I think it should apply cleanly on any up-to-date tree provided you
also apply
https://lore.kernel.org/lkml/174413914101.31282.6876925851363406689.tip-bot2@tip-bot2
(which is on next, so it's not required if that's your tree).
If that doesn't help, my base was
aa94665adc28f3fdc3de2979ac1e98bae961d6ca (tag: next-20250513), I
included it in 00/12 but I'm not sure what kind of tool reads that.
I guess next time I should add Peter's patch to the series too to save
people some headache though..
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor
2025-05-14 8:43 ` [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor Gabriele Monaco
@ 2025-05-27 13:37 ` Nam Cao
2025-05-27 14:35 ` Gabriele Monaco
0 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-05-27 13:37 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc, Ingo Molnar, Peter Zijlstra,
Tomas Glozar, Juri Lelli
On Wed, May 14, 2025 at 10:43:14AM +0200, Gabriele Monaco wrote:
> Add a per-cpu monitor as part of the sched model:
> * opid: operations with preemption and irq disabled
> Monitor to ensure wakeup and need_resched occur with irq and
> preemption disabled or in irq handlers.
This monitor reports some warnings:
$ perf record -e rv:error_opid --call-graph dwarf -a -- ./stress-epoll
(stress-epoll program from
https://github.com/rouming/test-tools/blob/master/stress-epoll.c)
$ perf script
stress-epoll 315 [003] 527.674724: rv:error_opid: event preempt_disable not expected in the state preempt_disabled
ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
ffffffff9fdfba0d handle_preempt_disable+0x3d ([kernel.kallsyms])
ffffffff9fdd32d0 __traceiter_preempt_disable+0x30 ([kernel.kallsyms])
ffffffff9fdd38fe trace_preempt_off+0x4e ([kernel.kallsyms])
ffffffff9fee6c1c vfs_write+0x12c ([kernel.kallsyms])
ffffffff9fee7128 ksys_write+0x68 ([kernel.kallsyms])
ffffffffa0bdbd92 do_syscall_64+0xb2 ([kernel.kallsyms])
ffffffff9fa00130 entry_SYSCALL_64_after_hwframe+0x77 ([kernel.kallsyms])
f833f __GI___libc_write+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
f833f __GI___libc_write+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
1937 thread_work+0x47 (/root/test-tools/stress-epoll)
891f4 start_thread+0x304 (/usr/lib/x86_64-linux-gnu/libc.so.6)
10989b clone3+0x2b (/usr/lib/x86_64-linux-gnu/libc.so.6)
stress-epoll 318 [002] 527.674759: rv:error_opid: event preempt_disable not expected in the state disabled
ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
ffffffff9fdfba0d handle_preempt_disable+0x3d ([kernel.kallsyms])
ffffffff9fdd32d0 __traceiter_preempt_disable+0x30 ([kernel.kallsyms])
ffffffff9fdd38fe trace_preempt_off+0x4e ([kernel.kallsyms])
ffffffffa0bec1aa _raw_spin_lock_irq+0x1a ([kernel.kallsyms])
ffffffff9ff4fe73 eventfd_write+0x63 ([kernel.kallsyms])
ffffffff9fee6be5 vfs_write+0xf5 ([kernel.kallsyms])
ffffffff9fee7128 ksys_write+0x68 ([kernel.kallsyms])
ffffffffa0bdbd92 do_syscall_64+0xb2 ([kernel.kallsyms])
ffffffff9fa00130 entry_SYSCALL_64_after_hwframe+0x77 ([kernel.kallsyms])
f833f __GI___libc_write+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
f833f __GI___libc_write+0x4f (/usr/lib/x86_64-linux-gnu/libc.so.6)
1937 thread_work+0x47 (/root/test-tools/stress-epoll)
891f4 start_thread+0x304 (/usr/lib/x86_64-linux-gnu/libc.so.6)
10989b clone3+0x2b (/usr/lib/x86_64-linux-gnu/libc.so.6)
I'm not sure what I'm looking at here. Do you think these are kernel bugs,
or the monitor is missing some corner cases?
Best regards,
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 00/12] rv: Add monitors to validate task switch
2025-05-21 7:31 ` Gabriele Monaco
@ 2025-05-27 13:51 ` Nam Cao
0 siblings, 0 replies; 38+ messages in thread
From: Nam Cao @ 2025-05-27 13:51 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Tomas Glozar, Juri Lelli
On Wed, May 21, 2025 at 09:31:38AM +0200, Gabriele Monaco wrote:
> On Wed, 2025-05-21 at 09:15 +0200, Nam Cao wrote:
> > On Wed, May 14, 2025 at 10:43:02AM +0200, Gabriele Monaco wrote:
> > > I'm keeping this as RFC as I'm planning to make it ready after
> > > merging
> > > the LTL series [1]
> > >
> > > This series adds three monitors to the sched collection, extends
> > > and
> > > replaces previously existing monitors:
> >
> > What is your base? I cannot apply the series (maybe because I'm
> > illiterate
> > on git...)
>
> Mmh, it /was/ linux-next, I'm not sure if that gets rebased and breaks
> stuff, but it shouldn't.
Turned out it was because this series conflicts with ac01fa73f530
("tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp
suffix")
I resolved the conflict while applying, and all is good.
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor
2025-05-27 13:37 ` Nam Cao
@ 2025-05-27 14:35 ` Gabriele Monaco
2025-05-27 14:50 ` Nam Cao
0 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-27 14:35 UTC (permalink / raw)
To: Nam Cao
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, linux-doc,
Ingo Molnar, Peter Zijlstra, Tomas Glozar, Juri Lelli
On Tue, 2025-05-27 at 15:37 +0200, Nam Cao wrote:
> On Wed, May 14, 2025 at 10:43:14AM +0200, Gabriele Monaco wrote:
> > Add a per-cpu monitor as part of the sched model:
> > * opid: operations with preemption and irq disabled
> > Monitor to ensure wakeup and need_resched occur with irq and
> > preemption disabled or in irq handlers.
>
> This monitor reports some warnings:
>
> $ perf record -e rv:error_opid --call-graph dwarf -a -- ./stress-
> epoll
> (stress-epoll program from
> https://github.com/rouming/test-tools/blob/master/stress-epoll.c)
>
Thanks for trying it out, and good to know about this stressor.
Unfortunately it's a bit hard to understand from this stack trace, but
that's very likely a problem in the model.
I have a few ideas where that could be but I believe it's something
visible only on a physical machine (haven't tested much on x86 bare
metal, only VM).
You're running on bare metal right?
> $ perf script
> stress-epoll 315 [003] 527.674724: rv:error_opid: event
> preempt_disable not expected in the state preempt_disabled
> ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
> ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
> ffffffff9fdfba0d handle_preempt_disable+0x3d
> ([kernel.kallsyms])
> ffffffff9fdd32d0 __traceiter_preempt_disable+0x30
> ([kernel.kallsyms])
> ffffffff9fdd38fe trace_preempt_off+0x4e ([kernel.kallsyms])
> ffffffff9fee6c1c vfs_write+0x12c ([kernel.kallsyms])
> ffffffff9fee7128 ksys_write+0x68 ([kernel.kallsyms])
> ffffffffa0bdbd92 do_syscall_64+0xb2 ([kernel.kallsyms])
> ffffffff9fa00130 entry_SYSCALL_64_after_hwframe+0x77
> ([kernel.kallsyms])
> f833f __GI___libc_write+0x4f (/usr/lib/x86_64-
> linux-gnu/libc.so.6)
> f833f __GI___libc_write+0x4f (/usr/lib/x86_64-
> linux-gnu/libc.so.6)
> 1937 thread_work+0x47 (/root/test-tools/stress-
> epoll)
> 891f4 start_thread+0x304 (/usr/lib/x86_64-linux-
> gnu/libc.so.6)
> 10989b clone3+0x2b (/usr/lib/x86_64-linux-
> gnu/libc.so.6)
>
> stress-epoll 318 [002] 527.674759: rv:error_opid: event
> preempt_disable not expected in the state disabled
> ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
> ffffffff9fdfb34f da_event_opid+0x10f ([kernel.kallsyms])
> ffffffff9fdfba0d handle_preempt_disable+0x3d
> ([kernel.kallsyms])
> ffffffff9fdd32d0 __traceiter_preempt_disable+0x30
> ([kernel.kallsyms])
> ffffffff9fdd38fe trace_preempt_off+0x4e ([kernel.kallsyms])
> ffffffffa0bec1aa _raw_spin_lock_irq+0x1a ([kernel.kallsyms])
> ffffffff9ff4fe73 eventfd_write+0x63 ([kernel.kallsyms])
> ffffffff9fee6be5 vfs_write+0xf5 ([kernel.kallsyms])
> ffffffff9fee7128 ksys_write+0x68 ([kernel.kallsyms])
> ffffffffa0bdbd92 do_syscall_64+0xb2 ([kernel.kallsyms])
> ffffffff9fa00130 entry_SYSCALL_64_after_hwframe+0x77
> ([kernel.kallsyms])
> f833f __GI___libc_write+0x4f (/usr/lib/x86_64-
> linux-gnu/libc.so.6)
> f833f __GI___libc_write+0x4f (/usr/lib/x86_64-
> linux-gnu/libc.so.6)
> 1937 thread_work+0x47 (/root/test-tools/stress-
> epoll)
> 891f4 start_thread+0x304 (/usr/lib/x86_64-linux-
> gnu/libc.so.6)
> 10989b clone3+0x2b (/usr/lib/x86_64-linux-
> gnu/libc.so.6)
>
> I'm not sure what I'm looking at here. Do you think these are kernel
> bugs,
> or the monitor is missing some corner cases?
>
As said, likely a missing corner case, I believe it has to do with IRQs
(which is what makes this monitor more complex than it could be).
Thanks for the pointers, I'll try reproduce it this way.
Gabriele
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor
2025-05-27 14:35 ` Gabriele Monaco
@ 2025-05-27 14:50 ` Nam Cao
2025-05-28 11:27 ` Gabriele Monaco
0 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-05-27 14:50 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, linux-doc,
Ingo Molnar, Peter Zijlstra, Tomas Glozar, Juri Lelli
On Tue, May 27, 2025 at 04:35:04PM +0200, Gabriele Monaco wrote:
> Thanks for trying it out, and good to know about this stressor.
> Unfortunately it's a bit hard to understand from this stack trace, but
> that's very likely a problem in the model. I have a few ideas where that
> could be but I believe it's something visible only on a physical machine
> (haven't tested much on x86 bare metal, only VM).
>
> You're running on bare metal right?
No, it's QEMU:
qemu-system-x86_64 -enable-kvm -m 2048 -smp 4 \
-nographic \
-drive if=virtio,format=raw,file=bookworm.img \
-kernel /srv/work/namcao/linux/arch/x86/boot/bzImage \
-append "console=ttyS0 root=/dev/vda rw" \
The kernel is just x86 defconfig + the monitors.
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor
2025-05-27 14:50 ` Nam Cao
@ 2025-05-28 11:27 ` Gabriele Monaco
0 siblings, 0 replies; 38+ messages in thread
From: Gabriele Monaco @ 2025-05-28 11:27 UTC (permalink / raw)
To: Nam Cao
Cc: linux-kernel, Steven Rostedt, linux-trace-kernel, linux-doc,
Ingo Molnar, Peter Zijlstra, Tomas Glozar, Juri Lelli
On Tue, 2025-05-27 at 16:50 +0200, Nam Cao wrote:
> On Tue, May 27, 2025 at 04:35:04PM +0200, Gabriele Monaco wrote:
> > Thanks for trying it out, and good to know about this stressor.
> > Unfortunately it's a bit hard to understand from this stack trace,
> > but
> > that's very likely a problem in the model. I have a few ideas
> > where that
> > could be but I believe it's something visible only on a physical
> > machine
> > (haven't tested much on x86 bare metal, only VM).
> >
> > You're running on bare metal right?
>
> No, it's QEMU:
>
> qemu-system-x86_64 -enable-kvm -m 2048 -smp 4 \
> -nographic \
> -drive if=virtio,format=raw,file=bookworm.img \
> -kernel /srv/work/namcao/linux/arch/x86/boot/bzImage \
> -append "console=ttyS0 root=/dev/vda rw" \
>
> The kernel is just x86 defconfig + the monitors.
>
Apparently the error is visible on non-PREEMPT_RT only, the models are
designed for preempt-rt and I didn't really test them elsewhere.
Not sure if it's worth tailoring them for non RT kernels, but for now I
can just mark those monitors as RT-only via Kconfig.
Especially this type of monitors is describing very accurately the
preemptiveness of some events, I wouldn't be too surprised if some
rules don't hold in all preempt configurations.
The idea is that, as long as the models stand true, some assumptions
about latency can be made, on the long run this type of assumptions is
likely different across preemption models.
That said, it might be a stupid mistake as well, so I'd look into that
more closely ;)
Thanks again,
Gabriele
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts
2025-05-14 8:43 ` [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts Gabriele Monaco
@ 2025-06-24 7:36 ` Nam Cao
2025-06-24 14:44 ` Gabriele Monaco
0 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-06-24 7:36 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc, Ingo Molnar, Peter Zijlstra,
Tomas Glozar, Juri Lelli, john.ogness
On Wed, May 14, 2025 at 10:43:11AM +0200, Gabriele Monaco wrote:
> diff --git a/kernel/trace/rv/monitors/tss/tss_trace.h b/kernel/trace/rv/monitors/sts/sts_trace.h
> similarity index 67%
> rename from kernel/trace/rv/monitors/tss/tss_trace.h
> rename to kernel/trace/rv/monitors/sts/sts_trace.h
> index 4619dbb50cc0..d78beb58d5b3 100644
> --- a/kernel/trace/rv/monitors/tss/tss_trace.h
> +++ b/kernel/trace/rv/monitors/sts/sts_trace.h
> @@ -4,12 +4,12 @@
> * Snippet to be included in rv_trace.h
> */
>
> -#ifdef CONFIG_RV_MON_TSS
> -DEFINE_EVENT(event_da_monitor, event_tss,
> +#ifdef CONFIG_RV_MON_STS
> +DEFINE_EVENT(event_da_monitor, event_sts,
> TP_PROTO(char *state, char *event, char *next_state, bool final_state),
> TP_ARGS(state, event, next_state, final_state));
>
> -DEFINE_EVENT(error_da_monitor, error_tss,
> +DEFINE_EVENT(error_da_monitor, error_sts,
> TP_PROTO(char *state, char *event),
> TP_ARGS(state, event));
> -#endif /* CONFIG_RV_MON_TSS */
> +#endif /* CONFIG_RV_MON_STS */
You are changing the tracepoint's name. Should we worry about breaking
userspace?
It probably doesn't matter at the moment, because I doubt anyone is really
relying on this tracepoint. But I think we should have a definite stance on
this, for future references.
I have seen tracepoints being changed (I know of [1][2][3], I was one of
them :P), so it seems to be considered okay. But adding userspace tools to
the equation and it doesn't make sense to me. For example, lttng is using
the page_fault tracepoints [4], which is broken by [3].
If this should be stable user API, then we should starting thinking about
better API which allows changes like this to happen. Otherwise, they should
be clearly documented to be unstable.
(I think I may also need to change my rtapp's tracepoint names at some point
in the future, that's why I am asking)
Best regards,
Nam
[1] commit dbb6ecb328cb ("btrfs: tracepoints: simplify raid56 events")
[2] commit 244132c4e577 ("tracing/timers: Rename the hrtimer_init event to hrtimer_setup")
[3] https://lore.kernel.org/lkml/2dda8c03-072a-43b2-af0c-bb996d64c388@cs.wisc.edu/#t
[4] https://github.com/lttng/lttng-modules/blob/master/include/instrumentation/events/arch/x86/exceptions.h#L88C48-L88C63
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts
2025-06-24 7:36 ` Nam Cao
@ 2025-06-24 14:44 ` Gabriele Monaco
2025-06-24 15:50 ` Nam Cao
0 siblings, 1 reply; 38+ messages in thread
From: Gabriele Monaco @ 2025-06-24 14:44 UTC (permalink / raw)
To: Nam Cao
Cc: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc, Ingo Molnar, Peter Zijlstra,
Tomas Glozar, Juri Lelli, john.ogness
On Tue, 2025-06-24 at 09:36 +0200, Nam Cao wrote:
> On Wed, May 14, 2025 at 10:43:11AM +0200, Gabriele Monaco wrote:
> > diff --git a/kernel/trace/rv/monitors/tss/tss_trace.h
> > b/kernel/trace/rv/monitors/sts/sts_trace.h
> > similarity index 67%
> > rename from kernel/trace/rv/monitors/tss/tss_trace.h
> > rename to kernel/trace/rv/monitors/sts/sts_trace.h
> > index 4619dbb50cc0..d78beb58d5b3 100644
> > --- a/kernel/trace/rv/monitors/tss/tss_trace.h
> > +++ b/kernel/trace/rv/monitors/sts/sts_trace.h
> > @@ -4,12 +4,12 @@
> > * Snippet to be included in rv_trace.h
> > */
> >
> > -#ifdef CONFIG_RV_MON_TSS
> > -DEFINE_EVENT(event_da_monitor, event_tss,
> > +#ifdef CONFIG_RV_MON_STS
> > +DEFINE_EVENT(event_da_monitor, event_sts,
> > TP_PROTO(char *state, char *event, char *next_state,
> > bool final_state),
> > TP_ARGS(state, event, next_state, final_state));
> >
> > -DEFINE_EVENT(error_da_monitor, error_tss,
> > +DEFINE_EVENT(error_da_monitor, error_sts,
> > TP_PROTO(char *state, char *event),
> > TP_ARGS(state, event));
> > -#endif /* CONFIG_RV_MON_TSS */
> > +#endif /* CONFIG_RV_MON_STS */
>
> You are changing the tracepoint's name. Should we worry about
> breaking
> userspace?
>
Well, to be extremely picky, although that's what git shows, I'm not
changing tracepoints names, I'm removing tracepoints and adding similar
ones with different names.
That said, you're bringing a very good point, I guess removing/adding
monitors is going to be something quite common in the near future.
> It probably doesn't matter at the moment, because I doubt anyone is
> really
> relying on this tracepoint. But I think we should have a definite
> stance on
> this, for future references.
>
> I have seen tracepoints being changed (I know of [1][2][3], I was one
> of
> them :P), so it seems to be considered okay. But adding userspace
> tools to
> the equation and it doesn't make sense to me. For example, lttng is
> using
> the page_fault tracepoints [4], which is broken by [3].
>
> If this should be stable user API, then we should starting thinking
> about
> better API which allows changes like this to happen. Otherwise, they
> should
> be clearly documented to be unstable.
>
> (I think I may also need to change my rtapp's tracepoint names at
> some point
> in the future, that's why I am asking)
>
As you mentioned, nobody is likely relying on those tracepoints names
at the moment, but I would rather be cautious basing userspace tools on
some monitors to exist at all.
In my opinion, RV tracepoints are useful as an alternative of
reactors/rv userspace tool but cannot be used without considering the
RV interface itself (e.g. available_monitors and friends).
We could at least stick to the following assumptions:
1. monitors can change names, be added or removed
2. tracepoints are consistent to monitor names (in available_monitors)
3. the tracepoint structure does not change (i.e. event_/error_, args.)
(can change for new monitors types where seen fit)
If in the future we allow the possibility to build RV monitors as BPF
programs, we'd probably also allow monitors without tracepoints at all,
but I'd say for now those assumptions are sensible.
What do you think?
Thanks,
Gabriele
> Best regards,
> Nam
>
> [1] commit dbb6ecb328cb ("btrfs: tracepoints: simplify raid56
> events")
> [2] commit 244132c4e577 ("tracing/timers: Rename the hrtimer_init
> event to hrtimer_setup")
> [3]
> https://lore.kernel.org/lkml/2dda8c03-072a-43b2-af0c-bb996d64c388@cs.wisc.edu/#t
> [4]
> https://github.com/lttng/lttng-modules/blob/master/include/instrumentation/events/arch/x86/exceptions.h#L88C48-L88C63
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts
2025-06-24 14:44 ` Gabriele Monaco
@ 2025-06-24 15:50 ` Nam Cao
2025-06-24 19:31 ` Steven Rostedt
0 siblings, 1 reply; 38+ messages in thread
From: Nam Cao @ 2025-06-24 15:50 UTC (permalink / raw)
To: Gabriele Monaco
Cc: linux-kernel, Steven Rostedt, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc, Ingo Molnar, Peter Zijlstra,
Tomas Glozar, Juri Lelli, john.ogness
On Tue, Jun 24, 2025 at 04:44:49PM +0200, Gabriele Monaco wrote:
> As you mentioned, nobody is likely relying on those tracepoints names
> at the moment, but I would rather be cautious basing userspace tools on
> some monitors to exist at all.
>
> In my opinion, RV tracepoints are useful as an alternative of
> reactors/rv userspace tool but cannot be used without considering the
> RV interface itself (e.g. available_monitors and friends).
>
> We could at least stick to the following assumptions:
> 1. monitors can change names, be added or removed
> 2. tracepoints are consistent to monitor names (in available_monitors)
> 3. the tracepoint structure does not change (i.e. event_/error_, args.)
> (can change for new monitors types where seen fit)
>
> If in the future we allow the possibility to build RV monitors as BPF
> programs, we'd probably also allow monitors without tracepoints at all,
> but I'd say for now those assumptions are sensible.
>
> What do you think?
I would like that. Ideally, the userspace tools only use tracepoints based
on available_monitors.
However, people may not do that, and just use tracepoints directly.
You could argue that those tools are not correctly designed. Therefore it
is their fault that the tools are broken after updating kernel.
On the other hand, there is this sentiment that we must never break
userspace.
I don't know enough to judge this. Maybe @Steven has something to add?
Best regards,
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts
2025-06-24 15:50 ` Nam Cao
@ 2025-06-24 19:31 ` Steven Rostedt
2025-06-27 15:02 ` Nam Cao
0 siblings, 1 reply; 38+ messages in thread
From: Steven Rostedt @ 2025-06-24 19:31 UTC (permalink / raw)
To: Nam Cao
Cc: Gabriele Monaco, linux-kernel, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc, Ingo Molnar, Peter Zijlstra,
Tomas Glozar, Juri Lelli, john.ogness
On Tue, 24 Jun 2025 17:50:53 +0200
Nam Cao <namcao@linutronix.de> wrote:
> I would like that. Ideally, the userspace tools only use tracepoints based
> on available_monitors.
>
> However, people may not do that, and just use tracepoints directly.
>
> You could argue that those tools are not correctly designed. Therefore it
> is their fault that the tools are broken after updating kernel.
>
> On the other hand, there is this sentiment that we must never break
> userspace.
>
> I don't know enough to judge this. Maybe @Steven has something to add?
So WRT tracepoints, it's the same as a tree falling in the woods[1].
If a user space ABI "breaks" but no user space tooling notices, did it
really break?
The answer is "No".
As for tracepoints, its fine to change them until it's not ;-)
We had only one case that a tracepoint change broke user space where
Linus reverted that change [2]. That was because powertop hard coded
the addresses to the tracepoint field offsets and didn't use the format
files (what libtraceevent gives you). And I removed an unused common
field, which shifted everything and broke powertop.
But I converted powertop to use libtraceevent, waited a few year until
all the major distros provided the new powertop, and then I removed the
field. Guess what? Nobody noticed. (And old powertop would still break).
BPF taps into most tracepoints that change all the time. I'm cleaning
up unused tracepoints which included a couple that were left around to
"not break old BPF programs". I replied, if an old BPF program relies on
that tracepoint, keeping it around (but not used) is *worse* than
having that BPF program break. That's because that BPF program is
still broken (it's expecting that unused tracepoint to fire) but now
it's getting garbage for output (that being no output!). It's worse
because it's "silently failing" and the user may be relying on
something they don't know is broken.
So yeah, change the tracepoint when the code its tracing changes. That
way any tooling depending on it, knows that it can no longer depend on
it.
Anything using tracepoints are pretty much tied to the kernel anyway,
and when the kernel updates, the tooling that is relying on it also
needs to be updated otherwise it's not getting the information it is
expecting. That most definitely includes monitors.
-- Steve
[1] https://en.wikipedia.org/wiki/If_a_tree_falls_in_a_forest_and_no_one_is_around_to_hear_it,_does_it_make_a_sound%3F
[2] https://lwn.net/Articles/442113/
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts
2025-06-24 19:31 ` Steven Rostedt
@ 2025-06-27 15:02 ` Nam Cao
0 siblings, 0 replies; 38+ messages in thread
From: Nam Cao @ 2025-06-27 15:02 UTC (permalink / raw)
To: Steven Rostedt
Cc: Gabriele Monaco, linux-kernel, Jonathan Corbet, Masami Hiramatsu,
linux-trace-kernel, linux-doc, Ingo Molnar, Peter Zijlstra,
Tomas Glozar, Juri Lelli, john.ogness
On Tue, Jun 24, 2025 at 03:31:25PM -0400, Steven Rostedt wrote:
> So WRT tracepoints, it's the same as a tree falling in the woods[1].
>
> If a user space ABI "breaks" but no user space tooling notices, did it
> really break?
>
> The answer is "No".
>
> As for tracepoints, its fine to change them until it's not ;-)
>
> We had only one case that a tracepoint change broke user space where
> Linus reverted that change [2]. That was because powertop hard coded
> the addresses to the tracepoint field offsets and didn't use the format
> files (what libtraceevent gives you). And I removed an unused common
> field, which shifted everything and broke powertop.
>
> But I converted powertop to use libtraceevent, waited a few year until
> all the major distros provided the new powertop, and then I removed the
> field. Guess what? Nobody noticed. (And old powertop would still break).
>
> BPF taps into most tracepoints that change all the time. I'm cleaning
> up unused tracepoints which included a couple that were left around to
> "not break old BPF programs". I replied, if an old BPF program relies on
> that tracepoint, keeping it around (but not used) is *worse* than
> having that BPF program break. That's because that BPF program is
> still broken (it's expecting that unused tracepoint to fire) but now
> it's getting garbage for output (that being no output!). It's worse
> because it's "silently failing" and the user may be relying on
> something they don't know is broken.
>
> So yeah, change the tracepoint when the code its tracing changes. That
> way any tooling depending on it, knows that it can no longer depend on
> it.
>
> Anything using tracepoints are pretty much tied to the kernel anyway,
> and when the kernel updates, the tooling that is relying on it also
> needs to be updated otherwise it's not getting the information it is
> expecting. That most definitely includes monitors.
Alright, thanks for sharing. That was an interesting discussion you had
back then.
Let me keep an eye out for any user tools based on RV, making sure they use
our API properly.
Best regards,
Nam
^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2025-06-27 15:02 UTC | newest]
Thread overview: 38+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-14 8:43 [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 01/12] tools/rv: Do not skip idle in trace Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 02/12] tools/rv: Stop gracefully also on SIGTERM Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 03/12] rv: Add da_handle_start_run_event_ to per-task monitors Gabriele Monaco
2025-05-19 8:02 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 04/12] rv: Remove trailing whitespace from tracepoint string Gabriele Monaco
2025-05-19 8:05 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 05/12] rv: Return init error when registering monitors Gabriele Monaco
2025-05-19 8:06 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 06/12] sched: Adapt sched tracepoints for RV task model Gabriele Monaco
2025-05-19 8:29 ` Nam Cao
2025-05-19 8:41 ` Gabriele Monaco
2025-05-19 8:43 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 07/12] rv: Adapt the sco monitor to the new set_state Gabriele Monaco
2025-05-19 8:42 ` Nam Cao
2025-05-19 9:04 ` Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 08/12] rv: Extend and adapt snroc model Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts Gabriele Monaco
2025-06-24 7:36 ` Nam Cao
2025-06-24 14:44 ` Gabriele Monaco
2025-06-24 15:50 ` Nam Cao
2025-06-24 19:31 ` Steven Rostedt
2025-06-27 15:02 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 10/12] rv: Retry when da monitor detects race conditions Gabriele Monaco
2025-05-19 9:06 ` Nam Cao
2025-05-19 10:28 ` Gabriele Monaco
2025-05-19 10:38 ` Nam Cao
2025-05-19 11:13 ` Gabriele Monaco
2025-05-21 6:58 ` Nam Cao
2025-05-14 8:43 ` [RFC PATCH v2 11/12] rv: Add nrp and sssw per-task monitors Gabriele Monaco
2025-05-14 8:43 ` [RFC PATCH v2 12/12] rv: Add opid per-cpu monitor Gabriele Monaco
2025-05-27 13:37 ` Nam Cao
2025-05-27 14:35 ` Gabriele Monaco
2025-05-27 14:50 ` Nam Cao
2025-05-28 11:27 ` Gabriele Monaco
2025-05-21 7:15 ` [RFC PATCH v2 00/12] rv: Add monitors to validate task switch Nam Cao
2025-05-21 7:31 ` Gabriele Monaco
2025-05-27 13:51 ` Nam Cao
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).