* [PATCH v4 0/2] sched_ext: Add trace point to sched_ext core events
@ 2025-03-04 10:48 Changwoo Min
  2025-03-04 10:48 ` [PATCH v4 1/2] sched_ext: Change the event type from u64 to s64 Changwoo Min
  2025-03-04 10:49 ` [PATCH v4 2/2] sched_ext: Add trace point to track sched_ext core events Changwoo Min

From: Changwoo Min @ 2025-03-04 10:48 UTC
To: tj, void, arighi; +Cc: kernel-dev, linux-kernel, Changwoo Min

Add tracing support to track sched_ext core events
(/sched_ext/sched_ext_event) to debug and monitor sched_ext schedulers.
Also, change the core event type from u64 to s64 to support negative
event values.

ChangeLog v3 -> v4:
 - Replace a remaining __u64 in the tracepoint definition with __s64.

ChangeLog v2 -> v3:
 - Change the type of @delta from __u64 to __s64 and make corresponding
   changes in scx_event_stats and scx_qmap.bpf.c.

ChangeLog v1 -> v2:
 - Rename @added field to @delta for clarity.
 - Rename sched_ext_add_event to sched_ext_event.
 - Drop the @offset field to avoid the potential misuse of non-portable
   numbers.

Changwoo Min (2):
  sched_ext: Change the event type from u64 to s64
  sched_ext: Add trace point to track sched_ext core events

 include/trace/events/sched_ext.h | 19 +++++++++++++++++++
 kernel/sched/ext.c               | 22 ++++++++++++----------
 tools/sched_ext/scx_qmap.bpf.c   | 16 ++++++++--------
 3 files changed, 39 insertions(+), 18 deletions(-)

-- 
2.48.1

* [PATCH v4 1/2] sched_ext: Change the event type from u64 to s64
From: Changwoo Min @ 2025-03-04 10:48 UTC
To: tj, void, arighi; +Cc: kernel-dev, linux-kernel, Changwoo Min

The event count could be negative in the future,
so change the event type from u64 to s64.

Signed-off-by: Changwoo Min <changwoo@igalia.com>
---
 kernel/sched/ext.c             | 20 ++++++++++----------
 tools/sched_ext/scx_qmap.bpf.c | 16 ++++++++--------
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 986b655911df..686629a860f3 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1489,53 +1489,53 @@ struct scx_event_stats {
 	 * If ops.select_cpu() returns a CPU which can't be used by the task,
 	 * the core scheduler code silently picks a fallback CPU.
 	 */
-	u64 SCX_EV_SELECT_CPU_FALLBACK;
+	s64 SCX_EV_SELECT_CPU_FALLBACK;
 
 	/*
 	 * When dispatching to a local DSQ, the CPU may have gone offline in
 	 * the meantime. In this case, the task is bounced to the global DSQ.
 	 */
-	u64 SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE;
+	s64 SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE;
 
 	/*
 	 * If SCX_OPS_ENQ_LAST is not set, the number of times that a task
 	 * continued to run because there were no other tasks on the CPU.
 	 */
-	u64 SCX_EV_DISPATCH_KEEP_LAST;
+	s64 SCX_EV_DISPATCH_KEEP_LAST;
 
 	/*
 	 * If SCX_OPS_ENQ_EXITING is not set, the number of times that a task
 	 * is dispatched to a local DSQ when exiting.
 	 */
-	u64 SCX_EV_ENQ_SKIP_EXITING;
+	s64 SCX_EV_ENQ_SKIP_EXITING;
 
 	/*
 	 * If SCX_OPS_ENQ_MIGRATION_DISABLED is not set, the number of times a
 	 * migration disabled task skips ops.enqueue() and is dispatched to its
 	 * local DSQ.
 	 */
-	u64 SCX_EV_ENQ_SKIP_MIGRATION_DISABLED;
+	s64 SCX_EV_ENQ_SKIP_MIGRATION_DISABLED;
 
 	/*
 	 * The total number of tasks enqueued (or pick_task-ed) with a
 	 * default time slice (SCX_SLICE_DFL).
 	 */
-	u64 SCX_EV_ENQ_SLICE_DFL;
+	s64 SCX_EV_ENQ_SLICE_DFL;
 
 	/*
 	 * The total duration of bypass modes in nanoseconds.
 	 */
-	u64 SCX_EV_BYPASS_DURATION;
+	s64 SCX_EV_BYPASS_DURATION;
 
 	/*
 	 * The number of tasks dispatched in the bypassing mode.
 	 */
-	u64 SCX_EV_BYPASS_DISPATCH;
+	s64 SCX_EV_BYPASS_DISPATCH;
 
 	/*
 	 * The number of times the bypassing mode has been activated.
 	 */
-	u64 SCX_EV_BYPASS_ACTIVATE;
+	s64 SCX_EV_BYPASS_ACTIVATE;
 };
 
 /*
@@ -1584,7 +1584,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
  * @kind: a kind of event to dump
  */
 #define scx_dump_event(s, events, kind) do { \
-	dump_line(&(s), "%40s: %16llu", #kind, (events)->kind); \
+	dump_line(&(s), "%40s: %16lld", #kind, (events)->kind); \
 } while (0)
 
 
diff --git a/tools/sched_ext/scx_qmap.bpf.c b/tools/sched_ext/scx_qmap.bpf.c
index 45fd643d2ca0..26c40ca4f36c 100644
--- a/tools/sched_ext/scx_qmap.bpf.c
+++ b/tools/sched_ext/scx_qmap.bpf.c
@@ -776,21 +776,21 @@ static int monitor_timerfn(void *map, int *key, struct bpf_timer *timer)
 
 	__COMPAT_scx_bpf_events(&events, sizeof(events));
 
-	bpf_printk("%35s: %llu", "SCX_EV_SELECT_CPU_FALLBACK",
+	bpf_printk("%35s: %lld", "SCX_EV_SELECT_CPU_FALLBACK",
 		   scx_read_event(&events, SCX_EV_SELECT_CPU_FALLBACK));
-	bpf_printk("%35s: %llu", "SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE",
+	bpf_printk("%35s: %lld", "SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE",
 		   scx_read_event(&events, SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE));
-	bpf_printk("%35s: %llu", "SCX_EV_DISPATCH_KEEP_LAST",
+	bpf_printk("%35s: %lld", "SCX_EV_DISPATCH_KEEP_LAST",
 		   scx_read_event(&events, SCX_EV_DISPATCH_KEEP_LAST));
-	bpf_printk("%35s: %llu", "SCX_EV_ENQ_SKIP_EXITING",
+	bpf_printk("%35s: %lld", "SCX_EV_ENQ_SKIP_EXITING",
 		   scx_read_event(&events, SCX_EV_ENQ_SKIP_EXITING));
-	bpf_printk("%35s: %llu", "SCX_EV_ENQ_SLICE_DFL",
+	bpf_printk("%35s: %lld", "SCX_EV_ENQ_SLICE_DFL",
 		   scx_read_event(&events, SCX_EV_ENQ_SLICE_DFL));
-	bpf_printk("%35s: %llu", "SCX_EV_BYPASS_DURATION",
+	bpf_printk("%35s: %lld", "SCX_EV_BYPASS_DURATION",
 		   scx_read_event(&events, SCX_EV_BYPASS_DURATION));
-	bpf_printk("%35s: %llu", "SCX_EV_BYPASS_DISPATCH",
+	bpf_printk("%35s: %lld", "SCX_EV_BYPASS_DISPATCH",
 		   scx_read_event(&events, SCX_EV_BYPASS_DISPATCH));
-	bpf_printk("%35s: %llu", "SCX_EV_BYPASS_ACTIVATE",
+	bpf_printk("%35s: %lld", "SCX_EV_BYPASS_ACTIVATE",
 		   scx_read_event(&events, SCX_EV_BYPASS_ACTIVATE));
 
 	bpf_timer_start(timer, ONE_SEC_IN_NS, 0);
-- 
2.48.1

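The patch above changes both the counter type and the matching format
specifiers. As a rough userspace illustration of why the two have to change
together, the sketch below formats the same negative value once as unsigned
and once as signed. It is not kernel code: the s64/u64 typedefs and the two
helper names (format_as_unsigned, format_as_signed) are assumptions made for
this example only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel's s64/u64 types (assumption). */
typedef int64_t s64;
typedef uint64_t u64;

/*
 * Hypothetical helper: format a counter the way an unsigned ("%llu"-style)
 * dump line would. A negative value is reinterpreted modulo 2^64.
 */
int format_as_unsigned(char *buf, size_t len, const char *name, s64 val)
{
	return snprintf(buf, len, "%s: %" PRIu64, name, (u64)val);
}

/*
 * Hypothetical helper: format the same counter with a signed
 * ("%lld"-style) conversion, which keeps the sign.
 */
int format_as_signed(char *buf, size_t len, const char *name, s64 val)
{
	return snprintf(buf, len, "%s: %" PRId64, name, val);
}
```

With a value of -3, the signed helper yields "EV: -3" while the unsigned one
yields "EV: 18446744073709551613", which is why a signed event type needs a
signed conversion in every place that prints it.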
* Re: [PATCH v4 1/2] sched_ext: Change the event type from u64 to s64
From: Tejun Heo @ 2025-03-04 18:05 UTC
To: Changwoo Min; +Cc: void, arighi, kernel-dev, linux-kernel

On Tue, Mar 04, 2025 at 07:48:59PM +0900, Changwoo Min wrote:
> The event count could be negative in the future,
> so change the event type from u64 to s64.
>
> Signed-off-by: Changwoo Min <changwoo@igalia.com>

Applied to sched_ext/for-6.15.

Thanks.

-- 
tejun

* [PATCH v4 2/2] sched_ext: Add trace point to track sched_ext core events
From: Changwoo Min @ 2025-03-04 10:49 UTC
To: tj, void, arighi; +Cc: kernel-dev, linux-kernel, Changwoo Min

Add tracing support to track sched_ext core events
(/sched_ext/sched_ext_event). This may be useful for debugging sched_ext
schedulers that trigger a particular event.

The trace point can be used as other trace points, so it can be used in,
for example, `perf trace` and BPF programs, as follows:

======
$> sudo perf trace -e sched_ext:sched_ext_event --filter 'name == "SCX_EV_ENQ_SLICE_DFL"'
======

======
struct tp_sched_ext_event {
	struct trace_entry ent;
	u32 __data_loc_name;
	s64 delta;
};

SEC("tracepoint/sched_ext/sched_ext_event")
int rtp_add_event(struct tp_sched_ext_event *ctx)
{
	char event_name[128];
	unsigned short offset = ctx->__data_loc_name & 0xFFFF;
	bpf_probe_read_str((void *)event_name, 128, (char *)ctx + offset);

	bpf_printk("name %s delta %lld", event_name, ctx->delta);
	return 0;
}
======

Signed-off-by: Changwoo Min <changwoo@igalia.com>
---
 include/trace/events/sched_ext.h | 19 +++++++++++++++++++
 kernel/sched/ext.c               |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/include/trace/events/sched_ext.h b/include/trace/events/sched_ext.h
index fe19da7315a9..50e4b712735a 100644
--- a/include/trace/events/sched_ext.h
+++ b/include/trace/events/sched_ext.h
@@ -26,6 +26,25 @@ TRACE_EVENT(sched_ext_dump,
 	)
 );
 
+TRACE_EVENT(sched_ext_event,
+	TP_PROTO(const char *name, __s64 delta),
+	TP_ARGS(name, delta),
+
+	TP_STRUCT__entry(
+		__string(name, name)
+		__field( __s64, delta )
+	),
+
+	TP_fast_assign(
+		__assign_str(name);
+		__entry->delta = delta;
+	),
+
+	TP_printk("name %s delta %lld",
+		  __get_str(name), __entry->delta
+	)
+);
+
 #endif /* _TRACE_SCHED_EXT_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 686629a860f3..debcd1cf2de9 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1554,6 +1554,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
  */
 #define scx_add_event(name, cnt) do { \
 	this_cpu_add(event_stats_cpu.name, cnt); \
+	trace_sched_ext_event(#name, cnt); \
 } while(0)
 
 /**
@@ -1565,6 +1566,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
  */
 #define __scx_add_event(name, cnt) do { \
 	__this_cpu_add(event_stats_cpu.name, cnt); \
+	trace_sched_ext_event(#name, cnt); \
 } while(0)
 
 /**
-- 
2.48.1

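The two macro hunks above rely on the preprocessor's # operator: the same
token is used once as a struct member name and once, stringified, as the
event name handed to the trace point. The following is a minimal userspace
sketch of that pattern, not the kernel implementation. The struct
event_stats layout, the stats variable, fake_trace_event(), add_event() and
last_event_is() are all hypothetical stand-ins; a plain global replaces the
per-CPU counters and a recording function replaces trace_sched_ext_event().

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef int64_t s64;	/* userspace stand-in for the kernel type */

/* Hypothetical two-field subset of the event counters. */
struct event_stats {
	s64 SCX_EV_ENQ_SLICE_DFL;
	s64 SCX_EV_BYPASS_DURATION;
};

/* A plain global stands in for the per-CPU event_stats_cpu. */
struct event_stats stats;

/* Last (name, delta) pair seen by the fake trace point. */
char last_name[64];
s64 last_delta;

/* Hypothetical stand-in for trace_sched_ext_event(): just records. */
void fake_trace_event(const char *name, s64 delta)
{
	snprintf(last_name, sizeof(last_name), "%s", name);
	last_delta = delta;
}

/*
 * Same shape as the patched macro: `name` selects the struct member,
 * and #name turns the identical token into the string passed along.
 */
#define add_event(name, cnt) do {		\
	stats.name += (cnt);			\
	fake_trace_event(#name, (cnt));		\
} while (0)

/* Convenience check used when experimenting with the sketch. */
int last_event_is(const char *name)
{
	return strcmp(last_name, name) == 0;
}
```

Calling add_event(SCX_EV_BYPASS_DURATION, -5) in this sketch decrements the
counter by 5 and records the string "SCX_EV_BYPASS_DURATION" with a delta of
-5, so the counter field and the traced name cannot drift apart.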
* Re: [PATCH v4 2/2] sched_ext: Add trace point to track sched_ext core events
From: Andrea Righi @ 2025-03-04 12:20 UTC
To: Changwoo Min; +Cc: tj, void, kernel-dev, linux-kernel

On Tue, Mar 04, 2025 at 07:49:00PM +0900, Changwoo Min wrote:
> Add tracing support to track sched_ext core events
> (/sched_ext/sched_ext_event). This may be useful for debugging sched_ext
> schedulers that trigger a particular event.
>
> The trace point can be used as other trace points, so it can be used in,
> for example, `perf trace` and BPF programs, as follows:
>
> ======
> $> sudo perf trace -e sched_ext:sched_ext_event --filter 'name == "SCX_EV_ENQ_SLICE_DFL"'
> ======
>
> ======
> struct tp_sched_ext_event {
> 	struct trace_entry ent;
> 	u32 __data_loc_name;
> 	s64 delta;
> };
>
> SEC("tracepoint/sched_ext/sched_ext_event")
> int rtp_add_event(struct tp_sched_ext_event *ctx)
> {
> 	char event_name[128];
> 	unsigned short offset = ctx->__data_loc_name & 0xFFFF;
> 	bpf_probe_read_str((void *)event_name, 128, (char *)ctx + offset);
>
> 	bpf_printk("name %s delta %lld", event_name, ctx->delta);
> 	return 0;
> }
> ======
>
> Signed-off-by: Changwoo Min <changwoo@igalia.com>
> ---
>  include/trace/events/sched_ext.h | 19 +++++++++++++++++++
>  kernel/sched/ext.c               |  2 ++
>  2 files changed, 21 insertions(+)
>
> diff --git a/include/trace/events/sched_ext.h b/include/trace/events/sched_ext.h
> index fe19da7315a9..50e4b712735a 100644
> --- a/include/trace/events/sched_ext.h
> +++ b/include/trace/events/sched_ext.h
> @@ -26,6 +26,25 @@ TRACE_EVENT(sched_ext_dump,
>  	)
>  );
>
> +TRACE_EVENT(sched_ext_event,
> +	TP_PROTO(const char *name, __s64 delta),
> +	TP_ARGS(name, delta),
> +
> +	TP_STRUCT__entry(
> +		__string(name, name)
> +		__field( __s64, delta )

nit: there's an extra space/tab after delta.

But apart from that, LGTM.

Acked-by: Andrea Righi <arighi@nvidia.com>

-Andrea

> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(name);
> +		__entry->delta = delta;
> +	),
> +
> +	TP_printk("name %s delta %lld",
> +		  __get_str(name), __entry->delta
> +	)
> +);
> +
>  #endif /* _TRACE_SCHED_EXT_H */
>
>  /* This part must be outside protection */
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 686629a860f3..debcd1cf2de9 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -1554,6 +1554,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
>   */
>  #define scx_add_event(name, cnt) do { \
>  	this_cpu_add(event_stats_cpu.name, cnt); \
> +	trace_sched_ext_event(#name, cnt); \
>  } while(0)
>
>  /**
> @@ -1565,6 +1566,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
>   */
>  #define __scx_add_event(name, cnt) do { \
>  	__this_cpu_add(event_stats_cpu.name, cnt); \
> +	trace_sched_ext_event(#name, cnt); \
>  } while(0)
>
>  /**
> --
> 2.48.1
>

* Re: [PATCH v4 2/2] sched_ext: Add trace point to track sched_ext core events
From: Tejun Heo @ 2025-03-04 18:08 UTC
To: Andrea Righi; +Cc: Changwoo Min, void, kernel-dev, linux-kernel

On Tue, Mar 04, 2025 at 01:20:17PM +0100, Andrea Righi wrote:
> On Tue, Mar 04, 2025 at 07:49:00PM +0900, Changwoo Min wrote:
> > Add tracing support to track sched_ext core events
> > (/sched_ext/sched_ext_event). This may be useful for debugging sched_ext
> > schedulers that trigger a particular event.
> >
> > The trace point can be used as other trace points, so it can be used in,
> > for example, `perf trace` and BPF programs, as follows:
> >
> > ======
> > $> sudo perf trace -e sched_ext:sched_ext_event --filter 'name == "SCX_EV_ENQ_SLICE_DFL"'
> > ======
> >
> > ======
> > struct tp_sched_ext_event {
> > 	struct trace_entry ent;
> > 	u32 __data_loc_name;
> > 	s64 delta;
> > };
> >
> > SEC("tracepoint/sched_ext/sched_ext_event")
> > int rtp_add_event(struct tp_sched_ext_event *ctx)
> > {
> > 	char event_name[128];
> > 	unsigned short offset = ctx->__data_loc_name & 0xFFFF;
> > 	bpf_probe_read_str((void *)event_name, 128, (char *)ctx + offset);
> >
> > 	bpf_printk("name %s delta %lld", event_name, ctx->delta);
> > 	return 0;
> > }
> > ======
> >
> > Signed-off-by: Changwoo Min <changwoo@igalia.com>
> > ---
> >  include/trace/events/sched_ext.h | 19 +++++++++++++++++++
> >  kernel/sched/ext.c               |  2 ++
> >  2 files changed, 21 insertions(+)
> >
> > diff --git a/include/trace/events/sched_ext.h b/include/trace/events/sched_ext.h
> > index fe19da7315a9..50e4b712735a 100644
> > --- a/include/trace/events/sched_ext.h
> > +++ b/include/trace/events/sched_ext.h
> > @@ -26,6 +26,25 @@ TRACE_EVENT(sched_ext_dump,
> >  	)
> >  );
> >
> > +TRACE_EVENT(sched_ext_event,
> > +	TP_PROTO(const char *name, __s64 delta),
> > +	TP_ARGS(name, delta),
> > +
> > +	TP_STRUCT__entry(
> > +		__string(name, name)
> > +		__field( __s64, delta )
>
> nit: there's an extra space/tab after delta.

I think it's one of the common formatting styles for tp definitions. If we
don't like it, we can just change them in the future.

> But apart from that, LGTM.
>
> Acked-by: Andrea Righi <arighi@nvidia.com>

Applied to sched_ext/for-6.15.

Thanks.

-- 
tejun

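The __string(name, name) entry quoted in this thread is the reason the BPF
example reads a __data_loc_name field and masks it with 0xFFFF. A dynamic
string field is stored in the record as one 32-bit word, with the offset of
the string from the start of the record in the low 16 bits and its length in
the high 16 bits. The sketch below decodes such a word in plain userspace C
under that assumption; the helper names (data_loc_offset, data_loc_len,
data_loc_str) are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Low 16 bits: byte offset of the string from the start of the record. */
uint16_t data_loc_offset(uint32_t data_loc)
{
	return (uint16_t)(data_loc & 0xFFFF);
}

/* High 16 bits: length of the stored data in bytes. */
uint16_t data_loc_len(uint32_t data_loc)
{
	return (uint16_t)(data_loc >> 16);
}

/*
 * Resolve the string inside a record that is already readable from the
 * caller's address space. In a BPF program the pointer arithmetic is the
 * same, but the copy goes through a probe-read helper, as in the example
 * from the commit message.
 */
const char *data_loc_str(const void *record, uint32_t data_loc)
{
	return (const char *)record + data_loc_offset(data_loc);
}
```

For a record laid out as "hdr:SCX_EV" and a word of (7 << 16) | 4, the
offset helper returns 4, the length helper returns 7, and data_loc_str()
points at "SCX_EV". Masking with 0xFFFF alone, as the example in the commit
message does, is enough when only a NUL-terminated string is needed.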