* [PATCH v2] sched_ext: Add trace point to track sched_ext core events
@ 2025-02-28 8:59 Changwoo Min
2025-02-28 10:03 ` Andrea Righi
0 siblings, 1 reply; 6+ messages in thread
From: Changwoo Min @ 2025-02-28 8:59 UTC (permalink / raw)
To: tj, void, arighi; +Cc: kernel-dev, linux-kernel, Changwoo Min
Add tracing support to track sched_ext core events
(/sched_ext/sched_ext_event). This may be useful for debugging sched_ext
schedulers that trigger a particular event.
The trace point can be used as other trace points, so it can be used in,
for example, `perf trace` and BPF programs, as follows:
======
$> sudo perf trace -e sched_ext:sched_ext_event --filter 'name == "SCX_EV_ENQ_SLICE_DFL"'
======
======
struct tp_sched_ext_event {
struct trace_entry ent;
u32 __data_loc_name;
u64 delta;
};
SEC("tracepoint/sched_ext/sched_ext_event")
int rtp_add_event(struct tp_sched_ext_event *ctx)
{
char event_name[128];
unsigned short offset = ctx->__data_loc_name & 0xFFFF;
bpf_probe_read_str((void *)event_name, 128, (char *)ctx + offset);
bpf_printk("name %s delta %llu", event_name, ctx->delta);
return 0;
}
======
Signed-off-by: Changwoo Min <changwoo@igalia.com>
---
ChangeLog v1 -> v2:
- Rename @added field to @delta for clarity.
- Rename sched_ext_add_event to sched_ext_event.
- Drop the @offset field to avoid the potential misuse of non-portable numbers.
include/trace/events/sched_ext.h | 19 +++++++++++++++++++
kernel/sched/ext.c | 2 ++
2 files changed, 21 insertions(+)
diff --git a/include/trace/events/sched_ext.h b/include/trace/events/sched_ext.h
index fe19da7315a9..b73499981682 100644
--- a/include/trace/events/sched_ext.h
+++ b/include/trace/events/sched_ext.h
@@ -26,6 +26,25 @@ TRACE_EVENT(sched_ext_dump,
)
);
+TRACE_EVENT(sched_ext_event,
+ TP_PROTO(const char *name, __u64 delta),
+ TP_ARGS(name, delta),
+
+ TP_STRUCT__entry(
+ __string(name, name)
+ __field( __u64, delta )
+ ),
+
+ TP_fast_assign(
+ __assign_str(name);
+ __entry->delta = delta;
+ ),
+
+ TP_printk("name %s delta %llu",
+ __get_str(name), __entry->delta
+ )
+);
+
#endif /* _TRACE_SCHED_EXT_H */
/* This part must be outside protection */
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 986b655911df..53729c584b63 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1554,6 +1554,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
*/
#define scx_add_event(name, cnt) do { \
this_cpu_add(event_stats_cpu.name, cnt); \
+ trace_sched_ext_event(#name, cnt); \
} while(0)
/**
@@ -1565,6 +1566,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
*/
#define __scx_add_event(name, cnt) do { \
__this_cpu_add(event_stats_cpu.name, cnt); \
+ trace_sched_ext_event(#name, cnt); \
} while(0)
/**
--
2.48.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2] sched_ext: Add trace point to track sched_ext core events
2025-02-28 8:59 [PATCH v2] sched_ext: Add trace point to track sched_ext core events Changwoo Min
@ 2025-02-28 10:03 ` Andrea Righi
2025-02-28 17:31 ` Tejun Heo
0 siblings, 1 reply; 6+ messages in thread
From: Andrea Righi @ 2025-02-28 10:03 UTC (permalink / raw)
To: Changwoo Min; +Cc: tj, void, kernel-dev, linux-kernel
On Fri, Feb 28, 2025 at 05:59:44PM +0900, Changwoo Min wrote:
> Add tracing support to track sched_ext core events
> (/sched_ext/sched_ext_event). This may be useful for debugging sched_ext
> schedulers that trigger a particular event.
>
> The trace point can be used as other trace points, so it can be used in,
> for example, `perf trace` and BPF programs, as follows:
>
> ======
> $> sudo perf trace -e sched_ext:sched_ext_event --filter 'name == "SCX_EV_ENQ_SLICE_DFL"'
> ======
>
> ======
> struct tp_sched_ext_event {
> struct trace_entry ent;
> u32 __data_loc_name;
> u64 delta;
> };
>
> SEC("tracepoint/sched_ext/sched_ext_event")
> int rtp_add_event(struct tp_sched_ext_event *ctx)
> {
> char event_name[128];
> unsigned short offset = ctx->__data_loc_name & 0xFFFF;
> bpf_probe_read_str((void *)event_name, 128, (char *)ctx + offset);
>
> bpf_printk("name %s delta %llu", event_name, ctx->delta);
> return 0;
> }
> ======
>
> Signed-off-by: Changwoo Min <changwoo@igalia.com>
> ---
>
> ChangeLog v1 -> v2:
> - Rename @added field to @delta for clarity.
> - Rename sched_ext_add_event to sched_ext_event.
> - Drop the @offset field to avoid the potential misuse of non-portable numbers.
>
> include/trace/events/sched_ext.h | 19 +++++++++++++++++++
> kernel/sched/ext.c | 2 ++
> 2 files changed, 21 insertions(+)
>
> diff --git a/include/trace/events/sched_ext.h b/include/trace/events/sched_ext.h
> index fe19da7315a9..b73499981682 100644
> --- a/include/trace/events/sched_ext.h
> +++ b/include/trace/events/sched_ext.h
> @@ -26,6 +26,25 @@ TRACE_EVENT(sched_ext_dump,
> )
> );
>
> +TRACE_EVENT(sched_ext_event,
> + TP_PROTO(const char *name, __u64 delta),
> + TP_ARGS(name, delta),
> +
> + TP_STRUCT__entry(
> + __string(name, name)
> + __field( __u64, delta )
I'm wondering if we should use a __s64 here (and %lld below). We don't have
negative deltas right now, but in the future who knows...
Apart than that, everything else looks good to me.
Thanks,
-Andrea
> + ),
> +
> + TP_fast_assign(
> + __assign_str(name);
> + __entry->delta = delta;
> + ),
> +
> + TP_printk("name %s delta %llu",
> + __get_str(name), __entry->delta
> + )
> +);
> +
> #endif /* _TRACE_SCHED_EXT_H */
>
> /* This part must be outside protection */
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 986b655911df..53729c584b63 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -1554,6 +1554,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
> */
> #define scx_add_event(name, cnt) do { \
> this_cpu_add(event_stats_cpu.name, cnt); \
> + trace_sched_ext_event(#name, cnt); \
> } while(0)
>
> /**
> @@ -1565,6 +1566,7 @@ static DEFINE_PER_CPU(struct scx_event_stats, event_stats_cpu);
> */
> #define __scx_add_event(name, cnt) do { \
> __this_cpu_add(event_stats_cpu.name, cnt); \
> + trace_sched_ext_event(#name, cnt); \
> } while(0)
>
> /**
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] sched_ext: Add trace point to track sched_ext core events
2025-02-28 10:03 ` Andrea Righi
@ 2025-02-28 17:31 ` Tejun Heo
2025-02-28 23:33 ` Changwoo Min
0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2025-02-28 17:31 UTC (permalink / raw)
To: Andrea Righi; +Cc: Changwoo Min, void, kernel-dev, linux-kernel
On Fri, Feb 28, 2025 at 11:03:54AM +0100, Andrea Righi wrote:
> > +TRACE_EVENT(sched_ext_event,
> > + TP_PROTO(const char *name, __u64 delta),
> > + TP_ARGS(name, delta),
> > +
> > + TP_STRUCT__entry(
> > + __string(name, name)
> > + __field( __u64, delta )
>
> I'm wondering if we should use a __s64 here (and %lld below). We don't have
> negative deltas right now, but in the future who knows...
>
> Apart than that, everything else looks good to me.
And let's also print out the updated value.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] sched_ext: Add trace point to track sched_ext core events
2025-02-28 17:31 ` Tejun Heo
@ 2025-02-28 23:33 ` Changwoo Min
2025-02-28 23:50 ` Tejun Heo
0 siblings, 1 reply; 6+ messages in thread
From: Changwoo Min @ 2025-02-28 23:33 UTC (permalink / raw)
To: Tejun Heo, Andrea Righi; +Cc: void, kernel-dev, linux-kernel
Hi Tejun and Andrea,
On 25. 3. 1. 02:31, Tejun Heo wrote:
> On Fri, Feb 28, 2025 at 11:03:54AM +0100, Andrea Righi wrote:
>>> +TRACE_EVENT(sched_ext_event,
>>> + TP_PROTO(const char *name, __u64 delta),
>>> + TP_ARGS(name, delta),
>>> +
>>> + TP_STRUCT__entry(
>>> + __string(name, name)
>>> + __field( __u64, delta )
>>
>> I'm wondering if we should use a __s64 here (and %lld below). We don't have
>> negative deltas right now, but in the future who knows...
That makes sense. I will change it to __s64 in the tracepoint.
Also, I will make corresponding changes (u64 -> s64) in other
places, including struct scx_event_stats.
>
> And let's also print out the updated value.
You might have two options here: 1) returning per-CPU event
counter or 2) returning aggregated event counter. The first opion
will be fast but less meaningful from user's point of view
compared to the second option. Assuming the tracepoint are not in
the hot path, I think the second option will be better choice.
I will add an @event field and a special version of
scx_bpf_events() for faster aggregation.
Thanks!
Changwoo Min
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] sched_ext: Add trace point to track sched_ext core events
2025-02-28 23:33 ` Changwoo Min
@ 2025-02-28 23:50 ` Tejun Heo
2025-03-01 0:46 ` Changwoo Min
0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2025-02-28 23:50 UTC (permalink / raw)
To: Changwoo Min; +Cc: Andrea Righi, void, kernel-dev, linux-kernel
Hello,
On Sat, Mar 01, 2025 at 08:33:36AM +0900, Changwoo Min wrote:
...
> You might have two options here: 1) returning per-CPU event
> counter or 2) returning aggregated event counter. The first opion
> will be fast but less meaningful from user's point of view
> compared to the second option. Assuming the tracepoint are not in
> the hot path, I think the second option will be better choice.
> I will add an @event field and a special version of
> scx_bpf_events() for faster aggregation.
Ah, right, let's forget about printing aggregate for now. That's not
difficult for anyone to find out if necessary after all.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] sched_ext: Add trace point to track sched_ext core events
2025-02-28 23:50 ` Tejun Heo
@ 2025-03-01 0:46 ` Changwoo Min
0 siblings, 0 replies; 6+ messages in thread
From: Changwoo Min @ 2025-03-01 0:46 UTC (permalink / raw)
To: Tejun Heo; +Cc: Andrea Righi, void, kernel-dev, linux-kernel
On 25. 3. 1. 08:50, Tejun Heo wrote:
> Hello,
>
> On Sat, Mar 01, 2025 at 08:33:36AM +0900, Changwoo Min wrote:
> ...
>> You might have two options here: 1) returning per-CPU event
>> counter or 2) returning aggregated event counter. The first opion
>> will be fast but less meaningful from user's point of view
>> compared to the second option. Assuming the tracepoint are not in
>> the hot path, I think the second option will be better choice.
>> I will add an @event field and a special version of
>> scx_bpf_events() for faster aggregation.
>
> Ah, right, let's forget about printing aggregate for now. That's not
> difficult for anyone to find out if necessary after all.
Okay. I will send a new version with __s64 changes.
Thanks!
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-03-01 0:46 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-28 8:59 [PATCH v2] sched_ext: Add trace point to track sched_ext core events Changwoo Min
2025-02-28 10:03 ` Andrea Righi
2025-02-28 17:31 ` Tejun Heo
2025-02-28 23:33 ` Changwoo Min
2025-02-28 23:50 ` Tejun Heo
2025-03-01 0:46 ` Changwoo Min
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.