linux-trace-kernel.vger.kernel.org archive mirror
From: Steven Rostedt <rostedt@goodmis.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@kernel.org>,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ian Rogers <irogers@google.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jiri Olsa <jolsa@kernel.org>,
	Douglas Raillard <douglas.raillard@arm.com>
Subject: Re: [POC][RFC][PATCH 0/3] tracing: Add perf events to trace buffer
Date: Tue, 18 Nov 2025 11:24:52 -0500
Message-ID: <20251118112452.61c7de68@gandalf.local.home>
In-Reply-To: <aRwfhIT4pJ0pbY2k@google.com>

On Mon, 17 Nov 2025 23:25:56 -0800
Namhyung Kim <namhyung@kernel.org> wrote:

> > As for the perf event that is triggered, it is currently a dynamic array
> > of 64 bit values. Each value is broken up into 8 bits for the type of perf
> > event and 56 bits for the counter. It only writes a per CPU raw counter
> > and does not do any math. That would need to be done by any post
> > processing.
> 
> If you want to keep the perf events per CPU, you may consider CPU
> migrations for the func-graph case.  Otherwise userspace may not
> calculate the diff from the beginning correctly.

That's easily solved by user space too, by adding a sched_switch perf event
trigger. ;-)
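
As a rough illustration of the post processing described in the quote
above, here is a minimal Python sketch that splits one 64-bit entry into
its type and counter and takes the difference of two raw counters. The
thread only says "8 bits for the type, 56 bits for the counter"; placing
the type in the top 8 bits is an assumption made here, and the names
unpack(), pack() and delta() are hypothetical.

```python
# Sketch only. ASSUMPTION: the 8-bit type occupies the top bits of the
# 64-bit value and the 56-bit counter the low bits. The thread does not
# state the bit positions.
COUNTER_BITS = 56
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def unpack(value):
    """Split one 64-bit entry into (event_type, raw_counter)."""
    return (value >> COUNTER_BITS) & 0xFF, value & COUNTER_MASK


def pack(event_type, counter):
    """Inverse of unpack(); the counter is truncated to 56 bits."""
    return ((event_type & 0xFF) << COUNTER_BITS) | (counter & COUNTER_MASK)


def delta(start, end):
    """Difference of two raw 56-bit counters, tolerating one wraparound."""
    return (end - start) & COUNTER_MASK
```

Because only 56 bits of the counter are stored, the subtraction is done
modulo 2^56 so that a single wraparound between two samples still yields
the right difference.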

> Just FYI, I did a similar thing (like the fgraph case) in uftrace, and I
> grouped two related events to produce a metric.
> 
>   $ uftrace -T a@read=pmu-cycle ~/tmp/abc
>   # DURATION     TID      FUNCTION
>               [ 521741] | main() {
>               [ 521741] |   a() {
>               [ 521741] |     /* read:pmu-cycle (cycles=482 instructions=38) */
>               [ 521741] |     b() {
>               [ 521741] |       c() {
>      0.659 us [ 521741] |         getpid();
>      1.600 us [ 521741] |       } /* c */
>      1.780 us [ 521741] |     } /* b */
>               [ 521741] |     /* diff:pmu-cycle (cycles=+7361 instructions=+3955 IPC=0.54) */
>     24.485 us [ 521741] |   } /* a */
>     34.797 us [ 521741] | } /* main */
> 
> It reads the cycles and instructions events (specified by 'pmu-cycle') at
> the entry and exit of the given function ('a') and shows the diff along
> with the IPC metric.
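
For reference, the IPC on the "diff:" line is just the instruction delta
divided by the cycle delta. A minimal Python sketch of that arithmetic
follows; the entry readings come from the "read:" line, while the exit
readings are hypothetical values reconstructed as entry plus the printed
deltas, and ipc() is an illustrative helper, not uftrace code.

```python
def ipc(cycles_delta, instructions_delta):
    """Instructions per cycle over an interval; None if no cycles elapsed."""
    if cycles_delta <= 0:
        return None
    return instructions_delta / cycles_delta


# Entry readings are taken from the "read:" line of the uftrace output.
entry = {"cycles": 482, "instructions": 38}
# Exit readings are hypothetical: entry plus the deltas on the "diff:" line.
exit_ = {"cycles": 482 + 7361, "instructions": 38 + 3955}

diff = {name: exit_[name] - entry[name] for name in entry}
print("cycles=+%d instructions=+%d IPC=%.2f"
      % (diff["cycles"], diff["instructions"],
         ipc(diff["cycles"], diff["instructions"])))
# prints: cycles=+7361 instructions=+3955 IPC=0.54
```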

I originally tried to implement this, but it became more complex than I
wanted in the kernel. I would need to add a hook in sched_switch, record
the perf event counter there, and keep track of it for every task. That
would require memory to be saved somewhere. I started adding it to the
function graph shadow stack and then decided that it would be so much
easier to let user space figure it out.

By running the function graph tracer and showing the start and end
counters, as well as the counters at the sched_switch trace event, user
space can do all the math and accounting, and the code in the kernel can
remain simple.
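
The user space accounting described above could look roughly like the
Python sketch below. It is only an illustration under stated assumptions:
the event records (dicts with "kind", "cpu", "tid", "prev", "next" and
"counter") are a hypothetical, already-parsed format and not the actual
trace output, events are assumed to be in time order, and only one traced
function per task is handled (no nesting). It relies on a task staying on
one CPU between being switched in and switched out, so each running
segment is measured against a single per CPU counter.

```python
COUNTER_MASK = (1 << 56) - 1  # counters are stored in 56 bits


def task_deltas(events):
    """Return {tid: counter delta accumulated while tid ran in the function}."""
    running_since = {}  # tid -> raw counter where its current segment began
    total = {}          # tid -> delta accumulated over finished segments
    done = {}
    for ev in events:
        kind, counter = ev["kind"], ev["counter"]
        if kind == "entry":
            total[ev["tid"]] = 0
            running_since[ev["tid"]] = counter
        elif kind == "switch":
            # Close the segment of the task being scheduled out ...
            start = running_since.pop(ev["prev"], None)
            if start is not None:
                total[ev["prev"]] += (counter - start) & COUNTER_MASK
            # ... and open one for a traced task being scheduled in,
            # using the counter of the CPU it now runs on.
            if ev["next"] in total:
                running_since[ev["next"]] = counter
        elif kind == "exit":
            start = running_since.pop(ev["tid"], None)
            if start is not None:
                total[ev["tid"]] += (counter - start) & COUNTER_MASK
            done[ev["tid"]] = total.pop(ev["tid"], 0)
    return done


# Hypothetical trace: task 100 enters on CPU 0, is scheduled out, then
# resumes and exits on CPU 1, whose raw counter is unrelated to CPU 0's.
events = [
    {"kind": "entry", "cpu": 0, "tid": 100, "counter": 1000},
    {"kind": "switch", "cpu": 0, "prev": 100, "next": 200, "counter": 1500},
    {"kind": "switch", "cpu": 1, "prev": 300, "next": 100, "counter": 90000},
    {"kind": "exit", "cpu": 1, "tid": 100, "counter": 90250},
]
print(task_deltas(events))  # prints: {100: 750}
```

A naive end-minus-start subtraction here would give 89250, which mixes
the counters of two different CPUs; summing the per-segment differences
(500 on CPU 0 plus 250 on CPU 1) avoids that.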

-- Steve
