linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Kan Liang <kan.liang@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Indu Bhagat <indu.bhagat@oracle.com>,
	linux-toolchains@vger.kernel.org
Subject: Re: [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2)
Date: Mon, 23 Sep 2024 16:07:26 -0700	[thread overview]
Message-ID: <ZvH0rq-J3_ixe3wU@google.com> (raw)
In-Reply-To: <CAP-5=fUT-60ff-hPS9Anizq9XxiqAOJXYyQqeVoHBYzMvd0L=g@mail.gmail.com>

Hi Ian,

On Wed, Sep 18, 2024 at 03:39:31PM +0200, Ian Rogers wrote:
> On Wed, Sep 18, 2024 at 11:38 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Wed, Sep 18, 2024 at 08:38:22AM +0200, Ian Rogers wrote:
> > > On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > Hello,
> > > >
> > > > This is a counterpart for Josh's kernel change v2 [1] to support deferred
> > > > user callchains.  The change is transparent and users should not notice
> > > > anything with the deferred callchains.
> > > >
> > > >   $ perf record -g sleep 1
> > > >
> > > > I added --[no-]merge-callchains option to control output of perf script.
> > > > You can verify it has the deferred callchains like this:
> > > >
> > > >   $ perf script --no-merge-callchains
> > > >   perf     801 [000]    18.031793:          1 cycles:P:
> > > >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> > > >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> > > >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> > > >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> > > >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> > > >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> > > >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> > > >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> > > >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> > > >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> > > >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> > > >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> > > >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> > > >
> > > >   perf     801 [000]    18.031814: DEFERRED CALLCHAIN
> > > >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > > >
> > > >   ...
> > > >
> > > > When the callchain is merged (it's the default) it'd look like below:
> > > >
> > > >   $ perf script
> > > >   perf     801 [000]    18.031793:          1 cycles:P:
> > > >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> > > >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> > > >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> > > >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> > > >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> > > >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> > > >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> > > >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> > > >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> > > >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> > > >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> > > >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> > > >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> > > >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > > >
> > > >   ...
> > > >
> > > > Notice that the last line and it has the __GI___ioctl in the same
> > > > callchain.  It should work with other tools like perf report.
> > >
> > > Hi Namhyung, I think this is interesting work!
> > >
> > > The issue feels similar to leader sampling and some of the unpicking
> > > of that we've been dealing with. With leader sampling it was added and
> > > then the dispatch of events modified so that tools wouldn't see leader
> > > samples, instead new events would be synthesized based on the leader
> > > sample data. However, the leader sample event wasn't changed and so
> > > now we have multiple repeated events and perf inject wouldn't just
> > > pass through a perf data file.
> > >
> > > What I'm expecting based on this description is that a deferred call
> > > chain will be merged with a regular one, however, perf inject isn't
> > > updated to drop the deferred callchain so now we have the deferred
> > > callchain event twice.
> > >
> > > My feeling is that making the dispatch of events to tools "smart" is a
> > > false economy. Tools can add handlers for these events easily enough.
> > > What's harder is undoing the smartness when it does things that lead
> > > to duplicated events and the like. I'm not a fan of how leader
> > > sampling was implemented and I still think it odd that with perf
> > > script we see invented events when trying to just dump the contents of
> > > a perf.data file.
> >
> > That's why I added perf_tool.merge_deferred_callchains flag to control
> > the behavior.  I haven't implemented it to perf inject because it covers
> > a couple of different use cases.  I believe the default behavior is to
> > not invoke the callback for deferred callchains during perf inject and
> > each sample will get the full callchains.  But you can add a new
> > callback and set perf_tool.merge_deferred_callchains to false.
> 
> I wonder if there is a different strategy for handling this. Normally
> with a visitor pattern you fail when you call an unimplemented
> visitor, this is then a signal the (in our case) tool needs to handle
> the new case. This avoids naively doing things like making perf inject
> duplicate events. The equivalent in the perf code would be to
> initialize the callbacks in the tool constructor to be to stubs that
> abort, then explicitly initialize and use things like callchain
> merging as appropriate. The whole booleans next to the callbacks feels
> like a kludge and likely to hide bugs. It is also marginally less
> efficient.

Well.. we might change it that way later, but I just wanted to test the
deferred callchains quickly in this series.

Thanks,
Namhyung


  reply	other threads:[~2024-09-23 23:07 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-17 22:28 [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Namhyung Kim
2024-09-17 22:28 ` [PATCH 1/5] perf tools: Sync UAPI perf_event.h header Namhyung Kim
2024-09-17 22:28 ` [PATCH 2/5] perf tools: Minimal DEFERRED_CALLCHAIN support Namhyung Kim
2024-09-17 22:28 ` [PATCH 3/5] perf record: Enable defer_callchain for user callchains Namhyung Kim
2024-09-17 22:28 ` [PATCH 4/5] perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED Namhyung Kim
2024-09-17 22:28 ` [PATCH 5/5] perf tools: Merge deferred user callchains Namhyung Kim
2024-09-18  6:38 ` [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains (v2) Ian Rogers
2024-09-18  9:38   ` Namhyung Kim
2024-09-18 13:39     ` Ian Rogers
2024-09-23 23:07       ` Namhyung Kim [this message]
2024-09-18 20:26 ` Liang, Kan
2024-09-23 23:08   ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZvH0rq-J3_ixe3wU@google.com \
    --to=namhyung@kernel.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=indu.bhagat@oracle.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=jpoimboe@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).