Re: [RFC/PATCH] perf report: Support latency profiling in system-wide mode

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Namhyung Kim <namhyung@kernel.org>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org, Andi Kleen <ak@linux.intel.com>
Subject: Re: [RFC/PATCH] perf report: Support latency profiling in system-wide mode
Date: Mon, 5 May 2025 23:43:38 -0700	[thread overview]
Message-ID: <aBmvmmRKpeVd6aT3@google.com> (raw)
In-Reply-To: <CACT4Y+ameQFd3n=u+bjd+vKR6svShp3NNQzjsUo_UUBCZPzrBw@mail.gmail.com>

On Tue, May 06, 2025 at 07:55:25AM +0200, Dmitry Vyukov wrote:
> On Tue, 6 May 2025 at 07:30, Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hello,
> >
> > On Mon, May 05, 2025 at 10:08:17AM +0200, Dmitry Vyukov wrote:
> > > On Sat, 3 May 2025 at 02:36, Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > When it profile a target process (and its children), it's
> > > > straight-forward to track parallelism using sched-switch info.  The
> > > > parallelism is kept in machine-level in this case.
> > > >
> > > > But when it profile multiple processes like in the system-wide mode,
> > > > it might not be clear how to apply the (machine-level) parallelism to
> > > > different tasks.  That's why it disabled the latency profiling for
> > > > system-wide mode.
> > > >
> > > > But it should be able to track parallelism in each process and it'd
> > > > useful to profile latency issues in multi-threaded programs.  So this
> > > > patch tries to enable it.
> > > >
> > > > However using sched-switch info can be a problem since it may emit a lot
> > > > more data and more chances for losing data when perf cannot keep up with
> > > > it.
> > > >
> > > > Instead, it can maintain the current process for each CPU when it sees
> > > > samples.
> > >
> > > Interesting.
> > >
> > > Few questions:
> > > 1. Do we always see a CPU sample when a CPU becomes idle? Otherwise we
> > > will think that the last thread runs on that CPU for arbitrary long,
> > > when it's actually not.
> >
> > No, it's not guaranteed to have a sample for idle tasks.  So right, it
> > can mis-calculate the parallelism for the last task.  If we can emit
> > sched-switches only when it goes to the idle task, it'd be accurate.
> 
> Then I think the profile can be significantly off if the system wasn't
> ~100% loaded, right?

Yep, it can be.

> 
> > > 2. If yes, can we also lose that "terminating" even when a CPU becomes
> > > idle? If yes, then it looks equivalent to missing a context switch
> > > event.
> >
> > I'm not sure what you are asking.  When it lose some records because the
> > buffer is full, it'll see the task of the last sample on each CPU.
> > Maybe we want to reset the current task after PERF_RECORD_LOST.
> 
> This probably does not matter much if the answer to question 1 is No.
> 
> But what I was is the following:
> 
> let's say we have samples:
> Sample 1 for Pid 42 on Cpu 10
> Sample 2 for idle task on Cpu 10
> ... no samples for some time on Cpu 10 ...
> 
> When we process sample 2, we decrement the counter for running tasks
> for Pid 42, right.
> Now if sample 2 is lost, then we don't do decrement and the accounting
> becomes off.
> In a sense this is equivalent to the problem of losing context switch event.

Right.  But I think it's hard to be correct once it loses something.

> 
> 
> > > 3. Does this mode kick in even for non system-wide profiles (collected
> > > w/o context switch events)? If yes, do we properly understand when a
> > > thread stops running for such profiles? How do we do that? There won't
> > > be samples for idle/other tasks.
> >
> > For non system-wide profiles, the problem is that it cannot know when
> > the current task is scheduled out so that it can decrease the count of
> > parallelism.  So this approach cannot work and sched-switch info is
> > required.
> 
> Where does the patch check that this mode is used only for system-wide profiles?
> Is it that PERF_SAMPLE_CPU present only for system-wide profiles?

Basically yes, but you can use --sample-cpu to add it.

In util/evsel.c::evsel__config():

	if (target__has_cpu(&opts->target) || opts->sample_cpu)
		evsel__set_sample_bit(evsel, CPU);

Thanks,
Namhyung

next prev parent reply	other threads:[~2025-05-06  6:43 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-03  0:36 [RFC/PATCH] perf report: Support latency profiling in system-wide mode Namhyung Kim
2025-05-04  8:22 ` Ingo Molnar
2025-05-04 19:52   ` Namhyung Kim
2025-05-05  8:03     ` Dmitry Vyukov
2025-05-06 14:57       ` Arnaldo Carvalho de Melo
2025-05-06 16:58         ` Arnaldo Carvalho de Melo
2025-05-07  9:58           ` Dmitry Vyukov
2025-05-07 15:47             ` Arnaldo Carvalho de Melo
2025-05-07 23:51               ` Namhyung Kim
2025-05-05  8:08 ` Dmitry Vyukov
2025-05-06  5:30   ` Namhyung Kim
2025-05-06  5:55     ` Dmitry Vyukov
2025-05-06  6:43       ` Namhyung Kim [this message]
2025-05-06  6:46         ` Dmitry Vyukov
2025-05-06  7:09           ` Namhyung Kim
2025-05-06  7:40             ` Dmitry Vyukov
2025-05-07 23:43               ` Namhyung Kim
2025-05-08 12:24                 ` Dmitry Vyukov
2025-05-16 16:33                   ` Namhyung Kim
2025-05-19  6:00                     ` Dmitry Vyukov
2025-05-20  1:43                       ` Namhyung Kim
2025-05-20  6:45                         ` Dmitry Vyukov
2025-05-20 22:50                           ` Namhyung Kim
2025-05-21  7:30                             ` Dmitry Vyukov
2025-05-27  7:14                               ` Dmitry Vyukov
2025-05-28 18:38                                 ` Namhyung Kim
2025-05-30  5:50                                   ` Dmitry Vyukov
2025-05-30 22:05                                     ` Namhyung Kim
2025-05-31  6:31                                       ` Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aBmvmmRKpeVd6aT3@google.com \
    --to=namhyung@kernel.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=ak@linux.intel.com \
    --cc=dvyukov@google.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox