From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Howard Chu <howardchu95@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org,
	kan.liang@linux.intel.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 0/5] Dump off-cpu samples directly
Date: Wed, 31 Jul 2024 15:23:33 -0300
Message-ID: <ZqqBJbYAsLSSJAoX@x1>
In-Reply-To: <Zqp4aoqhPgW7AOQA@google.com>

On Wed, Jul 31, 2024 at 10:46:18AM -0700, Namhyung Kim wrote:
> On Mon, Jul 29, 2024 at 11:24:47PM +0800, Howard Chu wrote:
> > On Mon, Jul 29, 2024 at 9:21 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > On Fri, Jul 26, 2024 at 06:28:21PM +0800, Howard Chu wrote:
> > > > As mentioned in: https://bugzilla.kernel.org/show_bug.cgi?id=207323

> > > > Currently, off-cpu samples are dumped when perf record is exiting, so
> > > > they end up after the regular samples. This patch series makes it
> > > > possible to dump off-cpu samples on the fly, directly into the perf
> > > > ring buffer, and dispatches those samples in the correct format for
> > > > perf.data consumers.

> > > Thanks for your work!

> > > But I'm not sure we need a separate event for offcpu-time-direct.  If we
> > > fix the format for the direct event, we can adjust the format of offcpu-
> > > time when it dumps at the end.

> > Thank you and Ian for this advice, I'll do that.

> > > Anyway, as far as I can see you don't need to fill the sample info in
> > > the offcpu-time-direct manually in your BPF program.  Because the
> > > bpf_perf_event_output() will call perf_event_output() which fills all
> > > the sample information according to the sample_type flags.

> > > Well.. it'll set IP to the schedule function, but it should be ok.
> > > (updating IP using CALLCHAIN like in off_cpu_write() is a kinda hack and
> > > > not absolutely necessary, probably I can get rid of it..  Let's go with
> > > simple for now.)

> > > > So I think what you need is to ensure it has the necessary flags.  And
> > > the only info it needs to fill is the time between the previous schedule
> > > and this can be added to the raw data.

I wonder if there is other kernel information about things that may
have affected how long the task stayed off-cpu, maybe system load
average or C/P states. That may well be overengineering, but I was
thinking about what else could help in understanding the off-cpu time
and can be obtained from the vantage point of BPF attached to sched-out
and sched-in: things we could collect at sched-out in addition to the
timestamp, and ditto at sched-in. I'm no scheduler expert, so take this
just as some brainstorming. Maybe we could have some sort of sample_type
specific to this off-cpu event that would allow us to ask for extra info
in an extensible way. We can start with just
PERF_OFFCPU_SAMPLE_TIMESTAMP...
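
To make the "extensible sample_type" idea concrete, here is a minimal
sketch in plain C. It is only an illustration: nothing in it exists in
the kernel or in perf today. PERF_OFFCPU_SAMPLE_TIMESTAMP is the only
name taken from this thread; PERF_OFFCPU_SAMPLE_LOADAVG, struct
offcpu_info and offcpu_pack() are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags: none of these exist in the kernel or in perf. */
enum offcpu_sample_type {
	PERF_OFFCPU_SAMPLE_TIMESTAMP = 1U << 0, /* the only one named in the thread */
	PERF_OFFCPU_SAMPLE_LOADAVG   = 1U << 1, /* made-up example of an extra field */
	PERF_OFFCPU_SAMPLE_MAX       = 1U << 2, /* first unused bit */
};

/* Values a BPF program might have gathered; field names are illustrative. */
struct offcpu_info {
	uint64_t timestamp;
	uint64_t loadavg;
};

/*
 * Pack the requested fields into 'raw' in ascending bit order, so that a
 * consumer holding the same mask can unpack them.  Returns the number of
 * u64 words written, or 0 if 'raw' is too small or an unknown bit is set.
 */
size_t offcpu_pack(uint32_t type, const struct offcpu_info *info,
		   uint64_t *raw, size_t max_words)
{
	size_t n = 0;

	assert(info && raw);

	/* Reject bits this version does not know about. */
	if (type & ~(uint32_t)(PERF_OFFCPU_SAMPLE_MAX - 1))
		return 0;

	if (type & PERF_OFFCPU_SAMPLE_TIMESTAMP) {
		if (n == max_words)
			return 0;
		raw[n++] = info->timestamp;
	}
	if (type & PERF_OFFCPU_SAMPLE_LOADAVG) {
		if (n == max_words)
			return 0;
		raw[n++] = info->loadavg;
	}
	return n;
}
```

Adding a new kind of information would then mean one more bit and one
more "if" block, and payloads written with only the timestamp bit keep
the same layout.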

> > Sure thing, thank you. Other than the off-cpu duration, do you think
> > we should collect the callchain as well?
 
> I think the kernel will do that for you once you set the
> SAMPLE_CALLCHAIN flag in the event attr.
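
As a rough illustration of the point above, this is how an event attr
for a bpf-output event could ask the kernel for the callchain and the
other usual fields. The constants come from <linux/perf_event.h>; the
helper name offcpu_init_attr() and the exact set of flags are
assumptions made for this sketch, not what perf record actually uses.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Fill 'attr' for a bpf-output event so that the kernel itself attaches
 * the usual sample fields.  The chosen flags are illustrative only.
 */
void offcpu_init_attr(struct perf_event_attr *attr)
{
	assert(attr);
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_BPF_OUTPUT;
	attr->sample_period = 1;
	attr->sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_TIME | PERF_SAMPLE_CPU |
			    PERF_SAMPLE_CALLCHAIN | /* kernel collects the stack */
			    PERF_SAMPLE_RAW;        /* carries the BPF payload */
}
```

With PERF_SAMPLE_CALLCHAIN set, the BPF program does not have to walk
the stack itself; only the data behind PERF_SAMPLE_RAW is its job.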

Yes, we should not reinvent the wheel for things that can be requested
from the kernel perf subsystem. The raw-data payload of the bpf-output
event should carry only what we can't get from the perf subsystem, which
here is the timestamp of a previous event, stored in a BPF map. On a
sched-in event we check if the delta to the current time is over the
threshold and, if so, record this time in this specific "made-up on the
fly using BPF" event, which appears on the ring buffer just like any
other "native" event such as tracepoints, hardware events, cache events,
etc.
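
The flow described above can be modelled in plain userspace C. This is
a sketch under stated assumptions, not the BPF program from the patch
series: a fixed array stands in for the BPF map, the caller passes the
current time instead of reading a kernel clock, and a "true" return
marks the spot where a BPF program would emit the event to the ring
buffer. All names (offcpu_slot, on_sched_out, on_sched_in) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TASKS 64 /* arbitrary size for this model */

/* Stand-in for a BPF hash map keyed by pid. */
struct offcpu_slot {
	bool used;
	uint32_t pid;
	uint64_t sched_out_ns;
};

static struct offcpu_slot slots[MAX_TASKS];

/* Look up the slot for 'pid'; optionally claim a free one if missing. */
static struct offcpu_slot *slot_find(uint32_t pid, bool create)
{
	struct offcpu_slot *free_slot = NULL;

	for (int i = 0; i < MAX_TASKS; i++) {
		if (slots[i].used && slots[i].pid == pid)
			return &slots[i];
		if (!slots[i].used && !free_slot)
			free_slot = &slots[i];
	}
	if (create && free_slot) {
		free_slot->used = true;
		free_slot->pid = pid;
	}
	return create ? free_slot : NULL;
}

/* sched-out: remember when the task left the CPU. */
void on_sched_out(uint32_t pid, uint64_t now_ns)
{
	struct offcpu_slot *s = slot_find(pid, true);

	if (s)
		s->sched_out_ns = now_ns;
}

/*
 * sched-in: returns true and stores the off-cpu duration in '*delta_ns'
 * when it is over 'thresh_ns'.  The stored timestamp is dropped either way.
 */
bool on_sched_in(uint32_t pid, uint64_t now_ns, uint64_t thresh_ns,
		 uint64_t *delta_ns)
{
	struct offcpu_slot *s = slot_find(pid, false);
	uint64_t delta;

	assert(delta_ns);
	if (!s)
		return false; /* never saw this task go off-cpu */

	delta = now_ns - s->sched_out_ns;
	s->used = false;
	if (delta <= thresh_ns)
		return false;

	*delta_ns = delta;
	return true;
}
```

For example, on_sched_out(42, 1000) followed by on_sched_in(42, 3000,
500, &d) reports a duration of 2000 ns, while the same pair with a
sched-in at 1400 stays under the 500 ns threshold and reports nothing.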

- Arnaldo

