* [PATCH v2] perf record: Synthesize COMM event for a command line workload
@ 2015-09-22 0:24 Namhyung Kim
2015-09-22 19:25 ` Arnaldo Carvalho de Melo
2015-09-23 8:46 ` [tip:perf/core] " tip-bot for Namhyung Kim
0 siblings, 2 replies; 6+ messages in thread
From: Namhyung Kim @ 2015-09-22 0:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern
When perf creates a new child to profile, the events are enabled on
exec(). And in this case, it doesn't synthesize any event for the
child since they'll be generated during exec(). But there's an window
between the enabling and the event generation.
It used to be overcome since samples are only in kernel (so we always
have the map) and the comm is overridden by a later COMM event.
However it won't work if events are processed and displayed before the
COMM event overrides like in 'perf script'. This leads to those early
samples (like native_write_msr_safe) not having a comm but pid (like
':15328').
So it needs to synthesize COMM event for the child explicitly before
enabling so that it can have a correct comm. But at this time, the
comm will be "perf" since it's not exec-ed yet.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-record.c | 16 +++++++++++++++-
tools/perf/util/event.c | 2 +-
tools/perf/util/event.h | 5 +++++
3 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 142eeb341b29..6d8302d4612f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -636,8 +636,22 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
/*
* Let the child rip
*/
- if (forks)
+ if (forks) {
+ union perf_event event;
+
+ /*
+ * Some H/W events are generated before COMM event
+ * which is emitted during exec(), so perf script
+ * cannot see a correct process name for those events.
+ * Synthesize COMM event to prevent it.
+ */
+ perf_event__synthesize_comm(tool, &event,
+ rec->evlist->workload.pid,
+ process_synthesized_event,
+ &session->machines.host);
+
perf_evlist__start_workload(rec->evlist);
+ }
if (opts->initial_delay) {
usleep(opts->initial_delay * 1000);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 497157affc9c..6214ad47d554 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -167,7 +167,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
return 0;
}
-static pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
union perf_event *event, pid_t pid,
perf_event__handler_t process,
struct machine *machine)
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index f729df5e25e6..be5cbc7be889 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -478,6 +478,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
const struct perf_sample *sample,
bool swapped);
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+ union perf_event *event, pid_t pid,
+ perf_event__handler_t process,
+ struct machine *machine);
+
int perf_event__synthesize_mmap_events(struct perf_tool *tool,
union perf_event *event,
pid_t pid, pid_t tgid,
--
2.5.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2] perf record: Synthesize COMM event for a command line workload
2015-09-22 0:24 [PATCH v2] perf record: Synthesize COMM event for a command line workload Namhyung Kim
@ 2015-09-22 19:25 ` Arnaldo Carvalho de Melo
2015-09-22 19:48 ` Arnaldo Carvalho de Melo
2015-09-23 8:46 ` [tip:perf/core] " tip-bot for Namhyung Kim
1 sibling, 1 reply; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-22 19:25 UTC (permalink / raw)
To: Namhyung Kim; +Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern
Em Tue, Sep 22, 2015 at 09:24:55AM +0900, Namhyung Kim escreveu:
> When perf creates a new child to profile, the events are enabled on
> exec(). And in this case, it doesn't synthesize any event for the
> child since they'll be generated during exec(). But there's an window
> between the enabling and the event generation.
>
> It used to be overcome since samples are only in kernel (so we always
> have the map) and the comm is overridden by a later COMM event.
> However it won't work if events are processed and displayed before the
> COMM event overrides like in 'perf script'. This leads to those early
> samples (like native_write_msr_safe) not having a comm but pid (like
> ':15328').
>
> So it needs to synthesize COMM event for the child explicitly before
> enabling so that it can have a correct comm. But at this time, the
> comm will be "perf" since it's not exec-ed yet.
Don't we use enable-on-exec?
# perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2:
1, comm_exec: 1
#
Ok, but even then, if we use --show-task-events, we see that the comm
appears only later...
# perf script --show-task-events
:4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429
usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4.
usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so)
usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429)
While with your patch we see:
# perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
# perf script --show-task-events
perf 0 0.000000: PERF_RECORD_COMM: perf:6571/6571
perf 6571 28378.609330: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 6571 28378.609334: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 6571 28378.609335: 7 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 6571 28378.609336: 163 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 6571 28378.609337: 4795 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 6571 28378.609340: 119283 cycles: ffffffff81189bdd perf_event_comm_output (/lib/modules/4
usleep 6571 28378.609340: PERF_RECORD_COMM exec: usleep:6571/6571
usleep 6571 28378.609380: 1900804 cycles: ffffffff810a0b40 find_next_iomem_res (/lib/modules/4.3.
usleep 6571 28378.609879: PERF_RECORD_EXIT(6571:6571):(6571:6571)
Better indeed, I'm adding this before/after so that the changeset comment
can fully illustrate what happens.
But you see there is still a problem with that synthesized COMM, right? the one
coming from the kernel has the 6571 there, while the synthesized one doesn't...
Checking...
But I'll apply your patch.
- Arnaldo
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
> tools/perf/builtin-record.c | 16 +++++++++++++++-
> tools/perf/util/event.c | 2 +-
> tools/perf/util/event.h | 5 +++++
> 3 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 142eeb341b29..6d8302d4612f 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -636,8 +636,22 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> /*
> * Let the child rip
> */
> - if (forks)
> + if (forks) {
> + union perf_event event;
> +
> + /*
> + * Some H/W events are generated before COMM event
> + * which is emitted during exec(), so perf script
> + * cannot see a correct process name for those events.
> + * Synthesize COMM event to prevent it.
> + */
> + perf_event__synthesize_comm(tool, &event,
> + rec->evlist->workload.pid,
> + process_synthesized_event,
> + &session->machines.host);
> +
> perf_evlist__start_workload(rec->evlist);
> + }
>
> if (opts->initial_delay) {
> usleep(opts->initial_delay * 1000);
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 497157affc9c..6214ad47d554 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -167,7 +167,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
> return 0;
> }
>
> -static pid_t perf_event__synthesize_comm(struct perf_tool *tool,
> +pid_t perf_event__synthesize_comm(struct perf_tool *tool,
> union perf_event *event, pid_t pid,
> perf_event__handler_t process,
> struct machine *machine)
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index f729df5e25e6..be5cbc7be889 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -478,6 +478,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
> const struct perf_sample *sample,
> bool swapped);
>
> +pid_t perf_event__synthesize_comm(struct perf_tool *tool,
> + union perf_event *event, pid_t pid,
> + perf_event__handler_t process,
> + struct machine *machine);
> +
> int perf_event__synthesize_mmap_events(struct perf_tool *tool,
> union perf_event *event,
> pid_t pid, pid_t tgid,
> --
> 2.5.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] perf record: Synthesize COMM event for a command line workload
2015-09-22 19:25 ` Arnaldo Carvalho de Melo
@ 2015-09-22 19:48 ` Arnaldo Carvalho de Melo
2015-09-23 2:49 ` Namhyung Kim
0 siblings, 1 reply; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-09-22 19:48 UTC (permalink / raw)
To: Namhyung Kim; +Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern
Em Tue, Sep 22, 2015 at 04:25:34PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Sep 22, 2015 at 09:24:55AM +0900, Namhyung Kim escreveu:
> > When perf creates a new child to profile, the events are enabled on
> > exec(). And in this case, it doesn't synthesize any event for the
> > child since they'll be generated during exec(). But there's an window
> > between the enabling and the event generation.
> >
> > It used to be overcome since samples are only in kernel (so we always
> > have the map) and the comm is overridden by a later COMM event.
> > However it won't work if events are processed and displayed before the
> > COMM event overrides like in 'perf script'. This leads to those early
> > samples (like native_write_msr_safe) not having a comm but pid (like
> > ':15328').
> >
> > So it needs to synthesize COMM event for the child explicitly before
> > enabling so that it can have a correct comm. But at this time, the
> > comm will be "perf" since it's not exec-ed yet.
>
> Don't we use enable-on-exec?
>
> # perf record usleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
> # perf evlist -v
> cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
> IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
> enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2:
> 1, comm_exec: 1
> #
>
> Ok, but even then, if we use --show-task-events, we see that the comm
> appears only later...
>
> # perf script --show-task-events
> :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429
> usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4.
> usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so)
> usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429)
>
> While with your patch we see:
>
> # perf record usleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
> # perf script --show-task-events
> perf 0 0.000000: PERF_RECORD_COMM: perf:6571/6571
> perf 6571 28378.609330: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> perf 6571 28378.609334: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> perf 6571 28378.609335: 7 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> perf 6571 28378.609336: 163 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> perf 6571 28378.609337: 4795 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> perf 6571 28378.609340: 119283 cycles: ffffffff81189bdd perf_event_comm_output (/lib/modules/4
> usleep 6571 28378.609340: PERF_RECORD_COMM exec: usleep:6571/6571
> usleep 6571 28378.609380: 1900804 cycles: ffffffff810a0b40 find_next_iomem_res (/lib/modules/4.3.
> usleep 6571 28378.609879: PERF_RECORD_EXIT(6571:6571):(6571:6571)
>
> Better indeed, I'm adding this before/after so that the changeset comment
> can fully illustrate what happens.
>
> But you see there is still a problem with that synthesized COMM, right? the one
> coming from the kernel has the 6571 there, while the synthesized one doesn't...
> Checking...
>
> But I'll apply your patch.
Humm, I think it is working by accident, as you're not allocating space
for machine->id_hdr_size, please take a look at
perf_event__synthesize_thread_map().
Right now its not a problem this line in perf_event__prepare_comm():
memset(event->comm.comm + size, 0, machine->id_hdr_size);
Because perf_event is an union and some of its elements, like mmap/mmap2
have that PATH_MAX part, but its just a matter of the id_hdr_size
becoming bigger than that and we'll have a problem...
That zero there is probably because we're not synthesizing the
PERF_SAMPLE_ part, that would be encoded in this id_hdr_size area, we're
zeroing it all and then when showing it in perf script, we see zeros.
Would be good to do that, will try if you don't do it first 8-)
But have some other stuff to do first :-)
- Arnaldo
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] perf record: Synthesize COMM event for a command line workload
2015-09-22 19:48 ` Arnaldo Carvalho de Melo
@ 2015-09-23 2:49 ` Namhyung Kim
[not found] ` <CA+JHD92DGR3+f4PMBbU+WgnCOQ11+JX44pLs_fOrAZ9Y3=wbtA@mail.gmail.com>
0 siblings, 1 reply; 6+ messages in thread
From: Namhyung Kim @ 2015-09-23 2:49 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern
Hi Arnaldo,
On Tue, Sep 22, 2015 at 04:48:25PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Sep 22, 2015 at 04:25:34PM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Tue, Sep 22, 2015 at 09:24:55AM +0900, Namhyung Kim escreveu:
> > > When perf creates a new child to profile, the events are enabled on
> > > exec(). And in this case, it doesn't synthesize any event for the
> > > child since they'll be generated during exec(). But there's an window
> > > between the enabling and the event generation.
> > >
> > > It used to be overcome since samples are only in kernel (so we always
> > > have the map) and the comm is overridden by a later COMM event.
> > > However it won't work if events are processed and displayed before the
> > > COMM event overrides like in 'perf script'. This leads to those early
> > > samples (like native_write_msr_safe) not having a comm but pid (like
> > > ':15328').
> > >
> > > So it needs to synthesize COMM event for the child explicitly before
> > > enabling so that it can have a correct comm. But at this time, the
> > > comm will be "perf" since it's not exec-ed yet.
> >
> > Don't we use enable-on-exec?
> >
> > # perf record usleep 1
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
> > # perf evlist -v
> > cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
> > IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
> > enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2:
> > 1, comm_exec: 1
> > #
> >
> > Ok, but even then, if we use --show-task-events, we see that the comm
> > appears only later...
> >
> > # perf script --show-task-events
> > :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429
> > usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4.
> > usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so)
> > usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429)
> >
> > While with your patch we see:
> >
> > # perf record usleep 1
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
> > # perf script --show-task-events
> > perf 0 0.000000: PERF_RECORD_COMM: perf:6571/6571
> > perf 6571 28378.609330: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > perf 6571 28378.609334: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > perf 6571 28378.609335: 7 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > perf 6571 28378.609336: 163 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > perf 6571 28378.609337: 4795 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
> > perf 6571 28378.609340: 119283 cycles: ffffffff81189bdd perf_event_comm_output (/lib/modules/4
> > usleep 6571 28378.609340: PERF_RECORD_COMM exec: usleep:6571/6571
> > usleep 6571 28378.609380: 1900804 cycles: ffffffff810a0b40 find_next_iomem_res (/lib/modules/4.3.
> > usleep 6571 28378.609879: PERF_RECORD_EXIT(6571:6571):(6571:6571)
> >
> > Better indeed, I'm adding this before/after so that the changeset comment
> > can fully illustrate what happens.
OK
> >
> > But you see there is still a problem with that synthesized COMM, right? the one
> > coming from the kernel has the 6571 there, while the synthesized one doesn't...
> > Checking...
> >
> > But I'll apply your patch.
Thanks
>
> Humm, I think it is working by accident, as you're not allocating space
> for machine->id_hdr_size, please take a look at
> perf_event__synthesize_thread_map().
>
> Right now its not a problem this line in perf_event__prepare_comm():
>
> memset(event->comm.comm + size, 0, machine->id_hdr_size);
>
> Because perf_event is an union and some of its elements, like mmap/mmap2
> have that PATH_MAX part, but its just a matter of the id_hdr_size
> becoming bigger than that and we'll have a problem...
Right. I'll send a fix to include the id_hdr part.
>
> That zero there is probably because we're not synthesizing the
> PERF_SAMPLE_ part, that would be encoded in this id_hdr_size area, we're
> zeroing it all and then when showing it in perf script, we see zeros.
>
> Would be good to do that, will try if you don't do it first 8-)
So do you want to generate the PERF_SAMPLE_ part for all synthesized events?
>
> But have some other stuff to do first :-)
:)
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] perf record: Synthesize COMM event for a command line workload
[not found] ` <CA+JHD92DGR3+f4PMBbU+WgnCOQ11+JX44pLs_fOrAZ9Y3=wbtA@mail.gmail.com>
@ 2015-09-23 4:52 ` Namhyung Kim
0 siblings, 0 replies; 6+ messages in thread
From: Namhyung Kim @ 2015-09-23 4:52 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Peter Zijlstra, Ingo Molnar, David Ahern, LKML,
Arnaldo Carvalho de Melo
On Wed, Sep 23, 2015 at 12:09:20AM -0300, Arnaldo Carvalho de Melo wrote:
> Le 23 sept. 2015 12:04 AM, "Namhyung Kim" <namhyung@kernel.org> a écrit :
> >
> > Hi Arnaldo,
> >
> > On Tue, Sep 22, 2015 at 04:48:25PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Humm, I think it is working by accident, as you're not allocating space
> > > for machine->id_hdr_size, please take a look at
> > > perf_event__synthesize_thread_map().
> > >
> > > Right now its not a problem this line in perf_event__prepare_comm():
> > >
> > > memset(event->comm.comm + size, 0, machine->id_hdr_size);
> > >
> > > Because perf_event is an union and some of its elements, like mmap/mmap2
> > > have that PATH_MAX part, but its just a matter of the id_hdr_size
> > > becoming bigger than that and we'll have a problem...
> >
> > Right. I'll send a fix to include the id_hdr part.
>
> Thanks!
>From 10123021c4e55f14d3bb8e6cc576694bb76d6699 Mon Sep 17 00:00:00 2001
From: Namhyung Kim <namhyung@kernel.org>
Date: Wed, 23 Sep 2015 11:51:40 +0900
Subject: [PATCH] perf record: Allocate area for sample_id_hdr in a synthesized
comm event
A previous patch added a synthesized comm event for forked child
process but it missed that the event should contain area for
sample_id_hdr at the end. It worked by accident since the perf_event
union contains bigger event structs like mmap_events.
This patch fixes it by dynamically allocating event struct including
those area like in perf_event__synthesize_thread_map().
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/builtin-record.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6d8302d4612f..5e01c070dbf2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -637,7 +637,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
* Let the child rip
*/
if (forks) {
- union perf_event event;
+ union perf_event *event;
+
+ event = malloc(sizeof(event->comm) + machine->id_hdr_size);
+ if (event == NULL) {
+ err = -ENOMEM;
+ goto out_child;
+ }
/*
* Some H/W events are generated before COMM event
@@ -645,10 +651,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
* cannot see a correct process name for those events.
* Synthesize COMM event to prevent it.
*/
- perf_event__synthesize_comm(tool, &event,
+ perf_event__synthesize_comm(tool, event,
rec->evlist->workload.pid,
process_synthesized_event,
- &session->machines.host);
+ machine);
+ free(event);
perf_evlist__start_workload(rec->evlist);
}
--
2.5.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [tip:perf/core] perf record: Synthesize COMM event for a command line workload
2015-09-22 0:24 [PATCH v2] perf record: Synthesize COMM event for a command line workload Namhyung Kim
2015-09-22 19:25 ` Arnaldo Carvalho de Melo
@ 2015-09-23 8:46 ` tip-bot for Namhyung Kim
1 sibling, 0 replies; 6+ messages in thread
From: tip-bot for Namhyung Kim @ 2015-09-23 8:46 UTC (permalink / raw)
To: linux-tip-commits
Cc: namhyung, jolsa, dsahern, a.p.zijlstra, mingo, tglx, acme, hpa,
linux-kernel
Commit-ID: e803cf97a4f90d31bcc2c9a1ea20fe9cdc12b2f9
Gitweb: http://git.kernel.org/tip/e803cf97a4f90d31bcc2c9a1ea20fe9cdc12b2f9
Author: Namhyung Kim <namhyung@kernel.org>
AuthorDate: Tue, 22 Sep 2015 09:24:55 +0900
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 22 Sep 2015 22:43:12 -0300
perf record: Synthesize COMM event for a command line workload
When perf creates a new child to profile, the events are enabled on
exec(). And in this case, it doesn't synthesize any event for the
child since they'll be generated during exec(). But there's an window
between the enabling and the event generation.
It used to be overcome since samples are only in kernel (so we always
have the map) and the comm is overridden by a later COMM event.
However it won't work if events are processed and displayed before the
COMM event overrides like in 'perf script'. This leads to those early
samples (like native_write_msr_safe) not having a comm but pid (like
':15328').
So it needs to synthesize COMM event for the child explicitly before
enabling so that it can have a correct comm. But at this time, the
comm will be "perf" since it's not exec-ed yet.
Committer note:
Before this patch:
# perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
# perf script --show-task-events
:4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
:4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429
usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4.
usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so)
usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429)
After:
# perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
# perf script --show-task
perf 0 0.000000: PERF_RECORD_COMM: perf:8446/8446
perf 8446 30154.038944: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 8446 30154.038948: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 8446 30154.038949: 9 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 8446 30154.038950: 230 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
perf 8446 30154.038951: 6772 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4.
usleep 8446 30154.038952: PERF_RECORD_COMM exec: usleep:8446/8446
usleep 8446 30154.038954: 196923 cycles: ffffffff81766440 _raw_spin_lock (/lib/modules/4.3.0-rc1
usleep 8446 30154.039021: 2292130 cycles: 7f609a173dc4 memcpy (/usr/lib64/ld-2.20.so)
usleep 8446 30154.039349: PERF_RECORD_EXIT(8446:8446):(8446:8446)
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1442881495-2928-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-record.c | 15 ++++++++++++++-
tools/perf/util/event.c | 2 +-
tools/perf/util/event.h | 5 +++++
3 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 142eeb3..a01c8ae 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -636,8 +636,21 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
/*
* Let the child rip
*/
- if (forks)
+ if (forks) {
+ union perf_event event;
+ /*
+ * Some H/W events are generated before COMM event
+ * which is emitted during exec(), so perf script
+ * cannot see a correct process name for those events.
+ * Synthesize COMM event to prevent it.
+ */
+ perf_event__synthesize_comm(tool, &event,
+ rec->evlist->workload.pid,
+ process_synthesized_event,
+ machine);
+
perf_evlist__start_workload(rec->evlist);
+ }
if (opts->initial_delay) {
usleep(opts->initial_delay * 1000);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 497157a..6214ad4 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -167,7 +167,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid,
return 0;
}
-static pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
union perf_event *event, pid_t pid,
perf_event__handler_t process,
struct machine *machine)
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index f729df5..be5cbc7 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -478,6 +478,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
const struct perf_sample *sample,
bool swapped);
+pid_t perf_event__synthesize_comm(struct perf_tool *tool,
+ union perf_event *event, pid_t pid,
+ perf_event__handler_t process,
+ struct machine *machine);
+
int perf_event__synthesize_mmap_events(struct perf_tool *tool,
union perf_event *event,
pid_t pid, pid_t tgid,
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2015-09-23 8:46 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-22 0:24 [PATCH v2] perf record: Synthesize COMM event for a command line workload Namhyung Kim
2015-09-22 19:25 ` Arnaldo Carvalho de Melo
2015-09-22 19:48 ` Arnaldo Carvalho de Melo
2015-09-23 2:49 ` Namhyung Kim
[not found] ` <CA+JHD92DGR3+f4PMBbU+WgnCOQ11+JX44pLs_fOrAZ9Y3=wbtA@mail.gmail.com>
2015-09-23 4:52 ` Namhyung Kim
2015-09-23 8:46 ` [tip:perf/core] " tip-bot for Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox