* [PATCH v2] perf record: Synthesize COMM event for a command line workload @ 2015-09-22 0:24 Namhyung Kim 2015-09-22 19:25 ` Arnaldo Carvalho de Melo 2015-09-23 8:46 ` [tip:perf/core] " tip-bot for Namhyung Kim 0 siblings, 2 replies; 6+ messages in thread From: Namhyung Kim @ 2015-09-22 0:24 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern When perf creates a new child to profile, the events are enabled on exec(). And in this case, it doesn't synthesize any event for the child since they'll be generated during exec(). But there's an window between the enabling and the event generation. It used to be overcome since samples are only in kernel (so we always have the map) and the comm is overridden by a later COMM event. However it won't work if events are processed and displayed before the COMM event overrides like in 'perf script'. This leads to those early samples (like native_write_msr_safe) not having a comm but pid (like ':15328'). So it needs to synthesize COMM event for the child explicitly before enabling so that it can have a correct comm. But at this time, the comm will be "perf" since it's not exec-ed yet. Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/builtin-record.c | 16 +++++++++++++++- tools/perf/util/event.c | 2 +- tools/perf/util/event.h | 5 +++++ 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 142eeb341b29..6d8302d4612f 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -636,8 +636,22 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) /* * Let the child rip */ - if (forks) + if (forks) { + union perf_event event; + + /* + * Some H/W events are generated before COMM event + * which is emitted during exec(), so perf script + * cannot see a correct process name for those events. + * Synthesize COMM event to prevent it. + */ + perf_event__synthesize_comm(tool, &event, + rec->evlist->workload.pid, + process_synthesized_event, + &session->machines.host); + perf_evlist__start_workload(rec->evlist); + } if (opts->initial_delay) { usleep(opts->initial_delay * 1000); diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 497157affc9c..6214ad47d554 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -167,7 +167,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid, return 0; } -static pid_t perf_event__synthesize_comm(struct perf_tool *tool, +pid_t perf_event__synthesize_comm(struct perf_tool *tool, union perf_event *event, pid_t pid, perf_event__handler_t process, struct machine *machine) diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h index f729df5e25e6..be5cbc7be889 100644 --- a/tools/perf/util/event.h +++ b/tools/perf/util/event.h @@ -478,6 +478,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, const struct perf_sample *sample, bool swapped); +pid_t perf_event__synthesize_comm(struct perf_tool *tool, + union perf_event *event, pid_t pid, + perf_event__handler_t process, + struct machine *machine); + int perf_event__synthesize_mmap_events(struct perf_tool *tool, union perf_event *event, pid_t pid, pid_t tgid, -- 2.5.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2] perf record: Synthesize COMM event for a command line workload 2015-09-22 0:24 [PATCH v2] perf record: Synthesize COMM event for a command line workload Namhyung Kim @ 2015-09-22 19:25 ` Arnaldo Carvalho de Melo 2015-09-22 19:48 ` Arnaldo Carvalho de Melo 2015-09-23 8:46 ` [tip:perf/core] " tip-bot for Namhyung Kim 1 sibling, 1 reply; 6+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-09-22 19:25 UTC (permalink / raw) To: Namhyung Kim; +Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern Em Tue, Sep 22, 2015 at 09:24:55AM +0900, Namhyung Kim escreveu: > When perf creates a new child to profile, the events are enabled on > exec(). And in this case, it doesn't synthesize any event for the > child since they'll be generated during exec(). But there's an window > between the enabling and the event generation. > > It used to be overcome since samples are only in kernel (so we always > have the map) and the comm is overridden by a later COMM event. > However it won't work if events are processed and displayed before the > COMM event overrides like in 'perf script'. This leads to those early > samples (like native_write_msr_safe) not having a comm but pid (like > ':15328'). > > So it needs to synthesize COMM event for the child explicitly before > enabling so that it can have a correct comm. But at this time, the > comm will be "perf" since it's not exec-ed yet. Don't we use enable-on-exec? # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # Ok, but even then, if we use --show-task-events, we see that the comm appears only later... # perf script --show-task-events :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429 usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4. usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so) usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429) While with your patch we see: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task-events perf 0 0.000000: PERF_RECORD_COMM: perf:6571/6571 perf 6571 28378.609330: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 6571 28378.609334: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 6571 28378.609335: 7 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 6571 28378.609336: 163 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 6571 28378.609337: 4795 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 6571 28378.609340: 119283 cycles: ffffffff81189bdd perf_event_comm_output (/lib/modules/4 usleep 6571 28378.609340: PERF_RECORD_COMM exec: usleep:6571/6571 usleep 6571 28378.609380: 1900804 cycles: ffffffff810a0b40 find_next_iomem_res (/lib/modules/4.3. usleep 6571 28378.609879: PERF_RECORD_EXIT(6571:6571):(6571:6571) Better indeed, I'm adding this before/after so that the changeset comment can fully illustrate what happens. But you see there is still a problem with that synthesized COMM, right? the one coming from the kernel has the 6571 there, while the synthesized one doesn't... Checking... But I'll apply your patch. - Arnaldo > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > --- > tools/perf/builtin-record.c | 16 +++++++++++++++- > tools/perf/util/event.c | 2 +- > tools/perf/util/event.h | 5 +++++ > 3 files changed, 21 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c > index 142eeb341b29..6d8302d4612f 100644 > --- a/tools/perf/builtin-record.c > +++ b/tools/perf/builtin-record.c > @@ -636,8 +636,22 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) > /* > * Let the child rip > */ > - if (forks) > + if (forks) { > + union perf_event event; > + > + /* > + * Some H/W events are generated before COMM event > + * which is emitted during exec(), so perf script > + * cannot see a correct process name for those events. > + * Synthesize COMM event to prevent it. > + */ > + perf_event__synthesize_comm(tool, &event, > + rec->evlist->workload.pid, > + process_synthesized_event, > + &session->machines.host); > + > perf_evlist__start_workload(rec->evlist); > + } > > if (opts->initial_delay) { > usleep(opts->initial_delay * 1000); > diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c > index 497157affc9c..6214ad47d554 100644 > --- a/tools/perf/util/event.c > +++ b/tools/perf/util/event.c > @@ -167,7 +167,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid, > return 0; > } > > -static pid_t perf_event__synthesize_comm(struct perf_tool *tool, > +pid_t perf_event__synthesize_comm(struct perf_tool *tool, > union perf_event *event, pid_t pid, > perf_event__handler_t process, > struct machine *machine) > diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h > index f729df5e25e6..be5cbc7be889 100644 > --- a/tools/perf/util/event.h > +++ b/tools/perf/util/event.h > @@ -478,6 +478,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, > const struct perf_sample *sample, > bool swapped); > > +pid_t perf_event__synthesize_comm(struct perf_tool *tool, > + union perf_event *event, pid_t pid, > + perf_event__handler_t process, > + struct machine *machine); > + > int perf_event__synthesize_mmap_events(struct perf_tool *tool, > union perf_event *event, > pid_t pid, pid_t tgid, > -- > 2.5.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] perf record: Synthesize COMM event for a command line workload 2015-09-22 19:25 ` Arnaldo Carvalho de Melo @ 2015-09-22 19:48 ` Arnaldo Carvalho de Melo 2015-09-23 2:49 ` Namhyung Kim 0 siblings, 1 reply; 6+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-09-22 19:48 UTC (permalink / raw) To: Namhyung Kim; +Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern Em Tue, Sep 22, 2015 at 04:25:34PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Tue, Sep 22, 2015 at 09:24:55AM +0900, Namhyung Kim escreveu: > > When perf creates a new child to profile, the events are enabled on > > exec(). And in this case, it doesn't synthesize any event for the > > child since they'll be generated during exec(). But there's an window > > between the enabling and the event generation. > > > > It used to be overcome since samples are only in kernel (so we always > > have the map) and the comm is overridden by a later COMM event. > > However it won't work if events are processed and displayed before the > > COMM event overrides like in 'perf script'. This leads to those early > > samples (like native_write_msr_safe) not having a comm but pid (like > > ':15328'). > > > > So it needs to synthesize COMM event for the child explicitly before > > enabling so that it can have a correct comm. But at this time, the > > comm will be "perf" since it's not exec-ed yet. > > Don't we use enable-on-exec? > > # perf record usleep 1 > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] > # perf evlist -v > cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: > IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, > enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: > 1, comm_exec: 1 > # > > Ok, but even then, if we use --show-task-events, we see that the comm > appears only later... > > # perf script --show-task-events > :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429 > usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4. > usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so) > usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429) > > While with your patch we see: > > # perf record usleep 1 > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] > # perf script --show-task-events > perf 0 0.000000: PERF_RECORD_COMM: perf:6571/6571 > perf 6571 28378.609330: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > perf 6571 28378.609334: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > perf 6571 28378.609335: 7 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > perf 6571 28378.609336: 163 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > perf 6571 28378.609337: 4795 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > perf 6571 28378.609340: 119283 cycles: ffffffff81189bdd perf_event_comm_output (/lib/modules/4 > usleep 6571 28378.609340: PERF_RECORD_COMM exec: usleep:6571/6571 > usleep 6571 28378.609380: 1900804 cycles: ffffffff810a0b40 find_next_iomem_res (/lib/modules/4.3. > usleep 6571 28378.609879: PERF_RECORD_EXIT(6571:6571):(6571:6571) > > Better indeed, I'm adding this before/after so that the changeset comment > can fully illustrate what happens. > > But you see there is still a problem with that synthesized COMM, right? the one > coming from the kernel has the 6571 there, while the synthesized one doesn't... > Checking... > > But I'll apply your patch. Humm, I think it is working by accident, as you're not allocating space for machine->id_hdr_size, please take a look at perf_event__synthesize_thread_map(). Right now its not a problem this line in perf_event__prepare_comm(): memset(event->comm.comm + size, 0, machine->id_hdr_size); Because perf_event is an union and some of its elements, like mmap/mmap2 have that PATH_MAX part, but its just a matter of the id_hdr_size becoming bigger than that and we'll have a problem... That zero there is probably because we're not synthesizing the PERF_SAMPLE_ part, that would be encoded in this id_hdr_size area, we're zeroing it all and then when showing it in perf script, we see zeros. Would be good to do that, will try if you don't do it first 8-) But have some other stuff to do first :-) - Arnaldo ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] perf record: Synthesize COMM event for a command line workload 2015-09-22 19:48 ` Arnaldo Carvalho de Melo @ 2015-09-23 2:49 ` Namhyung Kim [not found] ` <CA+JHD92DGR3+f4PMBbU+WgnCOQ11+JX44pLs_fOrAZ9Y3=wbtA@mail.gmail.com> 0 siblings, 1 reply; 6+ messages in thread From: Namhyung Kim @ 2015-09-23 2:49 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Ingo Molnar, Peter Zijlstra, Jiri Olsa, LKML, David Ahern Hi Arnaldo, On Tue, Sep 22, 2015 at 04:48:25PM -0300, Arnaldo Carvalho de Melo wrote: > Em Tue, Sep 22, 2015 at 04:25:34PM -0300, Arnaldo Carvalho de Melo escreveu: > > Em Tue, Sep 22, 2015 at 09:24:55AM +0900, Namhyung Kim escreveu: > > > When perf creates a new child to profile, the events are enabled on > > > exec(). And in this case, it doesn't synthesize any event for the > > > child since they'll be generated during exec(). But there's an window > > > between the enabling and the event generation. > > > > > > It used to be overcome since samples are only in kernel (so we always > > > have the map) and the comm is overridden by a later COMM event. > > > However it won't work if events are processed and displayed before the > > > COMM event overrides like in 'perf script'. This leads to those early > > > samples (like native_write_msr_safe) not having a comm but pid (like > > > ':15328'). > > > > > > So it needs to synthesize COMM event for the child explicitly before > > > enabling so that it can have a correct comm. But at this time, the > > > comm will be "perf" since it's not exec-ed yet. > > > > Don't we use enable-on-exec? > > > > # perf record usleep 1 > > [ perf record: Woken up 1 times to write data ] > > [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] > > # perf evlist -v > > cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: > > IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, > > enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: > > 1, comm_exec: 1 > > # > > > > Ok, but even then, if we use --show-task-events, we see that the comm > > appears only later... > > > > # perf script --show-task-events > > :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429 > > usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4. > > usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so) > > usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429) > > > > While with your patch we see: > > > > # perf record usleep 1 > > [ perf record: Woken up 1 times to write data ] > > [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] > > # perf script --show-task-events > > perf 0 0.000000: PERF_RECORD_COMM: perf:6571/6571 > > perf 6571 28378.609330: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > perf 6571 28378.609334: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > perf 6571 28378.609335: 7 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > perf 6571 28378.609336: 163 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > perf 6571 28378.609337: 4795 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. > > perf 6571 28378.609340: 119283 cycles: ffffffff81189bdd perf_event_comm_output (/lib/modules/4 > > usleep 6571 28378.609340: PERF_RECORD_COMM exec: usleep:6571/6571 > > usleep 6571 28378.609380: 1900804 cycles: ffffffff810a0b40 find_next_iomem_res (/lib/modules/4.3. > > usleep 6571 28378.609879: PERF_RECORD_EXIT(6571:6571):(6571:6571) > > > > Better indeed, I'm adding this before/after so that the changeset comment > > can fully illustrate what happens. OK > > > > But you see there is still a problem with that synthesized COMM, right? the one > > coming from the kernel has the 6571 there, while the synthesized one doesn't... > > Checking... > > > > But I'll apply your patch. Thanks > > Humm, I think it is working by accident, as you're not allocating space > for machine->id_hdr_size, please take a look at > perf_event__synthesize_thread_map(). > > Right now its not a problem this line in perf_event__prepare_comm(): > > memset(event->comm.comm + size, 0, machine->id_hdr_size); > > Because perf_event is an union and some of its elements, like mmap/mmap2 > have that PATH_MAX part, but its just a matter of the id_hdr_size > becoming bigger than that and we'll have a problem... Right. I'll send a fix to include the id_hdr part. > > That zero there is probably because we're not synthesizing the > PERF_SAMPLE_ part, that would be encoded in this id_hdr_size area, we're > zeroing it all and then when showing it in perf script, we see zeros. > > Would be good to do that, will try if you don't do it first 8-) So do you want to generate the PERF_SAMPLE_ part for all synthesized events? > > But have some other stuff to do first :-) :) Thanks, Namhyung ^ permalink raw reply [flat|nested] 6+ messages in thread
[parent not found: <CA+JHD92DGR3+f4PMBbU+WgnCOQ11+JX44pLs_fOrAZ9Y3=wbtA@mail.gmail.com>]
* Re: [PATCH v2] perf record: Synthesize COMM event for a command line workload [not found] ` <CA+JHD92DGR3+f4PMBbU+WgnCOQ11+JX44pLs_fOrAZ9Y3=wbtA@mail.gmail.com> @ 2015-09-23 4:52 ` Namhyung Kim 0 siblings, 0 replies; 6+ messages in thread From: Namhyung Kim @ 2015-09-23 4:52 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Jiri Olsa, Peter Zijlstra, Ingo Molnar, David Ahern, LKML, Arnaldo Carvalho de Melo On Wed, Sep 23, 2015 at 12:09:20AM -0300, Arnaldo Carvalho de Melo wrote: > Le 23 sept. 2015 12:04 AM, "Namhyung Kim" <namhyung@kernel.org> a écrit : > > > > Hi Arnaldo, > > > > On Tue, Sep 22, 2015 at 04:48:25PM -0300, Arnaldo Carvalho de Melo wrote: > > > Humm, I think it is working by accident, as you're not allocating space > > > for machine->id_hdr_size, please take a look at > > > perf_event__synthesize_thread_map(). > > > > > > Right now its not a problem this line in perf_event__prepare_comm(): > > > > > > memset(event->comm.comm + size, 0, machine->id_hdr_size); > > > > > > Because perf_event is an union and some of its elements, like mmap/mmap2 > > > have that PATH_MAX part, but its just a matter of the id_hdr_size > > > becoming bigger than that and we'll have a problem... > > > > Right. I'll send a fix to include the id_hdr part. > > Thanks! >From 10123021c4e55f14d3bb8e6cc576694bb76d6699 Mon Sep 17 00:00:00 2001 From: Namhyung Kim <namhyung@kernel.org> Date: Wed, 23 Sep 2015 11:51:40 +0900 Subject: [PATCH] perf record: Allocate area for sample_id_hdr in a synthesized comm event A previous patch added a synthesized comm event for forked child process but it missed that the event should contain area for sample_id_hdr at the end. It worked by accident since the perf_event union contains bigger event structs like mmap_events. This patch fixes it by dynamically allocating event struct including those area like in perf_event__synthesize_thread_map(). Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/builtin-record.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 6d8302d4612f..5e01c070dbf2 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -637,7 +637,13 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) * Let the child rip */ if (forks) { - union perf_event event; + union perf_event *event; + + event = malloc(sizeof(event->comm) + machine->id_hdr_size); + if (event == NULL) { + err = -ENOMEM; + goto out_child; + } /* * Some H/W events are generated before COMM event @@ -645,10 +651,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) * cannot see a correct process name for those events. * Synthesize COMM event to prevent it. */ - perf_event__synthesize_comm(tool, &event, + perf_event__synthesize_comm(tool, event, rec->evlist->workload.pid, process_synthesized_event, - &session->machines.host); + machine); + free(event); perf_evlist__start_workload(rec->evlist); } -- 2.5.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [tip:perf/core] perf record: Synthesize COMM event for a command line workload 2015-09-22 0:24 [PATCH v2] perf record: Synthesize COMM event for a command line workload Namhyung Kim 2015-09-22 19:25 ` Arnaldo Carvalho de Melo @ 2015-09-23 8:46 ` tip-bot for Namhyung Kim 1 sibling, 0 replies; 6+ messages in thread From: tip-bot for Namhyung Kim @ 2015-09-23 8:46 UTC (permalink / raw) To: linux-tip-commits Cc: namhyung, jolsa, dsahern, a.p.zijlstra, mingo, tglx, acme, hpa, linux-kernel Commit-ID: e803cf97a4f90d31bcc2c9a1ea20fe9cdc12b2f9 Gitweb: http://git.kernel.org/tip/e803cf97a4f90d31bcc2c9a1ea20fe9cdc12b2f9 Author: Namhyung Kim <namhyung@kernel.org> AuthorDate: Tue, 22 Sep 2015 09:24:55 +0900 Committer: Arnaldo Carvalho de Melo <acme@redhat.com> CommitDate: Tue, 22 Sep 2015 22:43:12 -0300 perf record: Synthesize COMM event for a command line workload When perf creates a new child to profile, the events are enabled on exec(). And in this case, it doesn't synthesize any event for the child since they'll be generated during exec(). But there's an window between the enabling and the event generation. It used to be overcome since samples are only in kernel (so we always have the map) and the comm is overridden by a later COMM event. However it won't work if events are processed and displayed before the COMM event overrides like in 'perf script'. This leads to those early samples (like native_write_msr_safe) not having a comm but pid (like ':15328'). So it needs to synthesize COMM event for the child explicitly before enabling so that it can have a correct comm. But at this time, the comm will be "perf" since it's not exec-ed yet. Committer note: Before this patch: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task-events :4429 4429 27909.079372: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079375: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079376: 10 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079377: 223 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. :4429 4429 27909.079378: 6571 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 4429 27909.079380: PERF_RECORD_COMM exec: usleep:4429/4429 usleep 4429 27909.079381: 185403 cycles: ffffffff810a72d3 flush_signal_handlers (/lib/modules/4. usleep 4429 27909.079444: 2241110 cycles: 7fc575355be3 _dl_start (/usr/lib64/ld-2.20.so) usleep 4429 27909.079875: PERF_RECORD_EXIT(4429:4429):(4429:4429) After: # perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # perf script --show-task perf 0 0.000000: PERF_RECORD_COMM: perf:8446/8446 perf 8446 30154.038944: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038948: 1 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038949: 9 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038950: 230 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. perf 8446 30154.038951: 6772 cycles: ffffffff8105f45a native_write_msr_safe (/lib/modules/4. usleep 8446 30154.038952: PERF_RECORD_COMM exec: usleep:8446/8446 usleep 8446 30154.038954: 196923 cycles: ffffffff81766440 _raw_spin_lock (/lib/modules/4.3.0-rc1 usleep 8446 30154.039021: 2292130 cycles: 7f609a173dc4 memcpy (/usr/lib64/ld-2.20.so) usleep 8446 30154.039349: PERF_RECORD_EXIT(8446:8446):(8446:8446) # Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1442881495-2928-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/builtin-record.c | 15 ++++++++++++++- tools/perf/util/event.c | 2 +- tools/perf/util/event.h | 5 +++++ 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 142eeb3..a01c8ae 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -636,8 +636,21 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) /* * Let the child rip */ - if (forks) + if (forks) { + union perf_event event; + /* + * Some H/W events are generated before COMM event + * which is emitted during exec(), so perf script + * cannot see a correct process name for those events. + * Synthesize COMM event to prevent it. + */ + perf_event__synthesize_comm(tool, &event, + rec->evlist->workload.pid, + process_synthesized_event, + machine); + perf_evlist__start_workload(rec->evlist); + } if (opts->initial_delay) { usleep(opts->initial_delay * 1000); diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 497157a..6214ad4 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -167,7 +167,7 @@ static int perf_event__prepare_comm(union perf_event *event, pid_t pid, return 0; } -static pid_t perf_event__synthesize_comm(struct perf_tool *tool, +pid_t perf_event__synthesize_comm(struct perf_tool *tool, union perf_event *event, pid_t pid, perf_event__handler_t process, struct machine *machine) diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h index f729df5..be5cbc7 100644 --- a/tools/perf/util/event.h +++ b/tools/perf/util/event.h @@ -478,6 +478,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type, const struct perf_sample *sample, bool swapped); +pid_t perf_event__synthesize_comm(struct perf_tool *tool, + union perf_event *event, pid_t pid, + perf_event__handler_t process, + struct machine *machine); + int perf_event__synthesize_mmap_events(struct perf_tool *tool, union perf_event *event, pid_t pid, pid_t tgid, ^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2015-09-23 8:46 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-22 0:24 [PATCH v2] perf record: Synthesize COMM event for a command line workload Namhyung Kim
2015-09-22 19:25 ` Arnaldo Carvalho de Melo
2015-09-22 19:48 ` Arnaldo Carvalho de Melo
2015-09-23 2:49 ` Namhyung Kim
[not found] ` <CA+JHD92DGR3+f4PMBbU+WgnCOQ11+JX44pLs_fOrAZ9Y3=wbtA@mail.gmail.com>
2015-09-23 4:52 ` Namhyung Kim
2015-09-23 8:46 ` [tip:perf/core] " tip-bot for Namhyung Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox