All of lore.kernel.org
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Jiri Olsa <jolsa@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	David Ahern <dsahern@gmail.com>
Subject: Re: [PATCH 2/2] perf record: Synthesize COMM event for a command line workload
Date: Tue, 22 Sep 2015 09:04:52 +0900	[thread overview]
Message-ID: <20150922000452.GA5902@sejong> (raw)
In-Reply-To: <20150921193926.GA4186@kernel.org>

Hi Arnaldo,

On Mon, Sep 21, 2015 at 04:39:26PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Sep 21, 2015 at 09:26:49AM +0900, Namhyung Kim escreveu:
> > When perf creates a new child to profile, the events are enabled on
> > exec().  And in this case, it doesn't synthesize any event for the
> > child since they'll be generated during exec().  But there's an window
> > between the enabling and the event generation.
> > 
> > It used to be overcome since samples are only in kernel (so we always
> > have the map) and the comm is overridden by a later COMM event.
> > However it won't work if events are processed and displayed before the
> > COMM event overrides like in 'perf script'.  This leads to those early
> > samples (like native_write_msr_safe) not having a comm but pid (like
> > ':15328').
> > 
> > So it needs to synthesize COMM event for the child explicitly before
> > enabling so that it can have a correct comm.  But at this time, the
> > comm will be "perf" since it's not exec-ed yet.
> > 
> > Acked-by: Jiri Olsa <jolsa@redhat.com>
> > Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> > ---
> >  tools/perf/builtin-record.c | 41 ++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 40 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index 142eeb341b29..b83373adb9f8 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -469,6 +469,43 @@ static void workload_exec_failed_signal(int signo __maybe_unused,
> >  	child_finished = 1;
> >  }
> >  
> > +static int synthesize_workload_comm_event(struct perf_evlist *evlist, void *arg)
> > +{
> > +	union perf_event *event;
> > +	struct record *rec = arg;
> > +	struct machine *machine = &rec->session->machines.host;
> > +	int pid = evlist->workload.pid;
> > +	const char *comm_str = program_invocation_short_name;
> > +	size_t comm_size, total_size;
> > +	int ret;
> > +
> > +	comm_size = PERF_ALIGN(strlen(comm_str) + 1, sizeof(u64));
> > +	total_size = sizeof(event->comm) + machine->id_hdr_size;
> > +	/*
> > +	 * (aligned) comm size might be smaller than expected size
> > +	 * (i.e.  size of event->comm.comm[]), in that case it needs
> > +	 * to shrink the total size.
> > +	 */
> > +	if (comm_size < sizeof(event->comm.comm))
> > +		total_size -= sizeof(event->comm.comm) - comm_size;
> > +
> > +	event = zalloc(total_size);
> > +	if (event == NULL)
> > +		return -ENOMEM;
> > +
> > +	event->comm.header.type = PERF_RECORD_COMM;
> > +	event->comm.header.size = total_size;
> > +
> > +	event->comm.pid = pid;
> > +	event->comm.tid = pid;
> > +	strncpy(event->comm.comm, comm_str, comm_size);
> > +
> > +	ret = record__write(rec, event, total_size);
> > +
> > +	free(event);
> > +	return ret;
> > +}
> > +
> >  static void snapshot_sig_handler(int sig);
> >  
> >  static int __cmd_record(struct record *rec, int argc, const char **argv)
> > @@ -637,7 +674,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
> >  	 * Let the child rip
> >  	 */
> >  	if (forks)
> > -		perf_evlist__start_workload(rec->evlist);
> > +		perf_evlist__start_workload_ex(rec->evlist,
> > +					       synthesize_workload_comm_event,
> > +					       rec);
> 
> Why not call it directly? I.e.:
> 
> 	if (forks) {
> 		err = synthesize_workload_comm_event(evlist, rec);
> 		if (!err)
> 			err = perf_evlist__start_workload(rec->evlist);
> 	}
> 
> Since, from what I saw, the very first thing that
> perf_evlist__start_workload_ex() does is to call the callback?

I originally thought that it'd be used by other commands too.  I
checked that 'perf trace' has similar code so I generalized it with
callbacks.  But then I realized the perf trace generates events only
after exec() so I dropped the patch for it.


> 
> Also, don't we have already a synthesize_comm routine? I.e. can't
> perf_event__prepare_comm() be used here?

Ok, it'd be better exporting perf_event__synthesize_comm() for
consistency then.  Will send v2 soon.

Thanks,
Namhyung

      reply	other threads:[~2015-09-22  0:19 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-21  0:26 [PATCH 1/2] perf tools: Introduce perf_evlist__start_workload_ex() Namhyung Kim
2015-09-21  0:26 ` [PATCH 2/2] perf record: Synthesize COMM event for a command line workload Namhyung Kim
2015-09-21 19:39   ` Arnaldo Carvalho de Melo
2015-09-22  0:04     ` Namhyung Kim [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150922000452.GA5902@sejong \
    --to=namhyung@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@kernel.org \
    --cc=dsahern@gmail.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.