public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stanislav Fomichev <stfomichev@yandex-team.ru>,
	Chia-I Wu <olvaffe@gmail.com>,
	a.p.zijlstra@chello.nl, paulus@samba.org, mingo@redhat.com,
	linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] perf timechart: dynamically determine event data offset
Date: Wed, 27 Nov 2013 23:17:43 +0900	[thread overview]
Message-ID: <1385561863.1685.30.camel@leonhard> (raw)
In-Reply-To: <20131127134434.GA11498@ghostprotocols.net>

Hi Arnaldo,

2013-11-27 (수), 10:44 -0300, Arnaldo Carvalho de Melo:
> Em Wed, Nov 27, 2013 at 05:49:02PM +0900, Namhyung Kim escreveu:
> > On Tue, 26 Nov 2013 18:54:37 +0400, Stanislav Fomichev wrote:
> > > Since b000c8065a92 "tracing: Remove the extra 4 bytes of padding in events"
> > > removed padding bytes, perf timechart got out of sync with the kernel's
> > > trace_entry structure.
> 
> > > We can't just align perf's trace_entry definition with the kernel because we
> > > want timechart to continue working with old perf.data. Instead, we now
> > > calculate event data offset dynamically using offset of first non-common
> > > event field in the perf.data.
>  
> > [SNIP]
> > > @@ -304,13 +322,11 @@ struct trace_entry {
> > >  	unsigned char		flags;
> > >  	unsigned char		preempt_count;
> > >  	int			pid;
> > > -	int			lock_depth;
> > >  };
>  
> > I had no chance to look into the timechart code in detail, but this is
> > not good.  The format of each trace event (so the struct trace_entry)
> > was described in the format file under the event directory on debugfs.
> > For cpu_frequency event, I get the following:
>  
> >   $ cat /sys/kernel/debug/tracing/events/power/cpu_frequency/format 
> >   name: cpu_frequency
> >   ID: 315
> >   format:
> > 	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
> > 	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
> > 	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
> > 	field:int common_pid;	offset:4;	size:4;	signed:1;
>  
> > 	field:u32 state;	offset:8;	size:4;	signed:0;
> > 	field:u32 cpu_id;	offset:12;	size:4;	signed:0;
>  
> >   print fmt: "state=%lu cpu_id=%lu", (unsigned long)REC->state, (unsigned long)REC->cpu_id
>  
> > So it's not same as above struct trace_entry even with your change.
> 
> It is... Albeit working just with patches is good and dandy and I even
> was discussing with Jiri how patches should be clean and small, you're
> not seeing the whole picture, see what is at the line just before
> "unsigned char flags":
> 
> struct trace_entry {
>         unsigned short          type;
>         unsigned char           flags;
>         unsigned char           preempt_count;
>         int                     pid;
>         int                     lock_depth;
> };

Argh, I missed that part, sorry. :)

> 
> Looking with pahole to see the offsets on a 64-bit arch:
> 
> [acme@ssdandy linux]$ pahole -C trace_entry /tmp/build/perf/builtin-timechart.o 
> struct trace_entry {
> 	short unsigned int         type;                 /*     0     2 */
> 	unsigned char              flags;                /*     2     1 */
> 	unsigned char              preempt_count;        /*     3     1 */
> 	int                        pid;                  /*     4     4 */
> 	int                        lock_depth;           /*     8     4 */
> 
> 	/* size: 12, cachelines: 1, members: 5 */
> 	/* last cacheline: 12 bytes */
> };
> [acme@ssdandy linux]$
> [acme@ssdandy linux]$ uname -a
> Linux ssdandy 3.12.0-rc6+ #2 SMP Mon Oct 21 11:19:07 BRT 2013 x86_64 x86_64 x86_64 GNU/Linux
> 
> And now on a 32-bit arch:
> 
> acme@linux-goap:~> pahole -C trace_entry /tmp/build/perf/builtin-timechart.o 
> struct trace_entry {
> 	short unsigned int         type;                 /*     0     2 */
> 	unsigned char              flags;                /*     2     1 */
> 	unsigned char              preempt_count;        /*     3     1 */
> 	int                        pid;                  /*     4     4 */
> 	int                        lock_depth;           /*     8     4 */
> 
> 	/* size: 12, cachelines: 1, members: 5 */
> 	/* last cacheline: 12 bytes */
> };
> acme@linux-goap:~> uname -a
> Linux linux-goap 3.7.10-1.16-desktop #1 SMP PREEMPT Fri May 31 20:21:23 UTC 2013 (97c14ba) i686 i686 i386 GNU/Linux
> acme@linux-goap:~>
> 
> Same signature, 32-bit, 64-bit userland, so whoever wrote timechart, Arjan, I
> think, made no mistakes at using the kernel exported interface, choosing the
> most efficient way to extract the data, casting to a struct.
> 
> Yeah, using the interface that searches for a field name to get the offset and
> then add it to a pointer to then cast to the type will allow maximum flexibility,
> but is not really efficient.
> 
> Doing that for something that is not performance critical (probably) as
> timechart probably is not a problem, but so is not a problem using the patch
> that does the cast after finding the offset of the first non-common field.

I'm not sure how it affects the performance really.  But in most cases,
using the libtraceevent would be a better solution IMHO.

> 
> Having said that, I'll take the latest patch, using perf_evsel__intval et all,
> if Namhyung sees no further problems with it.

Okay, after a quick glance I don't see anymore issue.  You can add my
Ack there.

> 
> BTW: in 'perf trace' I recently did some work to cache the offsets on a per evsel
> private area, so that the string lookup is not done anymore in the raw_syscalls
> tracepoint processing.

Good job. :)

Thanks,
Namhyung




  reply	other threads:[~2013-11-27 14:17 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-07  6:48 [PATCH] perf timechart: remove lock_depth from trace_entry Chia-I Wu
2013-10-22 10:32 ` Stanislav Fomichev
2013-11-25 18:21   ` Arnaldo Carvalho de Melo
2013-11-26 11:05     ` Stanislav Fomichev
2013-11-26 12:10       ` Arnaldo Carvalho de Melo
2013-11-26 13:47         ` Stanislav Fomichev
2013-11-26 13:57           ` Arnaldo Carvalho de Melo
2013-11-26 14:54             ` [PATCH] perf timechart: dynamically determine event data offset Stanislav Fomichev
2013-11-26 22:04               ` Arnaldo Carvalho de Melo
2013-11-27  8:49               ` Namhyung Kim
2013-11-27  9:01                 ` Stanislav Fomichev
2013-11-27 10:45                   ` [PATCH] perf timechart: dynamically determine event fields offset Stanislav Fomichev
2013-11-30 12:53                     ` [tip:perf/core] " tip-bot for Stanislav Fomichev
2013-11-27 13:44                 ` [PATCH] perf timechart: dynamically determine event data offset Arnaldo Carvalho de Melo
2013-11-27 14:17                   ` Namhyung Kim [this message]
2013-11-27 14:41                     ` Arnaldo Carvalho de Melo
2013-11-27 14:51                       ` Namhyung Kim
2013-11-27 14:55                         ` Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1385561863.1685.30.camel@leonhard \
    --to=namhyung@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=acme@ghostprotocols.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=olvaffe@gmail.com \
    --cc=paulus@samba.org \
    --cc=rostedt@goodmis.org \
    --cc=stfomichev@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox