* Perf record format portability @ 2012-05-15 15:27 Dmitry Antipov 2012-05-15 15:51 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 18+ messages in thread From: Dmitry Antipov @ 2012-05-15 15:27 UTC (permalink / raw) To: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo Cc: Amit Kucheria, linaro-dev, linux-kernel Hello, are there any thoughts on how much of perf.data is portable and how much of it should be? I'm interested in recording scheduler activity on one machine and then replaying it on another. As far as I can see, replaying x86 perf.data on ARM doesn't work. Should it at least work with a small subset of recorded events (for example, sched:sched_switch, sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup and sched:sched_migrate_task) on the same architecture? Thanks in advance, Dmitry ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-15 15:27 Perf record format portability Dmitry Antipov @ 2012-05-15 15:51 ` Arnaldo Carvalho de Melo 2012-05-16 10:50 ` Dmitry Antipov 0 siblings, 1 reply; 18+ messages in thread From: Arnaldo Carvalho de Melo @ 2012-05-15 15:51 UTC (permalink / raw) To: Dmitry Antipov Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu: > Hello, > > are there any thoughts on how much of the perf.data is portable and how much it should be? > I'm interesting in recording scheduler activity on one machine and then replaying on > another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it > work with a small subset of recorded events (for example, sched:sched_switch, > sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup > and sched:sched_migrate_task) on the same architecture? Endianness issues? ARM EB? There are some patches by Jiri Olsa that may help you if that is the case. It should be portable, are you using 'perf archive' too? What exactly is the error experienced? - Arnaldo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-15 15:51 ` Arnaldo Carvalho de Melo @ 2012-05-16 10:50 ` Dmitry Antipov 2012-05-16 14:59 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 18+ messages in thread From: Dmitry Antipov @ 2012-05-16 10:50 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote: > Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu: >> Hello, >> >> are there any thoughts on how much of the perf.data is portable and how much it should be? >> I'm interesting in recording scheduler activity on one machine and then replaying on >> another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it >> work with a small subset of recorded events (for example, sched:sched_switch, >> sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup >> and sched:sched_migrate_task) on the same architecture? > > Endianness issues? ARM EB? There are some patches by Jiri Olsa that may > help you if that is the case. Thanks, I'll take a look. > It should be portable, are you using 'perf archive' too? It doesn't work, and it prints cryptic messages like: tar: .build-id/17/d6ca02b2c31df54bf62a4142c47e3c99a9eedf: Cannot stat: No such file or directory creating empty archive. > What exactly is the error experienced? Now I'm facing a simple problem with event IDs, which may differ from machine to machine. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board and 279 on my PC host, so 'perf report' displays all event names as "unknown:unknown", even with --kallsyms=XXX where XXX is the result of 'cat /proc/kallsyms > XXX' on the PC host. Dmitry ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-16 10:50 ` Dmitry Antipov @ 2012-05-16 14:59 ` Arnaldo Carvalho de Melo 2012-05-16 15:16 ` Jiri Olsa 2012-05-16 16:58 ` Steven Rostedt 0 siblings, 2 replies; 18+ messages in thread From: Arnaldo Carvalho de Melo @ 2012-05-16 14:59 UTC (permalink / raw) To: Dmitry Antipov Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa, Steven Rostedt Adding Jiri and Steven to the CC list. Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu: > On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote: > >Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu: > >>are there any thoughts on how much of the perf.data is portable and how much it should be? > >>I'm interesting in recording scheduler activity on one machine and then replaying on > >>another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it > >>work with a small subset of recorded events (for example, sched:sched_switch, > >>sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup > >>and sched:sched_migrate_task) on the same architecture? > > > >Endianness issues? ARM EB? There are some patches by Jiri Olsa that may > >help you if that is the case. > > Thanks, will look at. > > >It should be portable, are you using 'perf archive' too? 
> > It doesn't work with cryptic messages like: > > tar: .build-id/17/d6ca02b2c31df54bf62a4142c47e3c99a9eedf: Cannot stat: No such file or directory It is basically a shell script. After you collect your events with something like: [acme@sandy ~]$ perf record -F 10000 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.021 MB perf.data (~917 samples) ] The resulting perf.data file will have samples taken on these DSOs, with the respective hashes identifying each one: [acme@sandy ~]$ perf buildid-list 4390a3d2dc84c37a8923ba4c910d6766abc42cbf [kernel.kallsyms] ceb82e745b0ab8bb7ea28c068327be1fb068c923 /lib64/ld-2.12.so e731c64000993d1fd1b443e6d5d6972d149440e8 /lib64/libc-2.12.so [acme@sandy ~]$ In your case we can see that it is looking for build id 17d6ca02b2c31df54bf62a4142c47e3c99a9eedf in the build id cache. Probably you are either running 'perf archive' on a different machine than the one where you ran 'perf record', or using a different user on the same machine, or, less likely, you removed ~/.debug/ after 'record'. The 'perf archive' tool was done quickly just as a proof of concept; admittedly it needs to be improved to help diagnose these problems. > creating empty archive. > > >What exactly is the error experienced? > > Now I'm facing the simple problem with event IDs, which may be different from machine to > machine. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM > board and 279 on my PC host, so 'perf report' displays all event names like "unknown:unknown", > even with --kallsyms=XXX where XXX is 'cat /proc/kallsyms > XXX' from PC host. With build-ids and 'perf archive' you shouldn't need to specify kallsyms: it has a build-id and will be collected (record + archive) and then transferred and expanded on the analysis machine (scp + tar xvf). The tracing part even stashes a copy of kallsyms in perf.data (not needed, but there for historical reasons). 
The problem is in translating the perf_event_attr.config to the same name and format as on the machine where you collected the events. Steve, Was the kernel trace events infrastructure designed with that in mind? I.e. cross analysis? I must be missing something here, still ENOCOFFEE :-\ When doing cross arch event analysis I tested: PERF_TYPE_HARDWARE = 0, PERF_TYPE_SOFTWARE = 1, PERF_TYPE_HW_CACHE = 3, Not: PERF_TYPE_TRACEPOINT = 2, PERF_TYPE_RAW = 4, PERF_TYPE_BREAKPOINT = 5, - Arnaldo ^ permalink raw reply [flat|nested] 18+ messages in thread
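As an illustration of the build-id lookup discussed in the message above, here is a minimal C sketch of how the build id 17d6ca02b2c31df54bf62a4142c47e3c99a9eedf corresponds to the path in the tar error. The helper name buildid_to_cache_path() is hypothetical and is not part of perf; the layout (first two hex characters as a directory, the rest as the file name) is inferred only from the error message quoted in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not perf code): turn a hex build-id string into
 * the relative path seen in the tar error message, i.e.
 * ".build-id/<first 2 hex chars>/<remaining hex chars>".
 * Returns 0 on success, -1 if the id is too short or buf is too small.
 */
int buildid_to_cache_path(const char *build_id, char *buf, size_t len)
{
	int n;

	assert(build_id != NULL && buf != NULL);

	/* Need at least two characters for the directory and one for the file. */
	if (strlen(build_id) < 3)
		return -1;

	n = snprintf(buf, len, ".build-id/%.2s/%s", build_id, build_id + 2);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output was truncated */
	return 0;
}
```

Called with the build id from this thread, the helper fills buf with the same path that tar failed to stat, which is a quick way to check by hand whether the expected entry exists under the cache directory on the recording machine.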
* Re: Perf record format portability 2012-05-16 14:59 ` Arnaldo Carvalho de Melo @ 2012-05-16 15:16 ` Jiri Olsa 2012-05-16 15:50 ` Arnaldo Carvalho de Melo 2012-05-16 16:58 ` Steven Rostedt 1 sibling, 1 reply; 18+ messages in thread From: Jiri Olsa @ 2012-05-16 15:16 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Steven Rostedt On Wed, May 16, 2012 at 11:59:27AM -0300, Arnaldo Carvalho de Melo wrote: > Adding Jiri and Steven to the CC list. > > Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu: > > On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote: > > >Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu: > > >>are there any thoughts on how much of the perf.data is portable and how much it should be? > > >>I'm interesting in recording scheduler activity on one machine and then replaying on > > >>another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it > > >>work with a small subset of recorded events (for example, sched:sched_switch, > > >>sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup > > >>and sched:sched_migrate_task) on the same architecture? > > > > > >Endianness issues? ARM EB? There are some patches by Jiri Olsa that may > > >help you if that is the case. The latest version was sent today; it includes a description of the tests I did: http://marc.info/?l=linux-kernel&m=133715172512742&w=2 Each time I run a new sort of test, another endianness issue is hit. So, tracepoints... I'll check ;) jirka ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-16 15:16 ` Jiri Olsa @ 2012-05-16 15:50 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 18+ messages in thread From: Arnaldo Carvalho de Melo @ 2012-05-16 15:50 UTC (permalink / raw) To: Jiri Olsa Cc: Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Steven Rostedt Em Wed, May 16, 2012 at 05:16:55PM +0200, Jiri Olsa escreveu: > On Wed, May 16, 2012 at 11:59:27AM -0300, Arnaldo Carvalho de Melo wrote: > > Adding Jiri and Steven to the CC list. > > > > Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu: > > > On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote: > > > >Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu: > > > >>are there any thoughts on how much of the perf.data is portable and how much it should be? > > > >>I'm interesting in recording scheduler activity on one machine and then replaying on > > > >>another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it > > > >>work with a small subset of recorded events (for example, sched:sched_switch, > > > >>sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup > > > >>and sched:sched_migrate_task) on the same architecture? > > > > > > > >Endianness issues? ARM EB? There are some patches by Jiri Olsa that may > > > >help you if that is the case. > > latest version sent today, there's description of tests I did: > http://marc.info/?l=linux-kernel&m=133715172512742&w=2 > > Each time I run new sort of test, another endianity issue is hit. > so, tracepoints.. I'll check ;) The tracepoints part is a different problem, I think, but take a look anyway ;-) - Arnaldo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-16 14:59 ` Arnaldo Carvalho de Melo 2012-05-16 15:16 ` Jiri Olsa @ 2012-05-16 16:58 ` Steven Rostedt 2012-05-16 17:48 ` Jiri Olsa ` (2 more replies) 1 sibling, 3 replies; 18+ messages in thread From: Steven Rostedt @ 2012-05-16 16:58 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote: > Steve, > > Was the kernel trace events infrastructure designed with that in > mind? I.e. cross analysis? I must be missing something here, still > ENOCOFFEE :-\ Yes, the libparsevents library was designed for this from day one. That's why a trace-cmd data file can be recorded on ARM and read on x86, or PPC, or whatever. I did all my development testing against 32bit, 64bit and big and little endian. This was the case from the beginning. -- Steve > > When doing cross arch event analisys I tested: > > PERF_TYPE_HARDWARE = 0, > PERF_TYPE_SOFTWARE = 1, > PERF_TYPE_HW_CACHE = 3, > > Not: > > PERF_TYPE_TRACEPOINT = 2, > PERF_TYPE_RAW = 4, > PERF_TYPE_BREAKPOINT = 5, > > - Arnaldo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-16 16:58 ` Steven Rostedt @ 2012-05-16 17:48 ` Jiri Olsa 2012-05-16 19:32 ` Steven Rostedt 2012-05-16 18:08 ` Arnaldo Carvalho de Melo 2012-05-17 5:10 ` Dmitry Antipov 2 siblings, 1 reply; 18+ messages in thread From: Jiri Olsa @ 2012-05-16 17:48 UTC (permalink / raw) To: Steven Rostedt Cc: Arnaldo Carvalho de Melo, Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel On Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt wrote: > On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote: > > > Steve, > > > > Was the kernel trace events infrastructure designed with that in > > mind? I.e. cross analysis? I must be missing something here, still > > ENOCOFFEE :-\ > > Yes, the libparsevents library was design for this from day one. That's > why trace-cmd data file can be run on an ARM and read on x86, or PPC, or > whatever. I did all my development testing against 32bit, 64bit and big > and little endian. This was the case from the beginning. for ppc64(record) vs x86_64(report) I got following report on latest tip: [jolsa@dhcp-26-214 test]$ ../perf report > report.target Endianness of raw data not corrected! Warning: 718 samples with id not present in the header Warning: The perf.data file has no samples! for following record: perf record -a -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -- sleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.178 MB perf.data (~7781 samples) ] I haven't tried trace-cmd, but I guess let's wait for libparsevents perf integration then.. ;) jirka ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-16 17:48 ` Jiri Olsa @ 2012-05-16 19:32 ` Steven Rostedt 2012-05-16 19:39 ` Steven Rostedt 0 siblings, 1 reply; 18+ messages in thread From: Steven Rostedt @ 2012-05-16 19:32 UTC (permalink / raw) To: Jiri Olsa Cc: Arnaldo Carvalho de Melo, Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote: > for ppc64(record) vs x86_64(report) I got following report on latest tip: > > [jolsa@dhcp-26-214 test]$ ../perf report > report.target > Endianness of raw data not corrected! > Warning: > 718 samples with id not present in the header > Warning: > The perf.data file has no samples! > > for following record: > perf record -a -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -- sleep 10 > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.178 MB perf.data (~7781 samples) ] > > I haven't tried trace-cmd, but I guess let's wait for libparsevents > perf integration then.. ;) > It's in perf. It just needs to be set up. Look at tools/perf/util/trace-event.h There's a bigendian() function, a "file_bigendian" and a "host_bigendian". If perf records which endianness was used on the target and saves that in the perf.data file, all it needs to do is update the two variables. file_bigendian = recorded_endian; host_bigendian = bigendian(); 1 for big endian, 0 for little endian. Here, host is the machine that is running the perf report or script. After that, all reads of the data in events use one of the __data2host() macros to convert if necessary. Note, the latest trace-cmd has put all of these in a pevent struct descriptor, so that different files can be read at the same time, and these files can come from machines of different endianness (and bit size). The global variables no longer exist. 
The patches that Frederic and I posted previously convert perf to use this descriptor, so that perf can benefit and read multiple files too. -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
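As a rough illustration of the scheme described in the message above (detect the host endianness, compare it with the endianness stored in the file, and swap on mismatch), here is a minimal C sketch. The names host_is_bigendian(), swap32() and read_u32() are hypothetical; this is not the perf or trace-cmd implementation, only the same idea as the bigendian() function and __data2host() macros mentioned in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the machine running this code is big endian, 0 otherwise. */
int host_is_bigendian(void)
{
	const unsigned char bytes[4] = { 0x01, 0x02, 0x03, 0x04 };
	uint32_t word;

	/* memcpy avoids the aliasing problems of a pointer cast. */
	memcpy(&word, bytes, sizeof(word));
	return word == 0x01020304u;
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * Read a 32-bit field from raw event data that was recorded on a machine
 * whose endianness is file_bigendian (1 = big, 0 = little), and return it
 * in host byte order. The swap happens only when file and host differ.
 */
uint32_t read_u32(const void *raw, int file_bigendian)
{
	uint32_t v;

	assert(raw != NULL);
	memcpy(&v, raw, sizeof(v));
	if (file_bigendian != host_is_bigendian())
		v = swap32(v);
	return v;
}
```

The file-side flag is an explicit parameter here rather than a global, which is closer in spirit to the per-file descriptor approach mentioned for the latest trace-cmd: two files with different endianness can be read in the same process.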
* Re: Perf record format portability 2012-05-16 19:32 ` Steven Rostedt @ 2012-05-16 19:39 ` Steven Rostedt 2012-05-17 8:51 ` Jiri Olsa 0 siblings, 1 reply; 18+ messages in thread From: Steven Rostedt @ 2012-05-16 19:39 UTC (permalink / raw) To: Jiri Olsa Cc: Arnaldo Carvalho de Melo, Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel On Wed, 2012-05-16 at 15:32 -0400, Steven Rostedt wrote: > On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote: > > > for ppc64(record) vs x86_64(report) I got following report on latest tip: > > > > [jolsa@dhcp-26-214 test]$ ../perf report > report.target > > Endianness of raw data not corrected! > > Warning: > > 718 samples with id not present in the header > > Warning: > > The perf.data file has no samples! > > What does perf script give you? It looks like Frederic took my code for this when he ported the original parse-events over to perf. I see the setup of these variables in tools/perf/util/trace-event-read.c If you run 'perf script' on x86 on a ppc perf.data file, do you still get the same errors? -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-16 19:39 ` Steven Rostedt @ 2012-05-17 8:51 ` Jiri Olsa 0 siblings, 0 replies; 18+ messages in thread From: Jiri Olsa @ 2012-05-17 8:51 UTC (permalink / raw) To: Steven Rostedt Cc: Arnaldo Carvalho de Melo, Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel On Wed, May 16, 2012 at 03:39:14PM -0400, Steven Rostedt wrote: > On Wed, 2012-05-16 at 15:32 -0400, Steven Rostedt wrote: > > On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote: > > > > > for ppc64(record) vs x86_64(report) I got following report on latest tip: > > > > > > [jolsa@dhcp-26-214 test]$ ../perf report > report.target > > > Endianness of raw data not corrected! > > > Warning: > > > 718 samples with id not present in the header > > > Warning: > > > The perf.data file has no samples! > > > > > What does perf script give you. It looks like Frederic took my code for > this when he ported the original parse-events over to perf. I see the > setup of these variables in tools/perf/util/trace-event-read.c > > If you run 'perf script' on x86 from a ppc perf.dat file, do you still > get the same errors? yes --- [jolsa@dhcp-26-214 test]$ ../perf script Endianness of raw data not corrected! 
Warning: 718 samples with id not present in the header
# ========
# captured on: Wed May 16 19:53:13 2012
# hostname : ibm-js22-vios-02-lp1.rhts.eng.bos.redhat.com
# os release : 2.6.32-270.el6.ppc64
# perf version : 2.6.32-270.el6.ppc64.debug
# arch : ppc64
# nrcpus online : 8
# nrcpus avail : 8
# cpudesc : POWER6 (architected), altivec supported
# cpuid : 62,769
# total memory : 6236992 kB
# cmdline : /usr/bin/perf record -a -e sched:sched_switch -e
# sched:sched_process_exit -e sched:sched_process_fork -e
# sched:sched_wakeup -- sleep 10
# event : name = sched:sched_switch, type = 2, config = 0x22, config1 =
# 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 97, 98, 99,
# 100, 101, 102, 103, 104 }
# event : name = sched:sched_process_exit, type = 2, config = 0x1b,
# config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 105,
# 106, 107, 108, 109, 110, 111, 112 }
# event : name = sched:sched_process_fork, type = 2, config = 0x1d,
# config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 113,
# 114, 115, 116, 117, 118, 119, 120 }
# event : name = sched:sched_wakeup, type = 2, config = 0x17, config1 =
# 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 121, 122, 123,
# 124, 125, 126, 127, 128 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
---
jirka ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-16 16:58 ` Steven Rostedt 2012-05-16 17:48 ` Jiri Olsa @ 2012-05-16 18:08 ` Arnaldo Carvalho de Melo 2012-05-16 18:17 ` Steven Rostedt 2012-05-17 5:10 ` Dmitry Antipov 2 siblings, 1 reply; 18+ messages in thread From: Arnaldo Carvalho de Melo @ 2012-05-16 18:08 UTC (permalink / raw) To: Steven Rostedt Cc: Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa Em Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt escreveu: > On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote: > > Was the kernel trace events infrastructure designed with that in > > mind? I.e. cross analysis? I must be missing something here, still > > ENOCOFFEE :-\ > > Yes, the libparsevents library was design for this from day one. That's > why trace-cmd data file can be run on an ARM and read on x86, or PPC, or > whatever. I did all my development testing against 32bit, 64bit and big > and little endian. This was the case from the beginning. I need to look at the code, but how does it do this? Copy the relevant /sys/kernel/debug/events formats in the header and then instead of looking at /sys/... look at those? Does it still copy /proc/kallsyms? - Arnaldo ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-16 18:08 ` Arnaldo Carvalho de Melo @ 2012-05-16 18:17 ` Steven Rostedt 0 siblings, 0 replies; 18+ messages in thread From: Steven Rostedt @ 2012-05-16 18:17 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Dmitry Antipov, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa On Wed, 2012-05-16 at 15:08 -0300, Arnaldo Carvalho de Melo wrote: > Em Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt escreveu: > > On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote: > > > Was the kernel trace events infrastructure designed with that in > > > mind? I.e. cross analysis? I must be missing something here, still > > > ENOCOFFEE :-\ > > > > Yes, the libparsevents library was design for this from day one. That's > > why trace-cmd data file can be run on an ARM and read on x86, or PPC, or > > whatever. I did all my development testing against 32bit, 64bit and big > > and little endian. This was the case from the beginning. > > I need to look at the code, but how does it do this? Copy the relevant > /sys/kernel/debug/events formats in the header and then instead of > looking at /sys/... look at those? It does copy the events from .../debug/tracing/events. But it does cheat about the bits. To determine the long size, it looks at /sys/kernel/debug/tracing/events/header_page and the "commit" field. On 32bit machines, that's 4 bytes, and on 64bit, that's 8 bytes. Endianness is determined on the machine that the recording is running on and stored in the file. The parse-events structure has a way to record the endianness and long size, for later retrieval. > > Does it still copy /proc/kallsyms? Yes it does. -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
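The long-size trick described in the message above can be sketched as follows. This is a minimal illustration, not the trace-cmd code: the function name long_size_from_header_page() is hypothetical, and it assumes that the header_page text describes each field on a line containing the field name followed by a "size:N;" entry, which is an assumption about the file layout rather than something stated in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: given the text of the header_page format file,
 * return the size in bytes of the "commit" field (4 on 32-bit machines,
 * 8 on 64-bit machines according to the message above), or -1 if the
 * field or its size cannot be found.
 */
int long_size_from_header_page(const char *text)
{
	const char *field, *size;
	int bytes;

	assert(text != NULL);

	/* Locate the declaration of the "commit" field. */
	field = strstr(text, " commit;");
	if (!field)
		return -1;

	/* The first "size:" entry after it is assumed to belong to that field. */
	size = strstr(field, "size:");
	if (!size || sscanf(size, "size:%d;", &bytes) != 1)
		return -1;

	return bytes;
}
```

A reader would call this once on the header_page copy saved in the data file and store the result next to the endianness flag, so that later field reads know how wide a "long" was on the recording machine.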
* Re: Perf record format portability 2012-05-16 16:58 ` Steven Rostedt 2012-05-16 17:48 ` Jiri Olsa 2012-05-16 18:08 ` Arnaldo Carvalho de Melo @ 2012-05-17 5:10 ` Dmitry Antipov 2012-05-17 11:48 ` Steven Rostedt 2 siblings, 1 reply; 18+ messages in thread From: Dmitry Antipov @ 2012-05-17 5:10 UTC (permalink / raw) To: Steven Rostedt Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa On 05/16/2012 08:58 PM, Steven Rostedt wrote: > On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote: > >> Steve, >> >> Was the kernel trace events infrastructure designed with that in >> mind? I.e. cross analysis? I must be missing something here, still >> ENOCOFFEE :-\ > > Yes, the libparsevents library was design for this from day one. That's > why trace-cmd data file can be run on an ARM and read on x86, or PPC, or > whatever. I did all my development testing against 32bit, 64bit and big > and little endian. This was the case from the beginning. I didn't run into big/little endian conversion issues, most probably because both x86 and my ARM board are little endian :-). But the original question was about event IDs. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board and 279 on my PC host, so 'perf report' displays "unknown:unknown" instead of the expected "sched:sched_switch" when attempting to do some cross-analysis. I suppose that the original event IDs should be preserved, either within perf.data or by providing a copy of the original /sys/kernel/debug/tracing/*, much like it's done with --kallsyms to resolve kernel symbols. Dmitry ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability 2012-05-17 5:10 ` Dmitry Antipov @ 2012-05-17 11:48 ` Steven Rostedt 2012-05-18 5:48 ` Dmitry Antipov 0 siblings, 1 reply; 18+ messages in thread From: Steven Rostedt @ 2012-05-17 11:48 UTC (permalink / raw) To: Dmitry Antipov Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa On Thu, 2012-05-17 at 09:10 +0400, Dmitry Antipov wrote: > On 05/16/2012 08:58 PM, Steven Rostedt wrote: > > > On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote: > > > >> Steve, > >> > >> Was the kernel trace events infrastructure designed with that in > >> mind? I.e. cross analysis? I must be missing something here, still > >> ENOCOFFEE :-\ > > > > Yes, the libparsevents library was design for this from day one. That's > > why trace-cmd data file can be run on an ARM and read on x86, or PPC, or > > whatever. I did all my development testing against 32bit, 64bit and big > > and little endian. This was the case from the beginning. > > I didn't face with big/little conversion issues, most probably both x86 and > my ARM board are of the same (little) endian :-). > > But the original question was about event IDs. For example, > /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board > and 279 on my PC host, so 'perf report' displays "unknown:unknown" instead > of expected "sched:sched_switch" when attempting to do some cross-analysis. > I suppose that original event IDs should be preserved, either within perf.data > or by providing the copy of original /sys/kernel/debug/tracing/*, much like > it's done with --kallsyms to resolve kernel symbols. trace-cmd copies the entire /sys/kernel/debug/tracing/events directory into the data file (well, it copies only the events you specify). I thought perf did the same. It should be using what's in the perf.data file and not what's on the host. Again, perf report is not what uses the events from trace-cmd. 
It's perf script that does. If perf script works, then perf report needs to be fixed. But that should happen after perf gets updated to use the latest libparse-events, and I have no idea when that will happen. -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
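The point made in the message above, that event IDs should be resolved against what was saved with the recording and not against the analysis host, can be sketched in C as follows. The struct recorded_event type and the event_name() function are hypothetical illustrations, not perf or trace-cmd code; the only values taken from this thread are the sched:sched_switch IDs (279 on the PC host, 55 on the ARM board).

```c
#include <assert.h>
#include <stddef.h>

/* One tracepoint as seen on the machine that made the recording. */
struct recorded_event {
	unsigned long long id;	/* contents of .../events/<system>/<event>/id */
	const char *name;	/* e.g. "sched:sched_switch" */
};

/*
 * Hypothetical lookup: resolve an event id against the table saved with
 * the recording instead of the analysis host's debugfs. Falls back to
 * the "unknown:unknown" string when the id is not in the table.
 */
const char *event_name(const struct recorded_event *tbl, size_t n,
		       unsigned long long id)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (tbl[i].id == id)
			return tbl[i].name;
	return "unknown:unknown";
}
```

With a table captured on the PC host, id 279 resolves to sched:sched_switch no matter where the analysis runs, whereas a lookup that uses the ARM board's own id (55) would miss and fall back to unknown:unknown, which matches the symptom reported for 'perf report'.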
* Re: Perf record format portability 2012-05-17 11:48 ` Steven Rostedt @ 2012-05-18 5:48 ` Dmitry Antipov 2012-05-29 15:10 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 18+ messages in thread From: Dmitry Antipov @ 2012-05-18 5:48 UTC (permalink / raw) To: Steven Rostedt Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Paul Mackerras, Ingo Molnar, Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa On 05/17/2012 03:48 PM, Steven Rostedt wrote: > trace-cmd copies the entire /sys/kernel/debug/tracing/events directory > into the data file (well it copies only the events you specify). > I thought perf did the same. It should be using what's in the perf.dat > file and not what's on the host. I found that 'perf script' and 'perf report' work differently, and I suppose 'perf script' is correct and 'perf report' isn't. What I'm doing on the PC host is: 1) Collect data with: perf record -a -R -f -m 8192 -c 1 -e sched:sched_switch \ -e sched:sched_process_exit -e sched:sched_process_fork \ -e sched:sched_wakeup -e sched:sched_migrate_task [task] 2) Collect the output of 'perf script' and 'perf report'; both look great. 3) Copy perf.data and the contents of /proc/kallsyms to the ARM target. 4) Next, on the ARM target: perf script --kallsyms=[kallsyms from PC host] -i [perf.data from PC host] Looks good, all event names like 'sched_wakeup' or 'sched_switch' are shown. 5) Try: perf report --kallsyms=[kallsyms from PC host] -i [perf.data from PC host] --stdio All event names are shown as 'unknown:unknown'. "Cross-replaying" (perf sched replay) looks broken too. 
Host results are:

run measurement overhead: 260 nsecs
sleep measurement overhead: 56109 nsecs
the run test took 1000054 nsecs
the sleep test took 1076170 nsecs
nr_run_events: 246
nr_sleep_events: 257
nr_wakeup_events: 123
target-less wakeups: 27
task 0 ( <unknown>: 3440), nr_events: 33
task 1 ( kworker/0:0: 3227), nr_events: 15
task 2 ( <unknown>: 0), nr_events: 125
task 3 ( plugin-containe: 1769), nr_events: 13
task 4 ( ksoftirqd/0: 3), nr_events: 5
task 5 ( kworker/2:2: 2023), nr_events: 3
task 6 ( perf: 3441), nr_events: 200
task 7 ( migration/2: 3091), nr_events: 3
task 8 ( kworker/1:0: 3104), nr_events: 158
task 9 ( urxvt: 2952), nr_events: 95
task 10 ( ksoftirqd/2: 3093), nr_events: 3
------------------------------------------------------------
#1 : 70.193, ravg: 70.19, cpu: 116.57 / 116.57
#2 : 70.607, ravg: 70.23, cpu: 116.61 / 116.58
#3 : 70.411, ravg: 70.25, cpu: 116.69 / 116.59
#4 : 70.386, ravg: 70.27, cpu: 116.72 / 116.60
#5 : 70.222, ravg: 70.26, cpu: 116.39 / 116.58
#6 : 70.361, ravg: 70.27, cpu: 116.40 / 116.56
#7 : 70.409, ravg: 70.28, cpu: 116.43 / 116.55
#8 : 70.368, ravg: 70.29, cpu: 116.50 / 116.55
#9 : 70.604, ravg: 70.32, cpu: 116.75 / 116.57
#10 : 70.578, ravg: 70.35, cpu: 116.79 / 116.59

Cross-replaying attempt is ('perf sched -i [perf.data from PC host] replay'):

run measurement overhead: 8099 nsecs
sleep measurement overhead: 159428 nsecs
the run test took 998913 nsecs
the sleep test took 1188048 nsecs
nr_run_events: 0
nr_sleep_events: 0
nr_wakeup_events: 0
------------------------------------------------------------
#1 : 0.058, ravg: 0.06, cpu: 0.00 / 0.00
#2 : 0.105, ravg: 0.06, cpu: 0.00 / 0.00
#3 : 0.027, ravg: 0.06, cpu: 0.00 / 0.00
#4 : 0.026, ravg: 0.06, cpu: 0.00 / 0.00
#5 : 0.035, ravg: 0.05, cpu: 0.00 / 0.00
#6 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
#7 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
#8 : 0.028, ravg: 0.05, cpu: 0.00 / 0.00
#9 : 0.029, ravg: 0.04, cpu: 0.00 / 0.00
#10 : 0.028, ravg: 0.04, cpu: 0.00 / 0.00

Dmitry ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability
  2012-05-18  5:48 ` Dmitry Antipov
@ 2012-05-29 15:10 ` Arnaldo Carvalho de Melo
  2012-05-31  8:28 ` Dmitry Antipov
  0 siblings, 1 reply; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-05-29 15:10 UTC (permalink / raw)
  To: Dmitry Antipov
  Cc: Steven Rostedt, Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa

On Fri, May 18, 2012 at 09:48:26AM +0400, Dmitry Antipov wrote:
> On 05/17/2012 03:48 PM, Steven Rostedt wrote:
>
> >trace-cmd copies the entire /sys/kernel/debug/tracing/events directory
> >into the data file (well it copies only the events you specify).
> >I thought perf did the same. It should be using what's in the perf.dat
> >file and not what's on the host.
>
> I found that 'perf script' and 'perf report' works differently,
> and I suppose 'perf script' is correct and 'perf report' isn't.
>
> What I'm doing on PC host is:

I haven't tested this, but libtraceevent is now in, perhaps it works for
you now? Can you check?

- Arnaldo

> 1) Collect data with:
>    perf record -a -R -f -m 8192 -c 1 -e sched:sched_switch \
>    -e sched:sched_process_exit -e sched:sched_process_fork \
>    -e sched:sched_wakeup -e sched:sched_migrate_task [task]
> 2) Collect an output from 'perf script' and 'perf report', both looks
>    great.
> 3) Copy perf.data and contents of /proc/kallsyms to ARM target.
>
> 4) Next, on ARM target:
>    perf script --kallsyms=[kallsyms from PC host] -i [perf.data from PC host]
>    Looks good, all event names like 'sched_wakeup' or 'sched_switch' are shown.
> 5) Try:
>    perf report --kallsyms=[kallsyms from PC host] -i [perf.data from PC host] --stdio
>    All event names are shown as 'unknown:unknown'.
>
> "Cross-replaying" (perf sched replay) looks broken too.
> Host results are:
>
> run measurement overhead: 260 nsecs
> sleep measurement overhead: 56109 nsecs
> the run test took 1000054 nsecs
> the sleep test took 1076170 nsecs
> nr_run_events: 246
> nr_sleep_events: 257
> nr_wakeup_events: 123
> target-less wakeups: 27
> task 0 ( <unknown>: 3440), nr_events: 33
> task 1 ( kworker/0:0: 3227), nr_events: 15
> task 2 ( <unknown>: 0), nr_events: 125
> task 3 ( plugin-containe: 1769), nr_events: 13
> task 4 ( ksoftirqd/0: 3), nr_events: 5
> task 5 ( kworker/2:2: 2023), nr_events: 3
> task 6 ( perf: 3441), nr_events: 200
> task 7 ( migration/2: 3091), nr_events: 3
> task 8 ( kworker/1:0: 3104), nr_events: 158
> task 9 ( urxvt: 2952), nr_events: 95
> task 10 ( ksoftirqd/2: 3093), nr_events: 3
> ------------------------------------------------------------
> #1 : 70.193, ravg: 70.19, cpu: 116.57 / 116.57
> #2 : 70.607, ravg: 70.23, cpu: 116.61 / 116.58
> #3 : 70.411, ravg: 70.25, cpu: 116.69 / 116.59
> #4 : 70.386, ravg: 70.27, cpu: 116.72 / 116.60
> #5 : 70.222, ravg: 70.26, cpu: 116.39 / 116.58
> #6 : 70.361, ravg: 70.27, cpu: 116.40 / 116.56
> #7 : 70.409, ravg: 70.28, cpu: 116.43 / 116.55
> #8 : 70.368, ravg: 70.29, cpu: 116.50 / 116.55
> #9 : 70.604, ravg: 70.32, cpu: 116.75 / 116.57
> #10 : 70.578, ravg: 70.35, cpu: 116.79 / 116.59
>
> Cross-replaying attempt is ('perf sched -i [perf.data from PC host] replay'):
>
> run measurement overhead: 8099 nsecs
> sleep measurement overhead: 159428 nsecs
> the run test took 998913 nsecs
> the sleep test took 1188048 nsecs
> nr_run_events: 0
> nr_sleep_events: 0
> nr_wakeup_events: 0
> ------------------------------------------------------------
> #1 : 0.058, ravg: 0.06, cpu: 0.00 / 0.00
> #2 : 0.105, ravg: 0.06, cpu: 0.00 / 0.00
> #3 : 0.027, ravg: 0.06, cpu: 0.00 / 0.00
> #4 : 0.026, ravg: 0.06, cpu: 0.00 / 0.00
> #5 : 0.035, ravg: 0.05, cpu: 0.00 / 0.00
> #6 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
> #7 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
> #8 : 0.028, ravg: 0.05, cpu: 0.00 / 0.00
> #9 : 0.029, ravg: 0.04, cpu: 0.00 / 0.00
> #10 : 0.028, ravg: 0.04, cpu: 0.00 / 0.00
>
> Dmitry

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Perf record format portability
  2012-05-29 15:10 ` Arnaldo Carvalho de Melo
@ 2012-05-31  8:28 ` Dmitry Antipov
  0 siblings, 0 replies; 18+ messages in thread
From: Dmitry Antipov @ 2012-05-31 8:28 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Steven Rostedt, Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Amit Kucheria, linaro-dev, linux-kernel, Jiri Olsa

[-- Attachment #1: Type: text/plain, Size: 428 bytes --]

On 05/29/2012 07:10 PM, Arnaldo Carvalho de Melo wrote:

> I haven't tested this, but libtraceevent is now in, perhaps it works for
> you now? Can you check?

It doesn't work. Running 'perf report' on ARM against the data collected on x86
shows 'unknown:unknown' for event names (see report_x86_on_ARM.txt), and running
'perf report' on x86 against the data collected on ARM shows invalid event names
(see report_ARM_on_x86.txt).

Dmitry

[-- Attachment #2: report_x86_on_ARM.txt --]
[-- Type: text/plain, Size: 3097 bytes --]

# ========
# captured on: Thu May 31 08:15:47 2012
# hostname : notebook
# os release : 3.3.7-1.fc16.x86_64
# perf version : 3.4.9208.gaf56e0
# arch : x86_64
# nrcpus online : 4
# nrcpus avail : 4
# cpudesc : Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
# cpuid : GenuineIntel,6,42,7
# total memory : 3934652 kB
# cmdline : /tmp/perf record -a -R -f -m 8192 -c 1 -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -e sched:sched_migrate_task /bin/ls -la /
# event : name = sched:sched_switch, type = 2, config = 0x117, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 45, 46, 47, 48 }
# event : name = sched:sched_process_exit, type = 2, config = 0x114, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 49, 50, 51, 52 }
# event : name = sched:sched_process_fork, type = 2, config = 0x111, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 53, 54, 55, 56 }
# event : name = sched:sched_wakeup, type = 2, config = 0x119, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 57, 58, 59, 60 }
# event : name = sched:sched_migrate_task, type = 2, config = 0x116, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 61, 62, 63, 64 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
# Samples: 223 of event 'unknown:unknown'
# Event count (approx.): 223
#
# Overhead      Command      Shared Object          Symbol
# ........  ...........  .................  ..............
#
    25.56%           ls  [kernel.kallsyms]  [k] __schedule
    25.11%      swapper  [kernel.kallsyms]  [k] __schedule
    24.66%  kworker/1:1  [kernel.kallsyms]  [k] __schedule
    23.77%        urxvt  [kernel.kallsyms]  [k] __schedule
     0.45%         perf  [kernel.kallsyms]  [k] __schedule
     0.45%  migration/2  [kernel.kallsyms]  [k] __schedule

# Samples: 1 of event 'unknown:unknown'
# Event count (approx.): 1
#
# Overhead  Command      Shared Object       Symbol
# ........  .......  .................  ...........
#
   100.00%       ls  [kernel.kallsyms]  [k] do_exit

# Samples: 0 of event 'unknown:unknown'
# Event count (approx.): 0
#
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......
#

# Samples: 138 of event 'unknown:unknown'
# Event count (approx.): 138
#
# Overhead      Command      Shared Object              Symbol
# ........  ...........  .................  ..................
#
    60.87%           ls  [kernel.kallsyms]  [k] ttwu_do_wakeup
    38.41%  kworker/1:1  [kernel.kallsyms]  [k] ttwu_do_wakeup
     0.72%         perf  [kernel.kallsyms]  [k] ttwu_do_wakeup

# Samples: 2 of event 'unknown:unknown'
# Event count (approx.): 2
#
# Overhead      Command      Shared Object            Symbol
# ........  ...........  .................  ................
#
    50.00%         perf  [kernel.kallsyms]  [k] set_task_cpu
    50.00%  migration/2  [kernel.kallsyms]  [k] set_task_cpu

#
# (For a higher level overview, try: perf report --sort comm,dso)
#

[-- Attachment #3: report_ARM_on_x86.txt --]
[-- Type: text/plain, Size: 3105 bytes --]

# ========
# captured on: Thu May 31 12:19:45 2012
# hostname : linaro-developer
# os release : 3.4.0+
# perf version : 3.4.0
# arch : armv7l
# nrcpus online : 2
# nrcpus avail : 2
# cpudesc : ARMv7 Processor rev 2 (v7l)
# total memory : 1022872 kB
# cmdline : /usr/bin/perf record -a -R -f -m 8192 -c 1 -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -e sched:sched_migrate_task /bin/ls -la /
# event : name = sched:sched_switch, type = 2, config = 0x37, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 11, 12 }
# event : name = sched:sched_process_exit, type = 2, config = 0x34, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 13, 14 }
# event : name = sched:sched_process_fork, type = 2, config = 0x31, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 15, 16 }
# event : name = sched:sched_wakeup, type = 2, config = 0x39, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 17, 18 }
# event : name = sched:sched_migrate_task, type = 2, config = 0x36, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 19, 20 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# ========
#
# Samples: 148 of event 'syscalls:sys_exit_unshare'
# Event count (approx.): 148
#
# Overhead      Command     Shared Object  Symbol
# ........  ...........  ................  ......
#
    38.51%           ls  [unknown]         [.] 0000000000000000
    37.84%  kworker/1:1  [unknown]         [.] 0000000000000000
    12.16%      swapper  [unknown]         [.] 0000000000000000
    10.81%         sshd  [unknown]         [.] 0000000000000000
     0.68%         perf  [unknown]         [.] 0000000000000000

# Samples: 1 of event 'raw_syscalls:sys_exit'
# Event count (approx.): 1
#
# Overhead  Command     Shared Object  Symbol
# ........  .......  ................  ......
#
   100.00%       ls  [unknown]         [.] 0000000000000000

# Samples: 0 of event 'syscalls:sys_exit_mmap'
# Event count (approx.): 0
#
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......
#

# Samples: 103 of event 'syscalls:sys_exit_set_tid_address'
# Event count (approx.): 103
#
# Overhead      Command     Shared Object  Symbol
# ........  ...........  ................  ......
#
    80.58%           ls  [unknown]         [.] 0000000000000000
    15.53%  kworker/1:1  [unknown]         [.] 0000000000000000
     1.94%         sshd  [unknown]         [.] 0000000000000000
     0.97%      swapper  [unknown]         [.] 0000000000000000
     0.97%         perf  [unknown]         [.] 0000000000000000

# Samples: 1 of event 'mce:mce_record'
# Event count (approx.): 1
#
# Overhead  Command     Shared Object  Symbol
# ........  .......  ................  ......
#
   100.00%  swapper  [unknown]         [.] 0000000000000000

#
# (For a higher level overview, try: perf report --sort comm,dso)
#

^ permalink raw reply [flat|nested] 18+ messages in thread
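The event headers in the two attachments above record different tracepoint ids (the 'config' field) for the same event names: sched:sched_switch is 0x117 in the x86_64 capture and 0x37 in the armv7l capture. The wrong or unknown names printed by 'perf report' would be consistent with ids being resolved on the reading machine instead of from perf.data, though nothing in this thread confirms that. A minimal Python sketch that extracts the name-to-config mapping from such header text and lists the ids that differ is given here; the function names are hypothetical and the sample strings are shortened copies of the header lines above.

```python
import re

# Matches the '# event : name = ..., type = N, config = 0x...' header lines.
EVENT_RE = re.compile(r"name = ([^,\s]+), type = \d+, config = (0x[0-9a-fA-F]+)")


def tracepoint_ids(header_text):
    """Map each event name in a perf report header to its config value."""
    return {name: int(config, 16)
            for name, config in EVENT_RE.findall(header_text)}


def differing_ids(first, second):
    """Return {name: (id_in_first, id_in_second)} for names whose ids differ."""
    common = sorted(first.keys() & second.keys())
    return {name: (first[name], second[name])
            for name in common if first[name] != second[name]}


# Shortened header lines from report_x86_on_ARM.txt (captured on x86_64).
X86_HEADER = """\
# event : name = sched:sched_switch, type = 2, config = 0x117, config1 = 0x0
# event : name = sched:sched_wakeup, type = 2, config = 0x119, config1 = 0x0
"""
# Shortened header lines from report_ARM_on_x86.txt (captured on armv7l).
ARM_HEADER = """\
# event : name = sched:sched_switch, type = 2, config = 0x37, config1 = 0x0
# event : name = sched:sched_wakeup, type = 2, config = 0x39, config1 = 0x0
"""

x86_ids = tracepoint_ids(X86_HEADER)
arm_ids = tracepoint_ids(ARM_HEADER)

if __name__ == "__main__":
    for name, (x86_id, arm_id) in differing_ids(x86_ids, arm_ids).items():
        print(f"{name}: x86_64={x86_id:#x} armv7l={arm_id:#x}")
```

Feeding it the full header of each perf.data (as printed by 'perf report') gives a quick check of whether two machines agree on tracepoint ids before attempting a cross-machine report or replay.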
end of thread, other threads:[~2012-05-31 8:26 UTC | newest]

Thread overview: 18+ messages
2012-05-15 15:27 Perf record format portability Dmitry Antipov
2012-05-15 15:51 ` Arnaldo Carvalho de Melo
2012-05-16 10:50 ` Dmitry Antipov
2012-05-16 14:59 ` Arnaldo Carvalho de Melo
2012-05-16 15:16 ` Jiri Olsa
2012-05-16 15:50 ` Arnaldo Carvalho de Melo
2012-05-16 16:58 ` Steven Rostedt
2012-05-16 17:48 ` Jiri Olsa
2012-05-16 19:32 ` Steven Rostedt
2012-05-16 19:39 ` Steven Rostedt
2012-05-17 8:51 ` Jiri Olsa
2012-05-16 18:08 ` Arnaldo Carvalho de Melo
2012-05-16 18:17 ` Steven Rostedt
2012-05-17 5:10 ` Dmitry Antipov
2012-05-17 11:48 ` Steven Rostedt
2012-05-18 5:48 ` Dmitry Antipov
2012-05-29 15:10 ` Arnaldo Carvalho de Melo
2012-05-31 8:28 ` Dmitry Antipov