* perf report for .ko files
@ 2010-01-26 0:34 john smith
2010-01-26 19:21 ` Peter Zijlstra
0 siblings, 1 reply; 8+ messages in thread
From: john smith @ 2010-01-26 0:34 UTC (permalink / raw)
To: linux-kernel
How can "perf report" be used to get info on (only) the specific manually loadable modules?
Driver entry points are reported but without full breakdown (on all the calls).
If "--symbols" option is used, most of the symbols are not found.
("--dsos" doesn't seem to help)
John
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: perf report for .ko files 2010-01-26 0:34 perf report for .ko files john smith @ 2010-01-26 19:21 ` Peter Zijlstra 2010-01-26 21:05 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 8+ messages in thread From: Peter Zijlstra @ 2010-01-26 19:21 UTC (permalink / raw) To: john smith; +Cc: linux-kernel, Arnaldo Carvalho de Melo On Mon, 2010-01-25 at 16:34 -0800, john smith wrote: > How can "perf report" be used to get info on (only) the specific manually loadable modules? > Driver entry points are reported but without full breakdown (on all the calls). Since its sampling it could be a function is just not hit, another possibility is inlining of functions, we simply cannot see inlined functions. > If "--symbols" option is used, most of the symbols are not found. > ("--dsos" doesn't seem to help) I'm not a big module user, but I think --dsos should work, if not feel free to send a patch to rectify this situation. --dsos does appear to work on my laptop (where I have iwlagn as a module since it sometimes needs a reload to start working, which is hard when build in). perf report --dsos "[iwlagn]" does indeed give me all iwlagn hits, albeit without symbol information, not sure why that is, /proc/kallsyms does seem to include some iwlagn symbols. Arnaldo might have a clue. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf report for .ko files 2010-01-26 19:21 ` Peter Zijlstra @ 2010-01-26 21:05 ` Arnaldo Carvalho de Melo 2010-01-29 18:55 ` john smith 0 siblings, 1 reply; 8+ messages in thread From: Arnaldo Carvalho de Melo @ 2010-01-26 21:05 UTC (permalink / raw) To: Peter Zijlstra; +Cc: john smith, linux-kernel Em Tue, Jan 26, 2010 at 08:21:55PM +0100, Peter Zijlstra escreveu: > On Mon, 2010-01-25 at 16:34 -0800, john smith wrote: > > How can "perf report" be used to get info on (only) the specific manually loadable modules? > > Driver entry points are reported but without full breakdown (on all the calls). > > Since its sampling it could be a function is just not hit, another > possibility is inlining of functions, we simply cannot see inlined > functions. Well, you could reduce the number of samples collected by asking perf record to include only kernel samples by fiddling with these perf_event_attr fields: exclude_user : 1, /* don't count user */ exclude_kernel : 1, /* ditto kernel */ exclude_hv : 1, /* ditto hypervisor */ exclude_idle : 1, /* don't count when idle */ I.e. setting exclude_user, exclude_hv and exclude_idle to 1, but this requires a patch for tools/perf/builtin-record.c as this is not exposed yet. Something similar to what we have in 'perf top' -K, --hide_kernel_symbols hide kernel symbols -U, --hide_user_symbols hide user symbols And then increase the frequency. > > If "--symbols" option is used, most of the symbols are not found. > > ("--dsos" doesn't seem to help) > > I'm not a big module user, but I think --dsos should work, if not feel > free to send a patch to rectify this situation. --dsos does work for kernel modules, yes: [root@doppio linux-2.6-tip]# perf report --dsos '[e1000e]' # dso: [e1000e] # Samples: 110518812 # # Overhead Command Symbol # ........ ............... ...... # 22.98% init [k] e1000_clean 21.52% mutt [k] e1000_clean 17.95% mutt [k] e1000_intr_msi 16.60% init [k] e1000_intr_msi 11.82% events/0 [k] e1000e_update_stats 1.80% mutt [k] e1000_put_txbuf 1.79% mutt [k] e1000_clean_tx_irq 1.28% init [k] e1000_clean_tx_irq 1.14% init [k] e1000_clean_rx_irq 1.02% sshd [k] e1000_xmit_frame 0.52% init [k] pci_unmap_single 0.43% openvpn [k] e1000_xmit_frame 0.29% sshd [k] e1000_intr_msi 0.18% init [k] e1000_update_itr 0.15% sshd [k] pci_dma_mapping_error 0.15% init [k] pci_dma_mapping_error 0.14% init [k] e1000_alloc_rx_buffers 0.13% sshd [k] e1000_clean 0.11% init [k] e1000_rx_checksum # # (For a higher level overview, try: perf report --sort comm,dso) # [root@doppio linux-2.6-tip]# > --dsos does appear to work on my laptop (where I have iwlagn as a module > since it sometimes needs a reload to start working, which is hard when > build in). > > perf report --dsos "[iwlagn]" yup, exactly like my example above. > does indeed give me all iwlagn hits, albeit without symbol information, > not sure why that is, /proc/kallsyms does seem to include some iwlagn > symbols. but not all, in verbose mode we can see from where it gets the module symbol information: [root@doppio linux-2.6-tip]# perf report --verbose --dsos '[e1000e]' | head -10 Looking at the vmlinux_path (5 entries long) No build_id in vmlinux, ignoring it No build_id in /boot/vmlinux, ignoring it No build_id in /boot/vmlinux-2.6.33-rc5-tip+, ignoring it Using /lib/modules/2.6.33-rc5-tip+/build/vmlinux for symbols # dso: [e1000e] # Samples: 110518812 # # Overhead Command Symbol # ........ ............... ...... # 22.98% init 0x000000000001088c B [k] e1000_clean 21.52% mutt 0x000000000001088c B [k] e1000_clean 17.95% mutt 0x000000000000db5c B [k] e1000_intr_msi 16.60% init 0x000000000000db5c B [k] e1000_intr_msi [root@doppio linux-2.6-tip]# In my case, all came from the buildid-cache, the ' B ' above: from tools/perf/util/symbol.c (should be in the man page, consider submitting a patch for that :)): char dso__symtab_origin(const struct dso *self) { static const char origin[] = { [DSO__ORIG_KERNEL] = 'k', [DSO__ORIG_JAVA_JIT] = 'j', [DSO__ORIG_BUILD_ID_CACHE] = 'B', [DSO__ORIG_FEDORA] = 'f', [DSO__ORIG_UBUNTU] = 'u', [DSO__ORIG_BUILDID] = 'b', [DSO__ORIG_DSO] = 'd', [DSO__ORIG_KMODULE] = 'K', }; if (self == NULL || self->origin == DSO__ORIG_NOT_FOUND) return '!'; return origin[self->origin]; } For further info: Use 'perf buildid-list' to get the build-id for the kernel module used at 'perf record' time. [root@doppio linux-2.6-tip]# perf buildid-list | grep e1000e 05c45e33bc92bf4bbc71ddde4128a9f5f1463fcb /lib/modules/2.6.33-rc5-tip+/kernel/drivers/net/e1000e/e1000e.ko Then look it up in the build-id cache: [root@doppio linux-2.6-tip]# l ~/.debug/.build-id/05/c45e33bc92bf4bbc71ddde4128a9f5f1463fcb lrwxrwxrwx 1 root root 110 2010-01-22 12:12 /root/.debug/.build-id/05/c45e33bc92bf4bbc71ddde4128a9f5f1463fcb -> ../../lib/modules/2.6.33-rc5-tip+/kernel/drivers/net/e1000e/e1000e.ko/05c45e33bc92bf4bbc71ddde4128a9f5f1463fcb [root@doppio linux-2.6-tip]# So indeed, at 'perf record' time we found that the file that matched the loaded e1000e.ko file was: /lib/modules/2.6.33-rc5-tip+/kernel/drivers/net/e1000e/e1000e.ko by looking at its build-id ELF note and at: /sys/module/e1000e/sections/.note.gnu.build-id for the build-id in the e1000e.ko loaded at that time. > Arnaldo might have a clue. See above :-) - Arnaldo ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf report for .ko files 2010-01-26 21:05 ` Arnaldo Carvalho de Melo @ 2010-01-29 18:55 ` john smith 2010-01-29 19:15 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 8+ messages in thread From: john smith @ 2010-01-29 18:55 UTC (permalink / raw) To: Peter Zijlstra, Arnaldo Carvalho de Melo; +Cc: linux-kernel Problem1: > Well, you could reduce the number of samples collected by > asking perf record to > include only kernel samples by fiddling with these > perf_event_attr fields: > > > exclude_user : 1, /* don't count user */ > exclude_kernel : 1, /* ditto kernel */ > exclude_hv : 1, /* ditto hypervisor */ > exclude_idle : 1, /* don't count when idle */ > > I.e. setting exclude_user, exclude_hv and exclude_idle to > 1, but this requires > a patch for tools/perf/builtin-record.c as this is not > exposed yet. I did setup attr->exclude_hv, exclude_idle, exclude_user = 1 in create_counter() just before "try_again:" label, no difference in results, on a 2.6.31 (too old? "perf buildid-list" is not available) kernel version: # perf record -f -e cycles fio fio_script (for more than a minute, I increased the sample frequency with different -c values, some locked the machine - expectedly) and # perf report --verbose --dso=/my_module_path/my_module.ko (--verbose doesn't provide any more info) return the same "# Samples: 0": " # dso: /my_module_path/my_module.ko # Samples: 0 #... " > [root@doppio linux-2.6-tip]# perf report --dsos '[e1000e]' > # dso: [e1000e] > # Samples: 110518812 'e1000e' module may be be sampled more often than the scsi modules but ... > > not sure why that is, /proc/kallsyms does seem to > > include some iwlagn symbols. /proc/kallsyms includes all syms I need, so is a problem not finding the symbols, does this (Samples: 0) make sens? "perf report --sort comm,dso,symbol" report has my module symbols. Problem2: If I try "perf annotate -l -k /my_module_path/my_module.ko my_symbol" has all the percentages "0.00" where I would expect a % breakdown (considering the total function counter 100%), even if the function represents a small overall counter % compared to the others, any ideas? John ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf report for .ko files 2010-01-29 18:55 ` john smith @ 2010-01-29 19:15 ` Arnaldo Carvalho de Melo 2010-01-29 21:59 ` john smith 2010-01-30 6:45 ` Mike Galbraith 0 siblings, 2 replies; 8+ messages in thread From: Arnaldo Carvalho de Melo @ 2010-01-29 19:15 UTC (permalink / raw) To: john smith; +Cc: Peter Zijlstra, linux-kernel Em Fri, Jan 29, 2010 at 10:55:13AM -0800, john smith escreveu: > Problem1: > > > Well, you could reduce the number of samples collected by > > asking perf record to > > include only kernel samples by fiddling with these > > perf_event_attr fields: > > > > exclude_user : 1, /* don't count user */ > > exclude_kernel : 1, /* ditto kernel */ > > exclude_hv : 1, /* ditto hypervisor */ > > exclude_idle : 1, /* don't count when idle */ > > > > I.e. setting exclude_user, exclude_hv and exclude_idle to > > 1, but this requires > > a patch for tools/perf/builtin-record.c as this is not > > exposed yet. > I did setup attr->exclude_hv, exclude_idle, exclude_user = 1 in > create_counter() just before "try_again:" label, no difference in > results, on a 2.6.31 (too old? "perf buildid-list" is not available) > kernel version: > # perf record -f -e cycles fio fio_script > (for more than a minute, I increased the sample frequency with > different -c values, some locked the machine - expectedly) > and > # perf report --verbose --dso=/my_module_path/my_module.ko > (--verbose doesn't provide any more info) This isn't going to work, you need to do: perf report --verbose --dso='[my_module]' I.e. as it appear on /proc/modules + brackets. Arguably we need to get it to work also with both 'my_module.ko' and the full path. > return the same "# Samples: 0": > > " > # dso: /my_module_path/my_module.ko > # Samples: 0 > #... > " > > > [root@doppio linux-2.6-tip]# perf report --dsos '[e1000e]' > > # dso: [e1000e] > > # Samples: 110518812 > > 'e1000e' module may be be sampled more often than the scsi modules but ... > > > > not sure why that is, /proc/kallsyms does seem to > > > include some iwlagn symbols. > /proc/kallsyms includes all syms I need, so is a problem not finding > the symbols, does this (Samples: 0) make sens? > "perf report --sort comm,dso,symbol" report has my module symbols. See above on how to specify the module name, its just that it is finding no samples for the module name as specified, it seems. > Problem2: > If I try "perf annotate -l -k /my_module_path/my_module.ko my_symbol" > has all the percentages "0.00" where I would expect a % breakdown > (considering the total function counter 100%), even if the function > represents a small overall counter % compared to the others, any > ideas? I actually never tried doing annotation with a module, will try. - Arnaldo ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf report for .ko files 2010-01-29 19:15 ` Arnaldo Carvalho de Melo @ 2010-01-29 21:59 ` john smith 2010-01-30 6:45 ` Mike Galbraith 1 sibling, 0 replies; 8+ messages in thread From: john smith @ 2010-01-29 21:59 UTC (permalink / raw) To: Arnaldo Carvalho de Melo; +Cc: Peter Zijlstra, linux-kernel > > I did setup attr->exclude_hv, exclude_idle, > exclude_user = 1 in > > create_counter() just before "try_again:" label, no > difference in > > results, on a 2.6.31 (too old? "perf buildid-list" is > not available) > > kernel version: > > > # perf record -f -e cycles fio fio_script > > > (for more than a minute, I increased the sample > frequency with > > different -c values, some locked the machine - > expectedly) > > and > > # perf report --verbose > --dso=/my_module_path/my_module.ko > > (--verbose doesn't provide any more info) > > This isn't going to work, you need to do: > > perf report --verbose --dso='[my_module]' > > I.e. as it appear on /proc/modules + brackets. Arguably we > need to get > it to work also with both 'my_module.ko' and the full > path. I've tried: "perf report --verbose --dso='[my_module]'" and also: "perf report --verbose --dso='[sg]'" same result as before. New question: Is this the minimal set of api that I can use from within the kernel (for specific module sections of interest) to read the counters? { int perf_event_release_kernel(struct perf_event *event); struct perf_event *perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu); u64 perf_event_read_value(struct perf_event *event); } Any good example of useage of these api's (struct initialization)? John ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf report for .ko files 2010-01-29 19:15 ` Arnaldo Carvalho de Melo 2010-01-29 21:59 ` john smith @ 2010-01-30 6:45 ` Mike Galbraith 2010-02-03 17:28 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 8+ messages in thread From: Mike Galbraith @ 2010-01-30 6:45 UTC (permalink / raw) To: Arnaldo Carvalho de Melo; +Cc: john smith, Peter Zijlstra, linux-kernel On Fri, 2010-01-29 at 17:15 -0200, Arnaldo Carvalho de Melo wrote: > I actually never tried doing annotation with a module, will try. Seems it went south again in both .git and .tip. 32 works. ed52ce2e3c33dc7626a40fa2da766d1a6460e543 is the first bad commit commit ed52ce2e3c33dc7626a40fa2da766d1a6460e543 Author: Arnaldo Carvalho de Melo <acme@redhat.com> Date: Mon Oct 19 17:17:57 2009 -0200 perf tools: Add ->unmap_ip operation to struct map We need this because we get section relative addresses when reading the symtabs, but when a tool like 'perf annotate' needs to match these address to what 'objdump -dS' produces we need the address + section back again. So in annotate now we look at the 'struct hist_entry' instances (that weren't really being used) so that we iterate only over the symbols that had some hit and get the map where that particular hit happened so that we can get the right address to match with annotate. Verified that at least: perf annotate mmap_read_counter # Uses the ~/bin/perf binary perf annotate --vmlinux /home/acme/git/build/perf/vmlinux intel_pmu_enable_all on a 'perf record perf top' session seems to work. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1255979877-12533-1-git-send-email-acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: perf report for .ko files 2010-01-30 6:45 ` Mike Galbraith @ 2010-02-03 17:28 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 8+ messages in thread From: Arnaldo Carvalho de Melo @ 2010-02-03 17:28 UTC (permalink / raw) To: Mike Galbraith; +Cc: john smith, Peter Zijlstra, linux-kernel Em Sat, Jan 30, 2010 at 07:45:54AM +0100, Mike Galbraith escreveu: > On Fri, 2010-01-29 at 17:15 -0200, Arnaldo Carvalho de Melo wrote: > > > I actually never tried doing annotation with a module, will try. > > Seems it went south again in both .git and .tip. 32 works. We need to read the ELF header, notice its a relocatable file and not apply the unmap_ip functions, doing a patch now. - Arnaldo ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-02-03 17:28 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-26 0:34 perf report for .ko files john smith 2010-01-26 19:21 ` Peter Zijlstra 2010-01-26 21:05 ` Arnaldo Carvalho de Melo 2010-01-29 18:55 ` john smith 2010-01-29 19:15 ` Arnaldo Carvalho de Melo 2010-01-29 21:59 ` john smith 2010-01-30 6:45 ` Mike Galbraith 2010-02-03 17:28 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox