* [PATCH] perf probe: Add fastpath to do lookup by function name
@ 2011-03-24 15:38 Lin Ming
  2011-03-24  7:58 ` Ingo Molnar
  2011-03-24  9:08 ` Masami Hiramatsu
  0 siblings, 2 replies; 11+ messages in thread
From: Lin Ming @ 2011-03-24 15:38 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Masami Hiramatsu
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

The vmlinux file may have thousands of CUs.
We can lookup function name from .debug_pubnames section
to avoid the slow loop on CUs.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 tools/perf/util/probe-finder.c |   38 ++++++++++++++++++++++++++++++++++++++
 tools/perf/util/probe-finder.h |    1 +
 2 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 194f9e2..b2034c2 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1876,6 +1876,30 @@ static int find_line_range_by_func(struct line_finder *lf)
 	return param.retval;
 }
 
+static int pubname_search_cb(Dwarf *dbg, Dwarf_Global *gl, void *data)
+{
+	struct line_finder *lf = data;
+	struct line_range *lr = lf->lr;
+
+	if (dwarf_offdie(dbg, gl->die_offset, &lf->sp_die)) {
+		if (dwarf_tag(&lf->sp_die) != DW_TAG_subprogram)
+			return DWARF_CB_OK;
+
+		if (die_compare_name(&lf->sp_die, lr->function)) {
+			if (!dwarf_offdie(dbg, gl->cu_offset, &lf->cu_die))
+				return DWARF_CB_OK;
+
+			if (lr->file && !cu_find_realpath(&lf->cu_die, lr->file))
+				return DWARF_CB_OK;
+
+			lf->found = 1;
+			return DWARF_CB_ABORT;
+		}
+	}
+
+	return DWARF_CB_OK;
+}
+
 int find_line_range(int fd, struct line_range *lr)
 {
 	struct line_finder lf = {.lr = lr, .found = 0};
@@ -1895,6 +1919,19 @@ int find_line_range(int fd, struct line_range *lr)
 		return -EBADF;
 	}
 
+	/* Fastpath: lookup by function name from .debug_pubnames section */
+	if (lr->function) {
+		struct dwarf_callback_param param = {.data = (void *)&lf, .retval = 0};
+
+		dwarf_getpubnames(dbg, pubname_search_cb, &lf, 0);
+		if (lf.found) {
+			lf.found = 0;
+			line_range_search_cb(&lf.sp_die, &param);
+			if (lf.found)
+				goto found;
+		}
+	}
+
 	/* Loop on CUs (Compilation Unit) */
 	while (!lf.found && ret >= 0) {
 		if (dwarf_nextcu(dbg, off, &noff, &cuhl, NULL, NULL, NULL) != 0)
@@ -1923,6 +1960,7 @@ int find_line_range(int fd, struct line_range *lr)
 		off = noff;
 	}
 
+found:
 	/* Store comp_dir */
 	if (lf.found) {
 		comp_dir = cu_get_comp_dir(&lf.cu_die);
diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
index beaefc3..4bc56a4 100644
--- a/tools/perf/util/probe-finder.h
+++ b/tools/perf/util/probe-finder.h
@@ -83,6 +83,7 @@ struct line_finder {
 	int			lno_s;		/* Start line number */
 	int			lno_e;		/* End line number */
 	Dwarf_Die		cu_die;		/* Current CU */
+	Dwarf_Die		sp_die;
 	int			found;
 };
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread
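The control flow of the patch above (try a name index first, fall back to the full CU walk) can be sketched without libdw. This is only a toy model under stated assumptions: every `toy_*` type and function below is hypothetical and exists in neither perf nor elfutils. It assumes, as is generally the case for .debug_pubnames, that the index lists only global names, so a static function is found only by the slow path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for DWARF data; not perf or libdw types. */
struct toy_func { const char *name; int line; };
struct toy_cu { const struct toy_func *funcs; size_t nfuncs; };
struct toy_pubname { const char *name; size_t cu_idx; size_t func_idx; };

struct toy_debuginfo {
	const struct toy_cu *cus;
	size_t ncus;
	const struct toy_pubname *pubnames;	/* name index, may be empty */
	size_t npubnames;
};

/* Slow path: walk every CU and every function inside it. */
const struct toy_func *toy_find_slow(const struct toy_debuginfo *di,
				     const char *name)
{
	for (size_t i = 0; i < di->ncus; i++)
		for (size_t j = 0; j < di->cus[i].nfuncs; j++)
			if (strcmp(di->cus[i].funcs[j].name, name) == 0)
				return &di->cus[i].funcs[j];
	return NULL;
}

/* Fast path: scan the name index and jump straight to the entry. */
const struct toy_func *toy_find_fast(const struct toy_debuginfo *di,
				     const char *name)
{
	for (size_t i = 0; i < di->npubnames; i++) {
		const struct toy_pubname *pn = &di->pubnames[i];

		if (strcmp(pn->name, name) != 0)
			continue;
		/* Skip index entries that point outside the CU table. */
		if (pn->cu_idx >= di->ncus ||
		    pn->func_idx >= di->cus[pn->cu_idx].nfuncs)
			continue;
		return &di->cus[pn->cu_idx].funcs[pn->func_idx];
	}
	return NULL;
}

/* Try the index first; fall back to the full walk when it misses. */
const struct toy_func *toy_find_func(const struct toy_debuginfo *di,
				     const char *name)
{
	const struct toy_func *f;

	assert(di != NULL && name != NULL);
	f = toy_find_fast(di, name);
	return f ? f : toy_find_slow(di, name);
}
```

Keeping the slow walk as a fallback means a lookup that misses the index returns the same result as before, only later; the patch makes the same choice by falling through to the CU loop when the fastpath does not set `lf.found`.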
* Re: [PATCH] perf probe: Add fastpath to do lookup by function name
  2011-03-24 15:38 [PATCH] perf probe: Add fastpath to do lookup by function name Lin Ming
@ 2011-03-24  7:58 ` Ingo Molnar
  2011-03-24  8:38   ` Lin Ming
  2011-03-24  9:08 ` Masami Hiramatsu
  1 sibling, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2011-03-24  7:58 UTC (permalink / raw)
  To: Lin Ming
  Cc: Arnaldo Carvalho de Melo, Masami Hiramatsu, Peter Zijlstra, linux-kernel

* Lin Ming <ming.m.lin@intel.com> wrote:

> The vmlinux file may have thousands of CUs.
> We can lookup function name from .debug_pubnames section
> to avoid the slow loop on CUs.

Mind including before/after perf stat --repeat 10 results in the changelog, of
an affected command which got faster?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] perf probe: Add fastpath to do lookup by function name
  2011-03-24  7:58 ` Ingo Molnar
@ 2011-03-24  8:38   ` Lin Ming
  2011-03-24  8:47     ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Lin Ming @ 2011-03-24  8:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, Masami Hiramatsu, Peter Zijlstra, linux-kernel

On Thu, 2011-03-24 at 15:58 +0800, Ingo Molnar wrote:
> * Lin Ming <ming.m.lin@intel.com> wrote:
>
> > The vmlinux file may have thousands of CUs.
> > We can lookup function name from .debug_pubnames section
> > to avoid the slow loop on CUs.
>
> Mind including before/after perf stat --repeat 10 results in the changelog, of
> an affected command which got faster?

From 611a823a34e655d47a80f92994b004391e2b244c Mon Sep 17 00:00:00 2001
From: Lin Ming <ming.m.lin@intel.com>
Date: Thu, 24 Mar 2011 23:22:24 +0800
Subject: [PATCH] perf probe: Add fastpath to do lookup by function name

The vmlinux file may have thousands of CUs.
We can lookup function name from .debug_pubnames section
to avoid the slow loop on CUs.

./perf stat -r 10 -- ./perf probe -k /home/mlin/vmlinux \
			-s /home/mlin/linux-2.6 \
			--line csum_partial_copy_to_user > tmp.log

before patch applied
=====================

     364.535892  task-clock-msecs         #      0.997 CPUs
              0  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
         29,993  page-faults              #      0.082 M/sec
    865,862,109  cycles                   #   2375.245 M/sec
  1,255,259,630  instructions             #      1.450 IPC
    252,400,884  branches                 #    692.390 M/sec
      3,429,376  branch-misses            #      1.359 %
      1,386,990  cache-references         #      3.805 M/sec
        687,188  cache-misses             #      1.885 M/sec

    0.365792170  seconds time elapsed

after patch applied
=====================

      89.896405  task-clock-msecs         #      0.991 CPUs
              1  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
         10,145  page-faults              #      0.113 M/sec
    214,553,875  cycles                   #   2386.679 M/sec
    226,915,559  instructions             #      1.058 IPC
     44,536,614  branches                 #    495.422 M/sec
        613,074  branch-misses            #      1.377 %
        860,787  cache-references         #      9.575 M/sec
        442,380  cache-misses             #      4.921 M/sec

    0.090716032  seconds time elapsed

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 tools/perf/util/probe-finder.c |   38 ++++++++++++++++++++++++++++++++++++++
 tools/perf/util/probe-finder.h |    1 +
 2 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 194f9e2..b2034c2 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1876,6 +1876,30 @@ static int find_line_range_by_func(struct line_finder *lf)
 	return param.retval;
 }
 
+static int pubname_search_cb(Dwarf *dbg, Dwarf_Global *gl, void *data)
+{
+	struct line_finder *lf = data;
+	struct line_range *lr = lf->lr;
+
+	if (dwarf_offdie(dbg, gl->die_offset, &lf->sp_die)) {
+		if (dwarf_tag(&lf->sp_die) != DW_TAG_subprogram)
+			return DWARF_CB_OK;
+
+		if (die_compare_name(&lf->sp_die, lr->function)) {
+			if (!dwarf_offdie(dbg, gl->cu_offset, &lf->cu_die))
+				return DWARF_CB_OK;
+
+			if (lr->file && !cu_find_realpath(&lf->cu_die, lr->file))
+				return DWARF_CB_OK;
+
+			lf->found = 1;
+			return DWARF_CB_ABORT;
+		}
+	}
+
+	return DWARF_CB_OK;
+}
+
 int find_line_range(int fd, struct line_range *lr)
 {
 	struct line_finder lf = {.lr = lr, .found = 0};
@@ -1895,6 +1919,19 @@ int find_line_range(int fd, struct line_range *lr)
 		return -EBADF;
 	}
 
+	/* Fastpath: lookup by function name from .debug_pubnames section */
+	if (lr->function) {
+		struct dwarf_callback_param param = {.data = (void *)&lf, .retval = 0};
+
+		dwarf_getpubnames(dbg, pubname_search_cb, &lf, 0);
+		if (lf.found) {
+			lf.found = 0;
+			line_range_search_cb(&lf.sp_die, &param);
+			if (lf.found)
+				goto found;
+		}
+	}
+
 	/* Loop on CUs (Compilation Unit) */
 	while (!lf.found && ret >= 0) {
 		if (dwarf_nextcu(dbg, off, &noff, &cuhl, NULL, NULL, NULL) != 0)
@@ -1923,6 +1960,7 @@ int find_line_range(int fd, struct line_range *lr)
 		off = noff;
 	}
 
+found:
 	/* Store comp_dir */
 	if (lf.found) {
 		comp_dir = cu_get_comp_dir(&lf.cu_die);
diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
index beaefc3..4bc56a4 100644
--- a/tools/perf/util/probe-finder.h
+++ b/tools/perf/util/probe-finder.h
@@ -83,6 +83,7 @@ struct line_finder {
 	int			lno_s;		/* Start line number */
 	int			lno_e;		/* End line number */
 	Dwarf_Die		cu_die;		/* Current CU */
+	Dwarf_Die		sp_die;
 	int			found;
 };
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH] perf probe: Add fastpath to do lookup by function name
  2011-03-24  8:38   ` Lin Ming
@ 2011-03-24  8:47     ` Ingo Molnar
  0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2011-03-24  8:47 UTC (permalink / raw)
  To: Lin Ming
  Cc: Arnaldo Carvalho de Melo, Masami Hiramatsu, Peter Zijlstra, linux-kernel

* Lin Ming <ming.m.lin@intel.com> wrote:

> ./perf stat -r 10 -- ./perf probe -k /home/mlin/vmlinux \
> 			-s /home/mlin/linux-2.6 \
> 			--line csum_partial_copy_to_user > tmp.log
>
> before patch applied
> =====================
>      364.535892  task-clock-msecs         #      0.997 CPUs
>     865,862,109  cycles                   #   2375.245 M/sec
>   1,255,259,630  instructions             #      1.450 IPC

> after patch applied
> =====================
>       89.896405  task-clock-msecs         #      0.991 CPUs
>     214,553,875  cycles                   #   2386.679 M/sec
>     226,915,559  instructions             #      1.058 IPC

That's a very nice speedup :-)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread
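To put a number on the speedup, the before/after counters quoted in this thread can be divided directly. A minimal sketch; the `speedup` helper is an illustrative name, not part of perf:

```c
#include <assert.h>

/* Ratio of a "before" measurement to an "after" measurement. */
double speedup(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}
```

With the figures posted above, `speedup(364.535892, 89.896405)` gives about 4.06 for task-clock, `speedup(865862109.0, 214553875.0)` about 4.04 for cycles, and `speedup(1255259630.0, 226915559.0)` about 5.53 for instructions. Instructions drop more than cycles do, which is consistent with the IPC falling from 1.450 to 1.058.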
* Re: [PATCH] perf probe: Add fastpath to do lookup by function name
  2011-03-24 15:38 [PATCH] perf probe: Add fastpath to do lookup by function name Lin Ming
  2011-03-24  7:58 ` Ingo Molnar
@ 2011-03-24  9:08 ` Masami Hiramatsu
  2011-03-24 13:47   ` Lin Ming
  1 sibling, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2011-03-24  9:08 UTC (permalink / raw)
  To: Lin Ming
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, linux-kernel

(2011/03/25 0:38), Lin Ming wrote:
> The vmlinux file may have thousands of CUs.
> We can lookup function name from .debug_pubnames section
> to avoid the slow loop on CUs.
>
> Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> ---
>  tools/perf/util/probe-finder.c |   38 ++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/probe-finder.h |    1 +
>  2 files changed, 39 insertions(+), 0 deletions(-)
>
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index 194f9e2..b2034c2 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -1876,6 +1876,30 @@ static int find_line_range_by_func(struct line_finder *lf)
>  	return param.retval;
>  }
>
> +static int pubname_search_cb(Dwarf *dbg, Dwarf_Global *gl, void *data)
> +{
> +	struct line_finder *lf = data;
> +	struct line_range *lr = lf->lr;
> +
> +	if (dwarf_offdie(dbg, gl->die_offset, &lf->sp_die)) {
> +		if (dwarf_tag(&lf->sp_die) != DW_TAG_subprogram)
> +			return DWARF_CB_OK;
> +
> +		if (die_compare_name(&lf->sp_die, lr->function)) {
> +			if (!dwarf_offdie(dbg, gl->cu_offset, &lf->cu_die))
> +				return DWARF_CB_OK;
> +

Just one comment.
Could you ensure that the decl_file of sp_die matches lr->file (by strtailcmp) here?

Other parts look good to me:)

Thanks!

> +			if (lr->file && !cu_find_realpath(&lf->cu_die, lr->file))
> +				return DWARF_CB_OK;
> +
> +			lf->found = 1;
> +			return DWARF_CB_ABORT;
> +		}
> +	}
> +
> +	return DWARF_CB_OK;
> +}
> +
>  int find_line_range(int fd, struct line_range *lr)
>  {
>  	struct line_finder lf = {.lr = lr, .found = 0};
> @@ -1895,6 +1919,19 @@ int find_line_range(int fd, struct line_range *lr)
>  		return -EBADF;
>  	}
>
> +	/* Fastpath: lookup by function name from .debug_pubnames section */
> +	if (lr->function) {
> +		struct dwarf_callback_param param = {.data = (void *)&lf, .retval = 0};
> +
> +		dwarf_getpubnames(dbg, pubname_search_cb, &lf, 0);
> +		if (lf.found) {
> +			lf.found = 0;
> +			line_range_search_cb(&lf.sp_die, &param);
> +			if (lf.found)
> +				goto found;
> +		}
> +	}
> +
>  	/* Loop on CUs (Compilation Unit) */
>  	while (!lf.found && ret >= 0) {
>  		if (dwarf_nextcu(dbg, off, &noff, &cuhl, NULL, NULL, NULL) != 0)
> @@ -1923,6 +1960,7 @@ int find_line_range(int fd, struct line_range *lr)
>  		off = noff;
>  	}
>
> +found:
>  	/* Store comp_dir */
>  	if (lf.found) {
>  		comp_dir = cu_get_comp_dir(&lf.cu_die);
> diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
> index beaefc3..4bc56a4 100644
> --- a/tools/perf/util/probe-finder.h
> +++ b/tools/perf/util/probe-finder.h
> @@ -83,6 +83,7 @@ struct line_finder {
>  	int			lno_s;		/* Start line number */
>  	int			lno_e;		/* End line number */
>  	Dwarf_Die		cu_die;		/* Current CU */
> +	Dwarf_Die		sp_die;
>  	int			found;
>  };
>

-- 
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH] perf probe: Add fastpath to do lookup by function name
  2011-03-24  9:08 ` Masami Hiramatsu
@ 2011-03-24 13:47   ` Lin Ming
  2011-03-24 14:09     ` [PATCH v2 -tip] " Lin Ming
  0 siblings, 1 reply; 11+ messages in thread
From: Lin Ming @ 2011-03-24 13:47 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, linux-kernel

On Thu, 2011-03-24 at 17:08 +0800, Masami Hiramatsu wrote:
> (2011/03/25 0:38), Lin Ming wrote:
> > The vmlinux file may have thousands of CUs.
> > We can lookup function name from .debug_pubnames section
> > to avoid the slow loop on CUs.
> >
> > Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> > ---
> >  tools/perf/util/probe-finder.c |   38 ++++++++++++++++++++++++++++++++++++++
> >  tools/perf/util/probe-finder.h |    1 +
> >  2 files changed, 39 insertions(+), 0 deletions(-)
> >
> > diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> > index 194f9e2..b2034c2 100644
> > --- a/tools/perf/util/probe-finder.c
> > +++ b/tools/perf/util/probe-finder.c
> > @@ -1876,6 +1876,30 @@ static int find_line_range_by_func(struct line_finder *lf)
> >  	return param.retval;
> >  }
> >
> > +static int pubname_search_cb(Dwarf *dbg, Dwarf_Global *gl, void *data)
> > +{
> > +	struct line_finder *lf = data;
> > +	struct line_range *lr = lf->lr;
> > +
> > +	if (dwarf_offdie(dbg, gl->die_offset, &lf->sp_die)) {
> > +		if (dwarf_tag(&lf->sp_die) != DW_TAG_subprogram)
> > +			return DWARF_CB_OK;
> > +
> > +		if (die_compare_name(&lf->sp_die, lr->function)) {
> > +			if (!dwarf_offdie(dbg, gl->cu_offset, &lf->cu_die))
> > +				return DWARF_CB_OK;
> > +
>
> Just one comment.
> Could you ensure that the decl_file of sp_die matches lr->file (by strtailcmp) here?

OK, so the file name comparison with cu_find_realpath(..) can be removed, as below.
This makes the lookup a bit faster again.

Thanks, I'll post a new version with the changes below.

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index b2034c2..38e4a05 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1880,6 +1880,7 @@ static int pubname_search_cb(Dwarf *dbg, Dwarf_Global *gl, void *data)
 {
 	struct line_finder *lf = data;
 	struct line_range *lr = lf->lr;
+	const char *file;
 
 	if (dwarf_offdie(dbg, gl->die_offset, &lf->sp_die)) {
 		if (dwarf_tag(&lf->sp_die) != DW_TAG_subprogram)
@@ -1889,8 +1890,12 @@ static int pubname_search_cb(Dwarf *dbg, Dwarf_Global *gl, void *data)
 			if (!dwarf_offdie(dbg, gl->cu_offset, &lf->cu_die))
 				return DWARF_CB_OK;
 
-			if (lr->file && !cu_find_realpath(&lf->cu_die, lr->file))
-				return DWARF_CB_OK;
+			if (lr->file) {
+				file = dwarf_decl_file(&lf->sp_die);
+
+				if (file && strtailcmp(file, lr->file))
+					return DWARF_CB_OK;
+			}
 
 			lf->found = 1;
 			return DWARF_CB_ABORT;

^ permalink raw reply related	[flat|nested] 11+ messages in thread
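The tail comparison requested in this exchange can be sketched as follows. This is an approximation written for illustration, not a copy of perf's strtailcmp(); `tail_cmp`, `decl_file_matches`, and the example paths are hypothetical. It assumes the intended semantics are "return 0 when the shorter string is a suffix of the longer one", and it treats a missing declaration file as a match, mirroring the `file &&` guard in the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare two strings from their ends. Returns 0 when the shorter
 * string is a suffix of the longer one (or both are equal), and a
 * non-zero difference at the first mismatching character otherwise.
 */
int tail_cmp(const char *s1, const char *s2)
{
	size_t i1, i2;

	assert(s1 != NULL && s2 != NULL);
	i1 = strlen(s1);
	i2 = strlen(s2);
	while (i1 > 0 && i2 > 0) {
		i1--;
		i2--;
		if (s1[i1] != s2[i2])
			return (unsigned char)s1[i1] - (unsigned char)s2[i2];
	}
	return 0;
}

/*
 * NULL-tolerant wrapper: no requested file means "accept anything";
 * an unknown declaration file is not rejected either.
 */
int decl_file_matches(const char *wanted, const char *decl_file)
{
	if (wanted == NULL || decl_file == NULL)
		return 1;
	return tail_cmp(wanted, decl_file) == 0;
}
```

For example, `decl_file_matches("lib/checksum.c", "/home/mlin/linux-2.6/lib/checksum.c")` is true, so a user can pass a relative path while the debug info records an absolute one. Two caveats of a sketch like this: a pure suffix test ignores path boundaries (`"sum.c"` also matches `"checksum.c"`), and any caller that passes a possibly-NULL declaration file straight into the comparison, without a guard of this kind, would need its own NULL check.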
* [PATCH v2 -tip] perf probe: Add fastpath to do lookup by function name
  2011-03-24 13:47   ` Lin Ming
@ 2011-03-24 14:09     ` Lin Ming
  2011-03-25  1:14       ` Masami Hiramatsu
  0 siblings, 1 reply; 11+ messages in thread
From: Lin Ming @ 2011-03-24 14:09 UTC (permalink / raw)
  To: Masami Hiramatsu, Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

v2 -> v1:
- Don't compare file names with cu_find_realpath(...), instead, compare them
  with the name returned by dwarf_decl_file(sp_die)

The vmlinux file may have thousands of CUs.
We can lookup function name from .debug_pubnames section
to avoid the slow loop on CUs.

./perf stat -r 10 -- ./perf probe -k /home/mlin/vmlinux \
			-s /home/mlin/linux-2.6 \
			--line csum_partial_copy_to_user > tmp.log

before patch applied
=====================

     364.535892  task-clock-msecs         #      0.997 CPUs
              0  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
         29,993  page-faults              #      0.082 M/sec
    865,862,109  cycles                   #   2375.245 M/sec
  1,255,259,630  instructions             #      1.450 IPC
    252,400,884  branches                 #    692.390 M/sec
      3,429,376  branch-misses            #      1.359 %
      1,386,990  cache-references         #      3.805 M/sec
        687,188  cache-misses             #      1.885 M/sec

    0.365792170  seconds time elapsed

after patch applied
=====================

      89.896405  task-clock-msecs         #      0.991 CPUs
              1  context-switches         #      0.000 M/sec
              0  CPU-migrations           #      0.000 M/sec
         10,145  page-faults              #      0.113 M/sec
    214,553,875  cycles                   #   2386.679 M/sec
    226,915,559  instructions             #      1.058 IPC
     44,536,614  branches                 #    495.422 M/sec
        613,074  branch-misses            #      1.377 %
        860,787  cache-references         #      9.575 M/sec
        442,380  cache-misses             #      4.921 M/sec

    0.090716032  seconds time elapsed

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
---
 tools/perf/util/probe-finder.c |   39 +++++++++++++++++++++++++++++++++++++++
 tools/perf/util/probe-finder.h |    1 +
 2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 194f9e2..5cf044c 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1876,6 +1876,31 @@ static int find_line_range_by_func(struct line_finder *lf)
 	return param.retval;
 }
 
+static int pubname_search_cb(Dwarf *dbg, Dwarf_Global *gl, void *data)
+{
+	struct line_finder *lf = data;
+	struct line_range *lr = lf->lr;
+
+	if (dwarf_offdie(dbg, gl->die_offset, &lf->sp_die)) {
+		if (dwarf_tag(&lf->sp_die) != DW_TAG_subprogram)
+			return DWARF_CB_OK;
+
+		if (die_compare_name(&lf->sp_die, lr->function)) {
+			if (!dwarf_offdie(dbg, gl->cu_offset, &lf->cu_die))
+				return DWARF_CB_OK;
+
+			if (lr->file &&
+			    strtailcmp(lr->file, dwarf_decl_file(&lf->sp_die)))
+				return DWARF_CB_OK;
+
+			lf->found = 1;
+			return DWARF_CB_ABORT;
+		}
+	}
+
+	return DWARF_CB_OK;
+}
+
 int find_line_range(int fd, struct line_range *lr)
 {
 	struct line_finder lf = {.lr = lr, .found = 0};
@@ -1895,6 +1920,19 @@ int find_line_range(int fd, struct line_range *lr)
 		return -EBADF;
 	}
 
+	/* Fastpath: lookup by function name from .debug_pubnames section */
+	if (lr->function) {
+		struct dwarf_callback_param param = {.data = (void *)&lf, .retval = 0};
+
+		dwarf_getpubnames(dbg, pubname_search_cb, &lf, 0);
+		if (lf.found) {
+			lf.found = 0;
+			line_range_search_cb(&lf.sp_die, &param);
+			if (lf.found)
+				goto found;
+		}
+	}
+
 	/* Loop on CUs (Compilation Unit) */
 	while (!lf.found && ret >= 0) {
 		if (dwarf_nextcu(dbg, off, &noff, &cuhl, NULL, NULL, NULL) != 0)
@@ -1923,6 +1961,7 @@ int find_line_range(int fd, struct line_range *lr)
 		off = noff;
 	}
 
+found:
 	/* Store comp_dir */
 	if (lf.found) {
 		comp_dir = cu_get_comp_dir(&lf.cu_die);
diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
index beaefc3..4bc56a4 100644
--- a/tools/perf/util/probe-finder.h
+++ b/tools/perf/util/probe-finder.h
@@ -83,6 +83,7 @@ struct line_finder {
 	int			lno_s;		/* Start line number */
 	int			lno_e;		/* End line number */
 	Dwarf_Die		cu_die;		/* Current CU */
+	Dwarf_Die		sp_die;
 	int			found;
 };
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH v2 -tip] perf probe: Add fastpath to do lookup by function name
  2011-03-24 14:09     ` [PATCH v2 -tip] " Lin Ming
@ 2011-03-25  1:14       ` Masami Hiramatsu
  2011-03-25  2:57         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2011-03-25  1:14 UTC (permalink / raw)
  To: Lin Ming
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, linux-kernel

(2011/03/24 23:09), Lin Ming wrote:
> v2 -> v1:
> - Don't compare file names with cu_find_realpath(...), instead, compare them
>   with the name returned by dwarf_decl_file(sp_die)
>
> The vmlinux file may have thousands of CUs.
> We can lookup function name from .debug_pubnames section
> to avoid the slow loop on CUs.
>
> ./perf stat -r 10 -- ./perf probe -k /home/mlin/vmlinux \
> 			-s /home/mlin/linux-2.6 \
> 			--line csum_partial_copy_to_user > tmp.log
>
> before patch applied
> =====================
>
>      364.535892  task-clock-msecs         #      0.997 CPUs
>               0  context-switches         #      0.000 M/sec
>               0  CPU-migrations           #      0.000 M/sec
>          29,993  page-faults              #      0.082 M/sec
>     865,862,109  cycles                   #   2375.245 M/sec
>   1,255,259,630  instructions             #      1.450 IPC
>     252,400,884  branches                 #    692.390 M/sec
>       3,429,376  branch-misses            #      1.359 %
>       1,386,990  cache-references         #      3.805 M/sec
>         687,188  cache-misses             #      1.885 M/sec
>
>     0.365792170  seconds time elapsed
>
> after patch applied
> =====================
>
>       89.896405  task-clock-msecs         #      0.991 CPUs
>               1  context-switches         #      0.000 M/sec
>               0  CPU-migrations           #      0.000 M/sec
>          10,145  page-faults              #      0.113 M/sec
>     214,553,875  cycles                   #   2386.679 M/sec
>     226,915,559  instructions             #      1.058 IPC
>      44,536,614  branches                 #    495.422 M/sec
>         613,074  branch-misses            #      1.377 %
>         860,787  cache-references         #      9.575 M/sec
>         442,380  cache-misses             #      4.921 M/sec
>
>     0.090716032  seconds time elapsed

Thanks! Looks very good :)

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

>
> Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> ---
>  tools/perf/util/probe-finder.c |   39 +++++++++++++++++++++++++++++++++++++++
>  tools/perf/util/probe-finder.h |    1 +
>  2 files changed, 40 insertions(+), 0 deletions(-)
>
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index 194f9e2..5cf044c 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -1876,6 +1876,31 @@ static int find_line_range_by_func(struct line_finder *lf)
>  	return param.retval;
>  }
>
> +static int pubname_search_cb(Dwarf *dbg, Dwarf_Global *gl, void *data)
> +{
> +	struct line_finder *lf = data;
> +	struct line_range *lr = lf->lr;
> +
> +	if (dwarf_offdie(dbg, gl->die_offset, &lf->sp_die)) {
> +		if (dwarf_tag(&lf->sp_die) != DW_TAG_subprogram)
> +			return DWARF_CB_OK;
> +
> +		if (die_compare_name(&lf->sp_die, lr->function)) {
> +			if (!dwarf_offdie(dbg, gl->cu_offset, &lf->cu_die))
> +				return DWARF_CB_OK;
> +
> +			if (lr->file &&
> +			    strtailcmp(lr->file, dwarf_decl_file(&lf->sp_die)))
> +				return DWARF_CB_OK;
> +
> +			lf->found = 1;
> +			return DWARF_CB_ABORT;
> +		}
> +	}
> +
> +	return DWARF_CB_OK;
> +}
> +
>  int find_line_range(int fd, struct line_range *lr)
>  {
>  	struct line_finder lf = {.lr = lr, .found = 0};
> @@ -1895,6 +1920,19 @@ int find_line_range(int fd, struct line_range *lr)
>  		return -EBADF;
>  	}
>
> +	/* Fastpath: lookup by function name from .debug_pubnames section */
> +	if (lr->function) {
> +		struct dwarf_callback_param param = {.data = (void *)&lf, .retval = 0};
> +
> +		dwarf_getpubnames(dbg, pubname_search_cb, &lf, 0);
> +		if (lf.found) {
> +			lf.found = 0;
> +			line_range_search_cb(&lf.sp_die, &param);
> +			if (lf.found)
> +				goto found;
> +		}
> +	}
> +
>  	/* Loop on CUs (Compilation Unit) */
>  	while (!lf.found && ret >= 0) {
>  		if (dwarf_nextcu(dbg, off, &noff, &cuhl, NULL, NULL, NULL) != 0)
> @@ -1923,6 +1961,7 @@ int find_line_range(int fd, struct line_range *lr)
>  		off = noff;
>  	}
>
> +found:
>  	/* Store comp_dir */
>  	if (lf.found) {
>  		comp_dir = cu_get_comp_dir(&lf.cu_die);
> diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
> index beaefc3..4bc56a4 100644
> --- a/tools/perf/util/probe-finder.h
> +++ b/tools/perf/util/probe-finder.h
> @@ -83,6 +83,7 @@ struct line_finder {
>  	int			lno_s;		/* Start line number */
>  	int			lno_e;		/* End line number */
>  	Dwarf_Die		cu_die;		/* Current CU */
> +	Dwarf_Die		sp_die;
>  	int			found;
>  };
>

-- 
Masami HIRAMATSU
2nd Dept. Linux Technology Center
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH v2 -tip] perf probe: Add fastpath to do lookup by function name
  2011-03-25  1:14       ` Masami Hiramatsu
@ 2011-03-25  2:57         ` Arnaldo Carvalho de Melo
  2011-03-25  6:33           ` Lin Ming
  0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-03-25  2:57 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Lin Ming, Peter Zijlstra, Ingo Molnar, linux-kernel

On Fri, Mar 25, 2011 at 10:14:25AM +0900, Masami Hiramatsu wrote:
> (2011/03/24 23:09), Lin Ming wrote:
> > v2 -> v1:
> > - Don't compare file names with cu_find_realpath(...), instead, compare them
> >   with the name returned by dwarf_decl_file(sp_die)
> >
> > The vmlinux file may have thousands of CUs.
> > We can lookup function name from .debug_pubnames section
> > to avoid the slow loop on CUs.
> >
> > ./perf stat -r 10 -- ./perf probe -k /home/mlin/vmlinux \
> > 			-s /home/mlin/linux-2.6 \
> > 			--line csum_partial_copy_to_user > tmp.log
> >
> > before patch applied
> > =====================
> >
> >      364.535892  task-clock-msecs         #      0.997 CPUs
> >               0  context-switches         #      0.000 M/sec
> >               0  CPU-migrations           #      0.000 M/sec
> >          29,993  page-faults              #      0.082 M/sec
> >     865,862,109  cycles                   #   2375.245 M/sec
> >   1,255,259,630  instructions             #      1.450 IPC
> >     252,400,884  branches                 #    692.390 M/sec
> >       3,429,376  branch-misses            #      1.359 %
> >       1,386,990  cache-references         #      3.805 M/sec
> >         687,188  cache-misses             #      1.885 M/sec
> >
> >     0.365792170  seconds time elapsed
> >
> > after patch applied
> > =====================
> >
> >       89.896405  task-clock-msecs         #      0.991 CPUs
> >               1  context-switches         #      0.000 M/sec
> >               0  CPU-migrations           #      0.000 M/sec
> >          10,145  page-faults              #      0.113 M/sec
> >     214,553,875  cycles                   #   2386.679 M/sec
> >     226,915,559  instructions             #      1.058 IPC
> >      44,536,614  branches                 #    495.422 M/sec
> >         613,074  branch-misses            #      1.377 %
> >         860,787  cache-references         #      9.575 M/sec
> >         442,380  cache-misses             #      4.921 M/sec
> >
> >     0.090716032  seconds time elapsed
>
> Thanks! Looks very good :)
>
> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Indeed, I'll try and process this one tomorrow,

Thanks a lot!

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH v2 -tip] perf probe: Add fastpath to do lookup by function name
  2011-03-25  2:57         ` Arnaldo Carvalho de Melo
@ 2011-03-25  6:33           ` Lin Ming
  2011-03-25  8:30             ` Lin Ming
  0 siblings, 1 reply; 11+ messages in thread
From: Lin Ming @ 2011-03-25  6:33 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Masami Hiramatsu, Peter Zijlstra, Ingo Molnar, linux-kernel

On Fri, 2011-03-25 at 10:57 +0800, Arnaldo Carvalho de Melo wrote:
> On Fri, Mar 25, 2011 at 10:14:25AM +0900, Masami Hiramatsu wrote:
> > (2011/03/24 23:09), Lin Ming wrote:
> > > v2 -> v1:
> > > - Don't compare file names with cu_find_realpath(...), instead, compare them
> > >   with the name returned by dwarf_decl_file(sp_die)
> > >
> > > The vmlinux file may have thousands of CUs.
> > > We can lookup function name from .debug_pubnames section
> > > to avoid the slow loop on CUs.
> > >
> > > ./perf stat -r 10 -- ./perf probe -k /home/mlin/vmlinux \
> > > 			-s /home/mlin/linux-2.6 \
> > > 			--line csum_partial_copy_to_user > tmp.log
> > >
> > > before patch applied
> > > =====================
> > >
> > >      364.535892  task-clock-msecs         #      0.997 CPUs
> > >               0  context-switches         #      0.000 M/sec
> > >               0  CPU-migrations           #      0.000 M/sec
> > >          29,993  page-faults              #      0.082 M/sec
> > >     865,862,109  cycles                   #   2375.245 M/sec
> > >   1,255,259,630  instructions             #      1.450 IPC
> > >     252,400,884  branches                 #    692.390 M/sec
> > >       3,429,376  branch-misses            #      1.359 %
> > >       1,386,990  cache-references         #      3.805 M/sec
> > >         687,188  cache-misses             #      1.885 M/sec
> > >
> > >     0.365792170  seconds time elapsed
> > >
> > > after patch applied
> > > =====================
> > >
> > >       89.896405  task-clock-msecs         #      0.991 CPUs
> > >               1  context-switches         #      0.000 M/sec
> > >               0  CPU-migrations           #      0.000 M/sec
> > >          10,145  page-faults              #      0.113 M/sec
> > >     214,553,875  cycles                   #   2386.679 M/sec
> > >     226,915,559  instructions             #      1.058 IPC
> > >      44,536,614  branches                 #    495.422 M/sec
> > >         613,074  branch-misses            #      1.377 %
> > >         860,787  cache-references         #      9.575 M/sec
> > >         442,380  cache-misses             #      4.921 M/sec
> > >
> > >     0.090716032  seconds time elapsed
> >
> > Thanks! Looks very good :)
> >
> > Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>
> Indeed, I'll try and process this one tomorrow,

Besides find_line_range, I just realized that the same optimization may be
added to find_probes.

Thanks,
Lin Ming

>
> Thanks a lot!
>
> - Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH v2 -tip] perf probe: Add fastpath to do lookup by function name
  2011-03-25  6:33           ` Lin Ming
@ 2011-03-25  8:30             ` Lin Ming
  0 siblings, 0 replies; 11+ messages in thread
From: Lin Ming @ 2011-03-25  8:30 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Masami Hiramatsu, Peter Zijlstra, Ingo Molnar, linux-kernel

On Fri, 2011-03-25 at 14:33 +0800, Lin Ming wrote:
> On Fri, 2011-03-25 at 10:57 +0800, Arnaldo Carvalho de Melo wrote:
> > On Fri, Mar 25, 2011 at 10:14:25AM +0900, Masami Hiramatsu wrote:
> > > (2011/03/24 23:09), Lin Ming wrote:
> > > > v2 -> v1:
> > > > - Don't compare file names with cu_find_realpath(...), instead, compare them
> > > >   with the name returned by dwarf_decl_file(sp_die)
> > > >
> > > > The vmlinux file may have thousands of CUs.
> > > > We can lookup function name from .debug_pubnames section
> > > > to avoid the slow loop on CUs.
> > > >
> > > > ./perf stat -r 10 -- ./perf probe -k /home/mlin/vmlinux \
> > > > 			-s /home/mlin/linux-2.6 \
> > > > 			--line csum_partial_copy_to_user > tmp.log
> > > >
> > > > before patch applied
> > > > =====================
> > > >
> > > >      364.535892  task-clock-msecs         #      0.997 CPUs
> > > >               0  context-switches         #      0.000 M/sec
> > > >               0  CPU-migrations           #      0.000 M/sec
> > > >          29,993  page-faults              #      0.082 M/sec
> > > >     865,862,109  cycles                   #   2375.245 M/sec
> > > >   1,255,259,630  instructions             #      1.450 IPC
> > > >     252,400,884  branches                 #    692.390 M/sec
> > > >       3,429,376  branch-misses            #      1.359 %
> > > >       1,386,990  cache-references         #      3.805 M/sec
> > > >         687,188  cache-misses             #      1.885 M/sec
> > > >
> > > >     0.365792170  seconds time elapsed
> > > >
> > > > after patch applied
> > > > =====================
> > > >
> > > >       89.896405  task-clock-msecs         #      0.991 CPUs
> > > >               1  context-switches         #      0.000 M/sec
> > > >               0  CPU-migrations           #      0.000 M/sec
> > > >          10,145  page-faults              #      0.113 M/sec
> > > >     214,553,875  cycles                   #   2386.679 M/sec
> > > >     226,915,559  instructions             #      1.058 IPC
> > > >      44,536,614  branches                 #    495.422 M/sec
> > > >         613,074  branch-misses            #      1.377 %
> > > >         860,787  cache-references         #      9.575 M/sec
> > > >         442,380  cache-misses             #      4.921 M/sec
> > > >
> > > >     0.090716032  seconds time elapsed
> > >
> > > Thanks! Looks very good :)
> > >
> > > Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> >
> > Indeed, I'll try and process this one tomorrow,
>
> Besides find_line_range, I just realized that the same optimization may be
> added to find_probes.

I have sent out a v3 patch to add a fastpath for find_probes.

Thanks,
Lin Ming

>
> Thanks,
> Lin Ming
>
> >
> > Thanks a lot!
> >
> > - Arnaldo
>

^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2011-03-25  8:30 UTC | newest]

Thread overview: 11+ messages
2011-03-24 15:38 [PATCH] perf probe: Add fastpath to do lookup by function name Lin Ming
2011-03-24  7:58 ` Ingo Molnar
2011-03-24  8:38   ` Lin Ming
2011-03-24  8:47     ` Ingo Molnar
2011-03-24  9:08 ` Masami Hiramatsu
2011-03-24 13:47   ` Lin Ming
2011-03-24 14:09     ` [PATCH v2 -tip] " Lin Ming
2011-03-25  1:14       ` Masami Hiramatsu
2011-03-25  2:57         ` Arnaldo Carvalho de Melo
2011-03-25  6:33           ` Lin Ming
2011-03-25  8:30             ` Lin Ming