* [PATCH v7] perf annotate: Fix sample events lost in stdio mode @ 2021-03-19 12:35 Yang Jihong 2021-03-26 2:24 ` Yang Jihong 0 siblings, 1 reply; 10+ messages in thread From: Yang Jihong @ 2021-03-19 12:35 UTC (permalink / raw) To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung, yao.jin, gustavoars, mliska, linux-kernel Cc: yangjihong1, zhangjinhao2 In hist__find_annotations function, since different hist_entry may point to same symbol, we free notes->src to signal already processed this symbol in stdio mode; when annotate, entry will skipped if notes->src is NULL to avoid repeated output. However, there is a problem, for example, run the following command: # perf record -e branch-misses -e branch-instructions -a sleep 1 perf.data file contains different types of sample event. If the same IP sample event exists in branch-misses and branch-instructions, this event uses the same symbol. When annotate branch-misses events, notes->src corresponding to this event is set to null, as a result, when annotate branch-instructions events, this event is skipped and no annotate is output. Solution of this patch is to remove zfree in hists__find_annotations and change sort order to "dso,symbol" to avoid duplicate output when different processes correspond to the same symbol. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> --- Changes since v6: - Remove separate setup_sorting() for branch mode. Changes since v5: - Add Signed-off-by tag. Changes since v4: - Use the same sort key "dso,symbol" in branch stack mode. Changes since v3: - Modify the first line of comments. Changes since v2: - Remove zfree in hists__find_annotations. - Change sort order to avoid duplicate output. Changes since v1: - Change processed flag variable from u8 to bool. tools/perf/builtin-annotate.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c index a23ba6bb99b6..795c2ac7fcd1 100644 --- a/tools/perf/builtin-annotate.c +++ b/tools/perf/builtin-annotate.c @@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists, } else { hist_entry__tty_annotate(he, evsel, ann); nd = rb_next(nd); - /* - * Since we have a hist_entry per IP for the same - * symbol, free he->ms.sym->src to signal we already - * processed this symbol. - */ - zfree(¬es->src->cycles_hist); - zfree(¬es->src); } } } @@ -619,14 +612,22 @@ int cmd_annotate(int argc, const char **argv) setup_browser(true); - if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) { + /* + * Events of different processes may correspond to the same + * symbol, we do not care about the processes in annotate, + * set sort order to avoid repeated output. + */ + sort_order = "dso,symbol"; + + /* + * Set SORT_MODE__BRANCH so that annotate display IPC/Cycle + * if branch info is in perf data in TUI mode. + */ + if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) sort__mode = SORT_MODE__BRANCH; - if (setup_sorting(annotate.session->evlist) < 0) - usage_with_options(annotate_usage, options); - } else { - if (setup_sorting(NULL) < 0) - usage_with_options(annotate_usage, options); - } + + if (setup_sorting(NULL) < 0) + usage_with_options(annotate_usage, options); ret = __cmd_annotate(&annotate); -- 2.30.GIT ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-03-19 12:35 [PATCH v7] perf annotate: Fix sample events lost in stdio mode Yang Jihong @ 2021-03-26 2:24 ` Yang Jihong 2021-03-26 3:25 ` Namhyung Kim 0 siblings, 1 reply; 10+ messages in thread From: Yang Jihong @ 2021-03-26 2:24 UTC (permalink / raw) To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa, namhyung, yao.jin, gustavoars, mliska, linux-kernel Cc: zhangjinhao2 Hello, ping :) On 2021/3/19 20:35, Yang Jihong wrote: > In hist__find_annotations function, since different hist_entry may point to same > symbol, we free notes->src to signal already processed this symbol in stdio mode; > when annotate, entry will skipped if notes->src is NULL to avoid repeated output. > > However, there is a problem, for example, run the following command: > > # perf record -e branch-misses -e branch-instructions -a sleep 1 > > perf.data file contains different types of sample event. > > If the same IP sample event exists in branch-misses and branch-instructions, > this event uses the same symbol. When annotate branch-misses events, notes->src > corresponding to this event is set to null, as a result, when annotate > branch-instructions events, this event is skipped and no annotate is output. > > Solution of this patch is to remove zfree in hists__find_annotations and > change sort order to "dso,symbol" to avoid duplicate output when different > processes correspond to the same symbol. > > Signed-off-by: Yang Jihong <yangjihong1@huawei.com> > --- > > Changes since v6: > - Remove separate setup_sorting() for branch mode. > > Changes since v5: > - Add Signed-off-by tag. > > Changes since v4: > - Use the same sort key "dso,symbol" in branch stack mode. > > Changes since v3: > - Modify the first line of comments. > > Changes since v2: > - Remove zfree in hists__find_annotations. > - Change sort order to avoid duplicate output. > > Changes since v1: > - Change processed flag variable from u8 to bool. > > tools/perf/builtin-annotate.c | 29 +++++++++++++++-------------- > 1 file changed, 15 insertions(+), 14 deletions(-) > > diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c > index a23ba6bb99b6..795c2ac7fcd1 100644 > --- a/tools/perf/builtin-annotate.c > +++ b/tools/perf/builtin-annotate.c > @@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists, > } else { > hist_entry__tty_annotate(he, evsel, ann); > nd = rb_next(nd); > - /* > - * Since we have a hist_entry per IP for the same > - * symbol, free he->ms.sym->src to signal we already > - * processed this symbol. > - */ > - zfree(¬es->src->cycles_hist); > - zfree(¬es->src); > } > } > } > @@ -619,14 +612,22 @@ int cmd_annotate(int argc, const char **argv) > > setup_browser(true); > > - if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) { > + /* > + * Events of different processes may correspond to the same > + * symbol, we do not care about the processes in annotate, > + * set sort order to avoid repeated output. > + */ > + sort_order = "dso,symbol"; > + > + /* > + * Set SORT_MODE__BRANCH so that annotate display IPC/Cycle > + * if branch info is in perf data in TUI mode. > + */ > + if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) > sort__mode = SORT_MODE__BRANCH; > - if (setup_sorting(annotate.session->evlist) < 0) > - usage_with_options(annotate_usage, options); > - } else { > - if (setup_sorting(NULL) < 0) > - usage_with_options(annotate_usage, options); > - } > + > + if (setup_sorting(NULL) < 0) > + usage_with_options(annotate_usage, options); > > ret = __cmd_annotate(&annotate); > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-03-26 2:24 ` Yang Jihong @ 2021-03-26 3:25 ` Namhyung Kim 2021-03-26 12:06 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 10+ messages in thread From: Namhyung Kim @ 2021-03-26 3:25 UTC (permalink / raw) To: Yang Jihong Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel, zhangjinhao2 Hello, On Fri, Mar 26, 2021 at 11:24 AM Yang Jihong <yangjihong1@huawei.com> wrote: > > Hello, > ping :) > > On 2021/3/19 20:35, Yang Jihong wrote: > > In hist__find_annotations function, since different hist_entry may point to same > > symbol, we free notes->src to signal already processed this symbol in stdio mode; > > when annotate, entry will skipped if notes->src is NULL to avoid repeated output. > > > > However, there is a problem, for example, run the following command: > > > > # perf record -e branch-misses -e branch-instructions -a sleep 1 > > > > perf.data file contains different types of sample event. > > > > If the same IP sample event exists in branch-misses and branch-instructions, > > this event uses the same symbol. When annotate branch-misses events, notes->src > > corresponding to this event is set to null, as a result, when annotate > > branch-instructions events, this event is skipped and no annotate is output. > > > > Solution of this patch is to remove zfree in hists__find_annotations and > > change sort order to "dso,symbol" to avoid duplicate output when different > > processes correspond to the same symbol. > > > > Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Thanks, Namhyung > > --- > > > > Changes since v6: > > - Remove separate setup_sorting() for branch mode. > > > > Changes since v5: > > - Add Signed-off-by tag. > > > > Changes since v4: > > - Use the same sort key "dso,symbol" in branch stack mode. > > > > Changes since v3: > > - Modify the first line of comments. > > > > Changes since v2: > > - Remove zfree in hists__find_annotations. > > - Change sort order to avoid duplicate output. > > > > Changes since v1: > > - Change processed flag variable from u8 to bool. > > > > tools/perf/builtin-annotate.c | 29 +++++++++++++++-------------- > > 1 file changed, 15 insertions(+), 14 deletions(-) > > > > diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c > > index a23ba6bb99b6..795c2ac7fcd1 100644 > > --- a/tools/perf/builtin-annotate.c > > +++ b/tools/perf/builtin-annotate.c > > @@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists, > > } else { > > hist_entry__tty_annotate(he, evsel, ann); > > nd = rb_next(nd); > > - /* > > - * Since we have a hist_entry per IP for the same > > - * symbol, free he->ms.sym->src to signal we already > > - * processed this symbol. > > - */ > > - zfree(¬es->src->cycles_hist); > > - zfree(¬es->src); > > } > > } > > } > > @@ -619,14 +612,22 @@ int cmd_annotate(int argc, const char **argv) > > > > setup_browser(true); > > > > - if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) { > > + /* > > + * Events of different processes may correspond to the same > > + * symbol, we do not care about the processes in annotate, > > + * set sort order to avoid repeated output. > > + */ > > + sort_order = "dso,symbol"; > > + > > + /* > > + * Set SORT_MODE__BRANCH so that annotate display IPC/Cycle > > + * if branch info is in perf data in TUI mode. > > + */ > > + if ((use_browser == 1 || annotate.use_stdio2) && annotate.has_br_stack) > > sort__mode = SORT_MODE__BRANCH; > > - if (setup_sorting(annotate.session->evlist) < 0) > > - usage_with_options(annotate_usage, options); > > - } else { > > - if (setup_sorting(NULL) < 0) > > - usage_with_options(annotate_usage, options); > > - } > > + > > + if (setup_sorting(NULL) < 0) > > + usage_with_options(annotate_usage, options); > > > > ret = __cmd_annotate(&annotate); > > > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-03-26 3:25 ` Namhyung Kim @ 2021-03-26 12:06 ` Arnaldo Carvalho de Melo 2021-03-27 2:15 ` Yang Jihong 0 siblings, 1 reply; 10+ messages in thread From: Arnaldo Carvalho de Melo @ 2021-03-26 12:06 UTC (permalink / raw) To: Namhyung Kim, Yang Jihong Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel, zhangjinhao2 Em Fri, Mar 26, 2021 at 12:25:37PM +0900, Namhyung Kim escreveu: > On Fri, Mar 26, 2021 at 11:24 AM Yang Jihong <yangjihong1@huawei.com> wrote: > > On 2021/3/19 20:35, Yang Jihong wrote: > > > In hist__find_annotations function, since different hist_entry may point to same > > > symbol, we free notes->src to signal already processed this symbol in stdio mode; > > > when annotate, entry will skipped if notes->src is NULL to avoid repeated output. > > > > > > However, there is a problem, for example, run the following command: > > > > > > # perf record -e branch-misses -e branch-instructions -a sleep 1 > > > > > > perf.data file contains different types of sample event. > > > > > > If the same IP sample event exists in branch-misses and branch-instructions, > > > this event uses the same symbol. When annotate branch-misses events, notes->src > > > corresponding to this event is set to null, as a result, when annotate > > > branch-instructions events, this event is skipped and no annotate is output. > > > > > > Solution of this patch is to remove zfree in hists__find_annotations and > > > change sort order to "dso,symbol" to avoid duplicate output when different > > > processes correspond to the same symbol. > > > Signed-off-by: Yang Jihong <yangjihong1@huawei.com> > Acked-by: Namhyung Kim <namhyung@kernel.org> Without looking at the patch, just at its description of the problem, I tried to annotate two events in a group, to get the annotate group view output with both events, and it seems I'm getting samples accounted for both events: [root@five ~]# perf record -e '{branch-misses,branch-instructions}' -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.296 MB perf.data (2507 samples) ] [root@five ~]# [root@five ~]# perf report | grep -v '^#' | head -5 17.49% 19.19% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarking::Run 12.17% 17.04% ThreadPoolForeg chromium-browser [.] v8::internal::Sweeper::RawSweep 11.14% 11.63% ThreadPoolForeg chromium-browser [.] v8::internal::MarkingVisitorBase<v8::internal::ConcurrentMarkingVisitor, v8::internal::ConcurrentMarkingState>::ProcessStrongHeapObject<v8::internal::CompressedHeapObjectSlot> 7.65% 7.84% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarkingVisitor::ShouldVisit 5.66% 6.72% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarkingVisitor::VisitPointersInSnapshot [root@five ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run Samples: 444 of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period] v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser Percent Disassembly of section .text: 0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>: v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool): push %rbp mov %rsp,%rbp push %r15 push %r14 mov %rdi,%r14 push %r13 mov %edx,%r13d push %r12 mov %ecx,%r12d push %rbx sub $0x1298,%rsp mov %rsi,-0x1228(%rbp) mov %fs:0x28,%rax mov %rax,-0x38(%rbp) <SNIP> movzwl 0x2(%rbx),%eax test %ax,%ax ↓ jne 4a9 mov -0x10e8(%rbp),%rdx cmpw $0x0,0x2(%rdx) 0.41 0.39 ↓ je 4b90 movq %rbx,%xmm0 movq %rdx,%xmm2 mov %rdx,%rbx punpcklqdq %xmm2,%xmm0 movups %xmm0,-0x10e8(%rbp) movzwl 0x2(%rdx),%eax 4a9: sub $0x1,%eax mov %ax,0x2(%rbx) 0.36 0.91 movzwl %ax,%eax 0.60 0.00 mov 0x10(%rbx,%rax,8),%rax 3.44 2.46 mov %rax,-0x11e0(%rbp) 0.00 0.34 4bf: mov 0x8(%r13),%rax 0.00 0.36 add $0x1,%r15d 0.00 0.34 mov 0x110(%rax),%rax mov 0x128(%rax),%rcx 0.88 0.36 mov 0x8(%r13),%rax mov 0x110(%rax),%rdx mov 0x130(%rdx),%rdx 0.00 0.48 mov 0x140(%rax),%rax mov 0x110(%rax),%rsi 0.61 0.47 mov -0x11e0(%rbp),%rax 2.01 2.32 sub $0x1,%rax cmp %rcx,%rax 0.00 0.35 setae %cl 1.31 0.33 cmp %rdx,%rax setb %dl 0.00 0.24 test %dl,%cl 0.00 0.12 ↓ jne 4b70 cmp %rsi,%rax ↓ je 4b70 mov (%rax),%eax 29.10 29.90 add -0x1220(%rbp),%rax cmpb $0x0,-0x1218(%rbp) mov %rax,-0x1210(%rbp) 0.00 0.65 ↓ jne 4fa0 mov -0x11e0(%rbp),%r9 lea 0x6(%rax),%rbx 0.38 0.00 545: movzbl (%rbx),%eax 4.90 5.34 cmp $0x4c,%al ↓ ja 5026 0.58 0.00 550: lea v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi movslq (%rdi,%rax,4),%rax 4.97 3.19 add %rdi,%rax <SNIP> If I ask for number of samples: [root@five ~]# perf config annotate.show_nr_samples=true [root@five ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run Samples: 444 of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period] v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser Samples Disassembly of section .text: 0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>: v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool): push %rbp mov %rsp,%rbp push %r15 push %r14 mov %rdi,%r14 push %r13 mov %edx,%r13d push %r12 mov %ecx,%r12d push %rbx sub $0x1298,%rsp mov %rsi,-0x1228(%rbp) mov %fs:0x28,%rax mov %rax,-0x38(%rbp) <SNIP> movzwl 0x2(%rbx),%eax test %ax,%ax ↓ jne 4a9 mov -0x10e8(%rbp),%rdx cmpw $0x0,0x2(%rdx) 1 1 ↓ je 4b90 movq %rbx,%xmm0 movq %rdx,%xmm2 mov %rdx,%rbx punpcklqdq %xmm2,%xmm0 movups %xmm0,-0x10e8(%rbp) movzwl 0x2(%rdx),%eax 4a9: sub $0x1,%eax mov %ax,0x2(%rbx) 1 2 movzwl %ax,%eax 1 0 mov 0x10(%rbx,%rax,8),%rax 8 5 mov %rax,-0x11e0(%rbp) 0 1 4bf: mov 0x8(%r13),%rax 0 1 add $0x1,%r15d 0 1 mov 0x110(%rax),%rax mov 0x128(%rax),%rcx 3 1 mov 0x8(%r13),%rax mov 0x110(%rax),%rdx mov 0x130(%rdx),%rdx 0 1 mov 0x140(%rax),%rax mov 0x110(%rax),%rsi 2 1 mov -0x11e0(%rbp),%rax 6 6 sub $0x1,%rax cmp %rcx,%rax 0 1 setae %cl 2 1 cmp %rdx,%rax setb %dl 0 1 test %dl,%cl 0 1 ↓ jne 4b70 cmp %rsi,%rax ↓ je 4b70 mov (%rax),%eax 58 73 add -0x1220(%rbp),%rax cmpb $0x0,-0x1218(%rbp) mov %rax,-0x1210(%rbp) 0 1 ↓ jne 4fa0 mov -0x11e0(%rbp),%r9 lea 0x6(%rax),%rbx 1 0 545: movzbl (%rbx),%eax 10 13 cmp $0x4c,%al ↓ ja 5026 1 0 550: lea v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi movslq (%rdi,%rax,4),%rax 9 8 add %rdi,%rax <SNIP> So it seems to be working, what am I missing? Is this strictly non group related? - Arnaldo ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-03-26 12:06 ` Arnaldo Carvalho de Melo @ 2021-03-27 2:15 ` Yang Jihong 2021-03-30 7:26 ` Namhyung Kim 0 siblings, 1 reply; 10+ messages in thread From: Yang Jihong @ 2021-03-27 2:15 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Namhyung Kim Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel, zhangjinhao2 Hello, On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote: > Em Fri, Mar 26, 2021 at 12:25:37PM +0900, Namhyung Kim escreveu: >> On Fri, Mar 26, 2021 at 11:24 AM Yang Jihong <yangjihong1@huawei.com> wrote: >>> On 2021/3/19 20:35, Yang Jihong wrote: >>>> In hist__find_annotations function, since different hist_entry may point to same >>>> symbol, we free notes->src to signal already processed this symbol in stdio mode; >>>> when annotate, entry will skipped if notes->src is NULL to avoid repeated output. >>>> >>>> However, there is a problem, for example, run the following command: >>>> >>>> # perf record -e branch-misses -e branch-instructions -a sleep 1 >>>> >>>> perf.data file contains different types of sample event. >>>> >>>> If the same IP sample event exists in branch-misses and branch-instructions, >>>> this event uses the same symbol. When annotate branch-misses events, notes->src >>>> corresponding to this event is set to null, as a result, when annotate >>>> branch-instructions events, this event is skipped and no annotate is output. >>>> >>>> Solution of this patch is to remove zfree in hists__find_annotations and >>>> change sort order to "dso,symbol" to avoid duplicate output when different >>>> processes correspond to the same symbol. > >>>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> > >> Acked-by: Namhyung Kim <namhyung@kernel.org> > > Without looking at the patch, just at its description of the problem, I > tried to annotate two events in a group, to get the annotate group view > output with both events, and it seems I'm getting samples accounted for > both events: > > [root@five ~]# perf record -e '{branch-misses,branch-instructions}' -a sleep 1 > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 2.296 MB perf.data (2507 samples) ] > [root@five ~]# > [root@five ~]# perf report | grep -v '^#' | head -5 > 17.49% 19.19% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarking::Run > 12.17% 17.04% ThreadPoolForeg chromium-browser [.] v8::internal::Sweeper::RawSweep > 11.14% 11.63% ThreadPoolForeg chromium-browser [.] v8::internal::MarkingVisitorBase<v8::internal::ConcurrentMarkingVisitor, v8::internal::ConcurrentMarkingState>::ProcessStrongHeapObject<v8::internal::CompressedHeapObjectSlot> > 7.65% 7.84% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarkingVisitor::ShouldVisit > 5.66% 6.72% ThreadPoolForeg chromium-browser [.] v8::internal::ConcurrentMarkingVisitor::VisitPointersInSnapshot > > [root@five ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run > Samples: 444 of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period] > v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser > Percent > > > Disassembly of section .text: > > 0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>: > v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool): > push %rbp > mov %rsp,%rbp > push %r15 > push %r14 > mov %rdi,%r14 > push %r13 > mov %edx,%r13d > push %r12 > mov %ecx,%r12d > push %rbx > sub $0x1298,%rsp > mov %rsi,-0x1228(%rbp) > mov %fs:0x28,%rax > mov %rax,-0x38(%rbp) > <SNIP> > movzwl 0x2(%rbx),%eax > test %ax,%ax > ↓ jne 4a9 > mov -0x10e8(%rbp),%rdx > cmpw $0x0,0x2(%rdx) > 0.41 0.39 ↓ je 4b90 > movq %rbx,%xmm0 > movq %rdx,%xmm2 > mov %rdx,%rbx > punpcklqdq %xmm2,%xmm0 > movups %xmm0,-0x10e8(%rbp) > movzwl 0x2(%rdx),%eax > 4a9: sub $0x1,%eax > mov %ax,0x2(%rbx) > 0.36 0.91 movzwl %ax,%eax > 0.60 0.00 mov 0x10(%rbx,%rax,8),%rax > 3.44 2.46 mov %rax,-0x11e0(%rbp) > 0.00 0.34 4bf: mov 0x8(%r13),%rax > 0.00 0.36 add $0x1,%r15d > 0.00 0.34 mov 0x110(%rax),%rax > mov 0x128(%rax),%rcx > 0.88 0.36 mov 0x8(%r13),%rax > mov 0x110(%rax),%rdx > mov 0x130(%rdx),%rdx > 0.00 0.48 mov 0x140(%rax),%rax > mov 0x110(%rax),%rsi > 0.61 0.47 mov -0x11e0(%rbp),%rax > 2.01 2.32 sub $0x1,%rax > cmp %rcx,%rax > 0.00 0.35 setae %cl > 1.31 0.33 cmp %rdx,%rax > setb %dl > 0.00 0.24 test %dl,%cl > 0.00 0.12 ↓ jne 4b70 > cmp %rsi,%rax > ↓ je 4b70 > mov (%rax),%eax > 29.10 29.90 add -0x1220(%rbp),%rax > cmpb $0x0,-0x1218(%rbp) > mov %rax,-0x1210(%rbp) > 0.00 0.65 ↓ jne 4fa0 > mov -0x11e0(%rbp),%r9 > lea 0x6(%rax),%rbx > 0.38 0.00 545: movzbl (%rbx),%eax > 4.90 5.34 cmp $0x4c,%al > ↓ ja 5026 > 0.58 0.00 550: lea v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi > movslq (%rdi,%rax,4),%rax > 4.97 3.19 add %rdi,%rax > <SNIP> > > If I ask for number of samples: > > [root@five ~]# perf config annotate.show_nr_samples=true > [root@five ~]# perf annotate --stdio2 v8::internal::ConcurrentMarking::Run > > Samples: 444 of events 'anon group { branch-misses, branch-instructions }', 4000 Hz, Event count (approx.): 596221, [percent: local period] > v8::internal::ConcurrentMarking::Run() /usr/lib64/chromium-browser/chromium-browser > Samples > > > Disassembly of section .text: > > 0000000003290b30 <v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool)>: > v8::internal::ConcurrentMarking::Run(v8::JobDelegate*, unsigned int, bool): > push %rbp > mov %rsp,%rbp > push %r15 > push %r14 > mov %rdi,%r14 > push %r13 > mov %edx,%r13d > push %r12 > mov %ecx,%r12d > push %rbx > sub $0x1298,%rsp > mov %rsi,-0x1228(%rbp) > mov %fs:0x28,%rax > mov %rax,-0x38(%rbp) > <SNIP> > movzwl 0x2(%rbx),%eax > test %ax,%ax > ↓ jne 4a9 > mov -0x10e8(%rbp),%rdx > cmpw $0x0,0x2(%rdx) > 1 1 ↓ je 4b90 > movq %rbx,%xmm0 > movq %rdx,%xmm2 > mov %rdx,%rbx > punpcklqdq %xmm2,%xmm0 > movups %xmm0,-0x10e8(%rbp) > movzwl 0x2(%rdx),%eax > 4a9: sub $0x1,%eax > mov %ax,0x2(%rbx) > 1 2 movzwl %ax,%eax > 1 0 mov 0x10(%rbx,%rax,8),%rax > 8 5 mov %rax,-0x11e0(%rbp) > 0 1 4bf: mov 0x8(%r13),%rax > 0 1 add $0x1,%r15d > 0 1 mov 0x110(%rax),%rax > mov 0x128(%rax),%rcx > 3 1 mov 0x8(%r13),%rax > mov 0x110(%rax),%rdx > mov 0x130(%rdx),%rdx > 0 1 mov 0x140(%rax),%rax > mov 0x110(%rax),%rsi > 2 1 mov -0x11e0(%rbp),%rax > 6 6 sub $0x1,%rax > cmp %rcx,%rax > 0 1 setae %cl > 2 1 cmp %rdx,%rax > setb %dl > 0 1 test %dl,%cl > 0 1 ↓ jne 4b70 > cmp %rsi,%rax > ↓ je 4b70 > mov (%rax),%eax > 58 73 add -0x1220(%rbp),%rax > cmpb $0x0,-0x1218(%rbp) > mov %rax,-0x1210(%rbp) > 0 1 ↓ jne 4fa0 > mov -0x11e0(%rbp),%r9 > lea 0x6(%rax),%rbx > 1 0 545: movzbl (%rbx),%eax > 10 13 cmp $0x4c,%al > ↓ ja 5026 > 1 0 550: lea v8::internal::FLAGDEFAULT_abort_on_contradictory_flags+0x457,%rdi > movslq (%rdi,%rax,4),%rax > 9 8 add %rdi,%rax > > <SNIP> > > So it seems to be working, what am I missing? Is this strictly non > group related? > Yes, it is non group related. This problem occurs only when different events need to be recorded at the same time, i.e.: perf record -e branch-misses -e branch-instructions -a sleep 1 The output results of perf script and perf annotate do not match. Some events are not output in perf annotate. > - Arnaldo > . > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-03-27 2:15 ` Yang Jihong @ 2021-03-30 7:26 ` Namhyung Kim 2021-03-31 2:18 ` Yang Jihong 0 siblings, 1 reply; 10+ messages in thread From: Namhyung Kim @ 2021-03-30 7:26 UTC (permalink / raw) To: Yang Jihong Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel, zhangjinhao2 Hi Yang and Arnaldo, On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> wrote: > On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote: > > So it seems to be working, what am I missing? Is this strictly non > > group related? > > > Yes, it is non group related. > This problem occurs only when different events need to be recorded at > the same time, i.e.: > perf record -e branch-misses -e branch-instructions -a sleep 1 > > The output results of perf script and perf annotate do not match. > Some events are not output in perf annotate. Yeah I think it's related to sort keys. The code works with a single hist_entry for each event and symbol. But the default sort key creates multiple entries for different threads and it causes the confusion. Thanks, Namhyung ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-03-30 7:26 ` Namhyung Kim @ 2021-03-31 2:18 ` Yang Jihong 2021-04-12 7:22 ` Yang Jihong 0 siblings, 1 reply; 10+ messages in thread From: Yang Jihong @ 2021-03-31 2:18 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel, zhangjinhao2 Hello, Namhyung and Arnaldo, On 2021/3/30 15:26, Namhyung Kim wrote: > Hi Yang and Arnaldo, > > On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> wrote: >> On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote: >>> So it seems to be working, what am I missing? Is this strictly non >>> group related? >>> >> Yes, it is non group related. >> This problem occurs only when different events need to be recorded at >> the same time, i.e.: >> perf record -e branch-misses -e branch-instructions -a sleep 1 >> >> The output results of perf script and perf annotate do not match. >> Some events are not output in perf annotate. > > Yeah I think it's related to sort keys. The code works with a single > hist_entry for each event and symbol. But the default sort key > creates multiple entries for different threads and it causes the > confusion. > Yes, After revome zfree from hists__find_annotations, the output of perf annotate is repeated, which is related to sort keys. The original problem is that notes->src may correspond to multiple sample events. Therefore, we cannot simply zfree notes->src to avoid repeated output. Arnaldo, is there any problem with this patch? :) Thanks, Yang > Thanks, > Namhyung > . > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-03-31 2:18 ` Yang Jihong @ 2021-04-12 7:22 ` Yang Jihong 2021-04-14 12:23 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 10+ messages in thread From: Yang Jihong @ 2021-04-12 7:22 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel, zhangjinhao2, Namhyung Kim Hello, Arnaldo, On 2021/3/31 10:18, Yang Jihong wrote: > Hello, Namhyung and Arnaldo, > > On 2021/3/30 15:26, Namhyung Kim wrote: >> Hi Yang and Arnaldo, >> >> On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> >> wrote: >>> On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote: >>>> So it seems to be working, what am I missing? Is this strictly non >>>> group related? >>>> >>> Yes, it is non group related. >>> This problem occurs only when different events need to be recorded at >>> the same time, i.e.: >>> perf record -e branch-misses -e branch-instructions -a sleep 1 >>> >>> The output results of perf script and perf annotate do not match. >>> Some events are not output in perf annotate. >> >> Yeah I think it's related to sort keys. The code works with a single >> hist_entry for each event and symbol. But the default sort key >> creates multiple entries for different threads and it causes the >> confusion. >> > Yes, After revome zfree from hists__find_annotations, the output of perf > annotate is repeated, which is related to sort keys. > > The original problem is that notes->src may correspond to multiple > sample events. Therefore, we cannot simply zfree notes->src to avoid > repeated output. > > Arnaldo, is there any problem with this patch? :) > > Thanks, > Yang >> Thanks, >> Namhyung >> . >> PING :) Is there any problem with this patch that needs to be modified? Thanks, Yang ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-04-12 7:22 ` Yang Jihong @ 2021-04-14 12:23 ` Arnaldo Carvalho de Melo 2021-04-14 13:15 ` Namhyung Kim 0 siblings, 1 reply; 10+ messages in thread From: Arnaldo Carvalho de Melo @ 2021-04-14 12:23 UTC (permalink / raw) To: Yang Jihong Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel, zhangjinhao2, Namhyung Kim Em Mon, Apr 12, 2021 at 03:22:29PM +0800, Yang Jihong escreveu: > On 2021/3/31 10:18, Yang Jihong wrote: > > On 2021/3/30 15:26, Namhyung Kim wrote: > > > On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> wrote: > > > > On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote: > > > > > So it seems to be working, what am I missing? Is this strictly non > > > > > group related? > > > > Yes, it is non group related. > > > > This problem occurs only when different events need to be recorded at > > > > the same time, i.e.: > > > > perf record -e branch-misses -e branch-instructions -a sleep 1 > > > > The output results of perf script and perf annotate do not match. > > > > Some events are not output in perf annotate. > > > Yeah I think it's related to sort keys. The code works with a single > > > hist_entry for each event and symbol. But the default sort key > > > creates multiple entries for different threads and it causes the > > > confusion. > > Yes, After revome zfree from hists__find_annotations, the output of perf > > annotate is repeated, which is related to sort keys. > > The original problem is that notes->src may correspond to multiple > > sample events. Therefore, we cannot simply zfree notes->src to avoid > > repeated output. > > Arnaldo, is there any problem with this patch? :) > PING :) > Is there any problem with this patch that needs to be modified? I continue having a feeling this is kinda a bandaid, i.e. avoid the problem, and since we have a way to work this when using a group, I fail to see why it couldn't work when not grouping events. But since I have no time to dive into this and Namhyung is ok with it, I'll merge it now. - Arnaldo ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v7] perf annotate: Fix sample events lost in stdio mode 2021-04-14 12:23 ` Arnaldo Carvalho de Melo @ 2021-04-14 13:15 ` Namhyung Kim 0 siblings, 0 replies; 10+ messages in thread From: Namhyung Kim @ 2021-04-14 13:15 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Yang Jihong, Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa, Yao Jin, gustavoars, mliska, linux-kernel, zhangjinhao2 Hi Arnaldo, On Wed, Apr 14, 2021 at 9:23 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Mon, Apr 12, 2021 at 03:22:29PM +0800, Yang Jihong escreveu: > > On 2021/3/31 10:18, Yang Jihong wrote: > > > On 2021/3/30 15:26, Namhyung Kim wrote: > > > > On Sat, Mar 27, 2021 at 11:16 AM Yang Jihong <yangjihong1@huawei.com> wrote: > > > > > On 2021/3/26 20:06, Arnaldo Carvalho de Melo wrote: > > > > > > So it seems to be working, what am I missing? Is this strictly non > > > > > > group related? > > > > > > Yes, it is non group related. > > > > > This problem occurs only when different events need to be recorded at > > > > > the same time, i.e.: > > > > > perf record -e branch-misses -e branch-instructions -a sleep 1 > > > > > > The output results of perf script and perf annotate do not match. > > > > > Some events are not output in perf annotate. > > > > > Yeah I think it's related to sort keys. The code works with a single > > > > hist_entry for each event and symbol. But the default sort key > > > > creates multiple entries for different threads and it causes the > > > > confusion. > > > > Yes, After revome zfree from hists__find_annotations, the output of perf > > > annotate is repeated, which is related to sort keys. > > > > The original problem is that notes->src may correspond to multiple > > > sample events. Therefore, we cannot simply zfree notes->src to avoid > > > repeated output. > > > > Arnaldo, is there any problem with this patch? :) > > > PING :) > > Is there any problem with this patch that needs to be modified? > > I continue having a feeling this is kinda a bandaid, i.e. avoid the > problem, and since we have a way to work this when using a group, I fail > to see why it couldn't work when not grouping events. When we use a group, there's a single iteration only for the leader event. But if not, it'll iterate the hist entries twice (for two events). Each iteration used to have multiple entries for the same symbol (due to the sort key), so it marked the symbol (by freeing notes->src) to skip the same symbol during the iteration. However as the first iteration freed sym->notes->src, then the second (or later) event cannot see the annotation for the deleted symbols for that event even if it has some samples. > > But since I have no time to dive into this and Namhyung is ok with it, > I'll merge it now. Thanks, Namhyung ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-04-14 13:15 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-03-19 12:35 [PATCH v7] perf annotate: Fix sample events lost in stdio mode Yang Jihong 2021-03-26 2:24 ` Yang Jihong 2021-03-26 3:25 ` Namhyung Kim 2021-03-26 12:06 ` Arnaldo Carvalho de Melo 2021-03-27 2:15 ` Yang Jihong 2021-03-30 7:26 ` Namhyung Kim 2021-03-31 2:18 ` Yang Jihong 2021-04-12 7:22 ` Yang Jihong 2021-04-14 12:23 ` Arnaldo Carvalho de Melo 2021-04-14 13:15 ` Namhyung Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox