* "perf top -g" leaking ~300MB per second. @ 2014-12-13 8:48 Markus Trippelsdorf 2014-12-13 9:03 ` Markus Trippelsdorf 0 siblings, 1 reply; 9+ messages in thread From: Markus Trippelsdorf @ 2014-12-13 8:48 UTC (permalink / raw) To: linux-kernel Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo Running "perf top -g" built from current Linus tree apparently leaks ~300MB of memory every second an my machine. -- Markus ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second. 2014-12-13 8:48 "perf top -g" leaking ~300MB per second Markus Trippelsdorf @ 2014-12-13 9:03 ` Markus Trippelsdorf 2014-12-13 15:26 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 9+ messages in thread From: Markus Trippelsdorf @ 2014-12-13 9:03 UTC (permalink / raw) To: linux-kernel Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo On 2014.12.13 at 09:48 +0100, Markus Trippelsdorf wrote: > Running "perf top -g" built from current Linus tree apparently leaks > ~300MB of memory every second an my machine. Hmm, this is a much older problem. I just noticed this the first time today. To reproduce: Compile some application in the background (make -j4 in my case) and run "perf top -g". Perf will continue to accumulate memory until the system starts to swap and the OOM killer eventually kicks in. -- Markus ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second. 2014-12-13 9:03 ` Markus Trippelsdorf @ 2014-12-13 15:26 ` Arnaldo Carvalho de Melo 2014-12-13 18:16 ` David Ahern 0 siblings, 1 reply; 9+ messages in thread From: Arnaldo Carvalho de Melo @ 2014-12-13 15:26 UTC (permalink / raw) To: Markus Trippelsdorf Cc: linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar Em Sat, Dec 13, 2014 at 10:03:31AM +0100, Markus Trippelsdorf escreveu: > On 2014.12.13 at 09:48 +0100, Markus Trippelsdorf wrote: > > Running "perf top -g" built from current Linus tree apparently leaks > > ~300MB of memory every second an my machine. > > Hmm, this is a much older problem. I just noticed this the first time > today. > To reproduce: Compile some application in the background (make -j4 in my > case) and run "perf top -g". Perf will continue to accumulate memory > until the system starts to swap and the OOM killer eventually kicks in. Yeap, longstanding problem, try minimizing the problem using a lower frequency. The callchain code was done initially for 'report' and when I made 'top' reuse the hist_entry code allowing 'top' to collect callchains was too easy, but then we need to go thru the callchain/hists/hist_entry code to make sure that they don't leak, will try to do it... - Arnaldo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second. 2014-12-13 15:26 ` Arnaldo Carvalho de Melo @ 2014-12-13 18:16 ` David Ahern 2014-12-30 5:38 ` Namhyung Kim 0 siblings, 1 reply; 9+ messages in thread From: David Ahern @ 2014-12-13 18:16 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Markus Trippelsdorf Cc: linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote: > The callchain code was done initially for 'report' and when I made 'top' > reuse the hist_entry code allowing 'top' to collect callchains was too > easy, but then we need to go thru the callchain/hists/hist_entry code to > make sure that they don't leak, will try to do it... > As I recall it is build up of the dead_threads list. David ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second. 2014-12-13 18:16 ` David Ahern @ 2014-12-30 5:38 ` Namhyung Kim 2014-12-30 8:35 ` Markus Trippelsdorf ` (2 more replies) 0 siblings, 3 replies; 9+ messages in thread From: Namhyung Kim @ 2014-12-30 5:38 UTC (permalink / raw) To: David Ahern Cc: Arnaldo Carvalho de Melo, Markus Trippelsdorf, linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar Hi David and Markus, On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote: > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote: > >The callchain code was done initially for 'report' and when I made 'top' > >reuse the hist_entry code allowing 'top' to collect callchains was too > >easy, but then we need to go thru the callchain/hists/hist_entry code to > >make sure that they don't leak, will try to do it... > > > > As I recall it is build up of the dead_threads list. Maybe. But I guess it's because of leak of callchains.. Markus, could you please test below patch how much it affects? >From b29ccd79727654653986ab1170e0b1f5d6518035 Mon Sep 17 00:00:00 2001 From: Namhyung Kim <namhyung@kernel.org> Date: Tue, 30 Dec 2014 14:28:45 +0900 Subject: [PATCH] perf callchain: Free callchains when hist entries are deleted Markus reported that "perf top -g" can leak ~300MB per second on his machine. This is partly because it missed to free callchains when hist entries are deleted. Fix it. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++++++ tools/perf/util/callchain.h | 2 ++ tools/perf/util/hist.c | 1 + 3 files changed, 33 insertions(+) diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c index 64b377e591e4..14e7a123d43b 100644 --- a/tools/perf/util/callchain.c +++ b/tools/perf/util/callchain.c @@ -841,3 +841,33 @@ char *callchain_list__sym_name(struct callchain_list *cl, return bf; } + +static void free_callchain_node(struct callchain_node *node) +{ + struct callchain_list *list, *tmp; + struct callchain_node *child; + struct rb_node *n; + + list_for_each_entry_safe(list, tmp, &node->val, list) { + list_del(&list->list); + free(list); + } + + n = rb_first(&node->rb_root_in); + while (n) { + child = container_of(n, struct callchain_node, rb_node_in); + n = rb_next(n); + rb_erase(&child->rb_node_in, &node->rb_root_in); + + free_callchain_node(child); + free(child); + } +} + +void free_callchain(struct callchain_root *root) +{ + if (!symbol_conf.use_callchain) + return; + + free_callchain_node(&root->node); +} diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h index dbc08cf5f970..c0ec1acc38e4 100644 --- a/tools/perf/util/callchain.h +++ b/tools/perf/util/callchain.h @@ -198,4 +198,6 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused, char *callchain_list__sym_name(struct callchain_list *cl, char *bf, size_t bfsize, bool show_dso); +void free_callchain(struct callchain_root *root); + #endif /* __PERF_CALLCHAIN_H */ diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index 30ff2cb92884..e17163fcb702 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -945,6 +945,7 @@ void hist_entry__delete(struct hist_entry *he) zfree(&he->mem_info); zfree(&he->stat_acc); free_srcline(he->srcline); + free_callchain(he->callchain); free(he); } -- 2.1.3 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second. 2014-12-30 5:38 ` Namhyung Kim @ 2014-12-30 8:35 ` Markus Trippelsdorf 2014-12-30 23:04 ` Arnaldo Carvalho de Melo 2015-01-08 15:09 ` Arnaldo Carvalho de Melo 2015-01-09 12:31 ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim 2 siblings, 1 reply; 9+ messages in thread From: Markus Trippelsdorf @ 2014-12-30 8:35 UTC (permalink / raw) To: Namhyung Kim Cc: David Ahern, Arnaldo Carvalho de Melo, linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar On 2014.12.30 at 14:38 +0900, Namhyung Kim wrote: > Hi David and Markus, > > On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote: > > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote: > > >The callchain code was done initially for 'report' and when I made 'top' > > >reuse the hist_entry code allowing 'top' to collect callchains was too > > >easy, but then we need to go thru the callchain/hists/hist_entry code to > > >make sure that they don't leak, will try to do it... > > > > > > > As I recall it is build up of the dead_threads list. > > Maybe. But I guess it's because of leak of callchains.. > > Markus, could you please test below patch how much it affects? Thanks Namhyung. It leaks an order of magnitude less memory now: ~30MB/sec on my machine. Valgrind shows (last entries of the list): ... ==20512== 7,225,920 bytes in 17,370 blocks are possibly lost in loss record 295 of 301 ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) ==20512== by 0x4A6996: zalloc (util.h:189) ==20512== by 0x4A6996: hist_entry__new (hist.c:309) ==20512== by 0x4A8249: add_hist_entry (hist.c:431) ==20512== by 0x4A8249: __hists__add_entry (hist.c:477) ==20512== by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730) ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876) ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) ==20512== by 0x4395EE: perf_top__mmap_read (builtin-top.c:871) ==20512== by 0x4395EE: __cmd_top (builtin-top.c:974) ==20512== by 0x4395EE: cmd_top (builtin-top.c:1266) ==20512== by 0x41B702: run_builtin (perf.c:341) ==20512== by 0x41AE51: handle_internal_command (perf.c:400) ==20512== by 0x41AE51: run_argv (perf.c:444) ==20512== by 0x41AE51: main (perf.c:559) ==20512== ==20512== 8,922,480 bytes in 159,330 blocks are possibly lost in loss record 296 of 301 ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) ==20512== by 0x4821FE: zalloc (util.h:189) ==20512== by 0x4821FE: fill_node (callchain.c:450) ==20512== by 0x4821FE: add_child (callchain.c:473) ==20512== by 0x4821FE: append_chain_children (callchain.c:596) ==20512== by 0x48514E: callchain_append (callchain.c:672) ==20512== by 0x4A8947: iter_add_single_cumulative_entry (hist.c:739) ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876) ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) ==20512== by 0x41B702: run_builtin (perf.c:341) ==20512== by 0x41AE51: handle_internal_command (perf.c:400) ==20512== by 0x41AE51: run_argv (perf.c:444) ==20512== by 0x41AE51: main (perf.c:559) ==20512== ==20512== 11,050,136 bytes in 1,663 blocks are definitely lost in loss record 297 of 301 ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) ==20512== by 0x44F7BF: zalloc (util.h:189) ==20512== by 0x44F7BF: symbol__alloc_hist (annotate.c:455) ==20512== by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507) ==20512== by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521) ==20512== by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195) ==20512== by 0x437A65: hist_iter__top_callback (builtin-top.c:688) ==20512== by 0x4A8AC8: hist_entry_iter__add (hist.c:892) ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) ==20512== by 0x41B702: run_builtin (perf.c:341) ==20512== by 0x41AE51: handle_internal_command (perf.c:400) ==20512== by 0x41AE51: run_argv (perf.c:444) ==20512== by 0x41AE51: main (perf.c:559) ==20512== ==20512== 24,920,064 bytes in 59,904 blocks are possibly lost in loss record 298 of 301 ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) ==20512== by 0x4A6996: zalloc (util.h:189) ==20512== by 0x4A6996: hist_entry__new (hist.c:309) ==20512== by 0x4A8249: add_hist_entry (hist.c:431) ==20512== by 0x4A8249: __hists__add_entry (hist.c:477) ==20512== by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730) ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876) ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) ==20512== by 0x41B702: run_builtin (perf.c:341) ==20512== by 0x41AE51: handle_internal_command (perf.c:400) ==20512== by 0x41AE51: run_argv (perf.c:444) ==20512== by 0x41AE51: main (perf.c:559) ==20512== ==20512== 26,147,680 bytes in 62,855 blocks are possibly lost in loss record 299 of 301 ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) ==20512== by 0x4A6996: zalloc (util.h:189) ==20512== by 0x4A6996: hist_entry__new (hist.c:309) ==20512== by 0x4A8249: add_hist_entry (hist.c:431) ==20512== by 0x4A8249: __hists__add_entry (hist.c:477) ==20512== by 0x4A8728: iter_add_next_cumulative_entry (hist.c:803) ==20512== by 0x4A8AA5: hist_entry_iter__add (hist.c:887) ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) ==20512== by 0x41B702: run_builtin (perf.c:341) ==20512== by 0x41AE51: handle_internal_command (perf.c:400) ==20512== by 0x41AE51: run_argv (perf.c:444) ==20512== by 0x41AE51: main (perf.c:559) ==20512== ==20512== 44,939,792 bytes in 3,788 blocks are definitely lost in loss record 300 of 301 ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) ==20512== by 0x44F7BF: zalloc (util.h:189) ==20512== by 0x44F7BF: symbol__alloc_hist (annotate.c:455) ==20512== by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507) ==20512== by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521) ==20512== by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195) ==20512== by 0x437A65: hist_iter__top_callback (builtin-top.c:688) ==20512== by 0x4A8AC8: hist_entry_iter__add (hist.c:892) ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) ==20512== by 0x4395EE: perf_top__mmap_read (builtin-top.c:871) ==20512== by 0x4395EE: __cmd_top (builtin-top.c:974) ==20512== by 0x4395EE: cmd_top (builtin-top.c:1266) ==20512== by 0x41B702: run_builtin (perf.c:341) ==20512== by 0x41AE51: handle_internal_command (perf.c:400) ==20512== by 0x41AE51: run_argv (perf.c:444) ==20512== by 0x41AE51: main (perf.c:559) ==20512== ==20512== 81,248,440 bytes in 1,450,865 blocks are possibly lost in loss record 301 of 301 ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) ==20512== by 0x4821FE: zalloc (util.h:189) ==20512== by 0x4821FE: fill_node (callchain.c:450) ==20512== by 0x4821FE: add_child (callchain.c:473) ==20512== by 0x4821FE: append_chain_children (callchain.c:596) ==20512== by 0x48514E: callchain_append (callchain.c:672) ==20512== by 0x4A8768: iter_add_next_cumulative_entry (hist.c:812) ==20512== by 0x4A8AA5: hist_entry_iter__add (hist.c:887) ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) ==20512== by 0x41B702: run_builtin (perf.c:341) ==20512== by 0x41AE51: handle_internal_command (perf.c:400) ==20512== by 0x41AE51: run_argv (perf.c:444) ==20512== by 0x41AE51: main (perf.c:559) ==20512== ==20512== LEAK SUMMARY: ==20512== definitely lost: 56,810,427 bytes in 5,479 blocks ==20512== indirectly lost: 4,000 bytes in 125 blocks ==20512== possibly lost: 210,493,048 bytes in 2,552,578 blocks ==20512== still reachable: 18,258 bytes in 242 blocks ==20512== suppressed: 0 bytes in 0 blocks -- Markus ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second. 2014-12-30 8:35 ` Markus Trippelsdorf @ 2014-12-30 23:04 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 9+ messages in thread From: Arnaldo Carvalho de Melo @ 2014-12-30 23:04 UTC (permalink / raw) To: Markus Trippelsdorf Cc: Namhyung Kim, David Ahern, linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar Em Tue, Dec 30, 2014 at 09:35:24AM +0100, Markus Trippelsdorf escreveu: > On 2014.12.30 at 14:38 +0900, Namhyung Kim wrote: > > Hi David and Markus, > > > > On Sat, Dec 13, 2014 at 11:16:43AM -0700, David Ahern wrote: > > > On 12/13/14 8:26 AM, Arnaldo Carvalho de Melo wrote: > > > >The callchain code was done initially for 'report' and when I made 'top' > > > >reuse the hist_entry code allowing 'top' to collect callchains was too > > > >easy, but then we need to go thru the callchain/hists/hist_entry code to > > > >make sure that they don't leak, will try to do it... > > > > > > > > > > As I recall it is build up of the dead_threads list. > > > > Maybe. But I guess it's because of leak of callchains.. > > > > Markus, could you please test below patch how much it affects? > > Thanks Namhyung. It leaks an order of magnitude less memory now: > ~30MB/sec on my machine. > > Valgrind shows (last entries of the list): > ... > ==20512== 7,225,920 bytes in 17,370 blocks are possibly lost in loss record 295 of 301 > ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) > ==20512== by 0x4A6996: zalloc (util.h:189) > ==20512== by 0x4A6996: hist_entry__new (hist.c:309) > ==20512== by 0x4A8249: add_hist_entry (hist.c:431) > ==20512== by 0x4A8249: __hists__add_entry (hist.c:477) > ==20512== by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730) > ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876) > ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) > ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) > ==20512== by 0x4395EE: perf_top__mmap_read (builtin-top.c:871) > ==20512== by 0x4395EE: __cmd_top (builtin-top.c:974) > ==20512== by 0x4395EE: cmd_top (builtin-top.c:1266) > ==20512== by 0x41B702: run_builtin (perf.c:341) > ==20512== by 0x41AE51: handle_internal_command (perf.c:400) > ==20512== by 0x41AE51: run_argv (perf.c:444) > ==20512== by 0x41AE51: main (perf.c:559) #1) Ok, those are the hist_entries that were not decayed, if top continued they would eventually be decayed, freed, etc, i.e. the exit of top is equivalent to the last decay. > ==20512== > ==20512== 8,922,480 bytes in 159,330 blocks are possibly lost in loss record 296 of 301 > ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) > ==20512== by 0x4821FE: zalloc (util.h:189) > ==20512== by 0x4821FE: fill_node (callchain.c:450) > ==20512== by 0x4821FE: add_child (callchain.c:473) > ==20512== by 0x4821FE: append_chain_children (callchain.c:596) > ==20512== by 0x48514E: callchain_append (callchain.c:672) > ==20512== by 0x4A8947: iter_add_single_cumulative_entry (hist.c:739) > ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876) > ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) > ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) > ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) > ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) > ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) > ==20512== by 0x41B702: run_builtin (perf.c:341) > ==20512== by 0x41AE51: handle_internal_command (perf.c:400) > ==20512== by 0x41AE51: run_argv (perf.c:444) > ==20512== by 0x41AE51: main (perf.c:559) > ==20512== > ==20512== 11,050,136 bytes in 1,663 blocks are definitely lost in loss record 297 of 301 > ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) > ==20512== by 0x44F7BF: zalloc (util.h:189) > ==20512== by 0x44F7BF: symbol__alloc_hist (annotate.c:455) > ==20512== by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507) > ==20512== by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521) > ==20512== by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195) > ==20512== by 0x437A65: hist_iter__top_callback (builtin-top.c:688) > ==20512== by 0x4A8AC8: hist_entry_iter__add (hist.c:892) > ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) > ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) > ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) > ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) > ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) > ==20512== by 0x41B702: run_builtin (perf.c:341) > ==20512== by 0x41AE51: handle_internal_command (perf.c:400) > ==20512== by 0x41AE51: run_argv (perf.c:444) > ==20512== by 0x41AE51: main (perf.c:559) These (symbol__alloc_hist) we need to free up the decaying of per RIP annotation gets to zero... Will check where to do this... > ==20512== > ==20512== 24,920,064 bytes in 59,904 blocks are possibly lost in loss record 298 of 301 > ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) > ==20512== by 0x4A6996: zalloc (util.h:189) > ==20512== by 0x4A6996: hist_entry__new (hist.c:309) > ==20512== by 0x4A8249: add_hist_entry (hist.c:431) > ==20512== by 0x4A8249: __hists__add_entry (hist.c:477) > ==20512== by 0x4A8902: iter_add_single_cumulative_entry (hist.c:730) > ==20512== by 0x4A8A64: hist_entry_iter__add (hist.c:876) > ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) > ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) > ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) > ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) > ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) > ==20512== by 0x41B702: run_builtin (perf.c:341) > ==20512== by 0x41AE51: handle_internal_command (perf.c:400) > ==20512== by 0x41AE51: run_argv (perf.c:444) > ==20512== by 0x41AE51: main (perf.c:559) Same thing as #1. > ==20512== > ==20512== 26,147,680 bytes in 62,855 blocks are possibly lost in loss record 299 of 301 > ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) > ==20512== by 0x4A6996: zalloc (util.h:189) > ==20512== by 0x4A6996: hist_entry__new (hist.c:309) > ==20512== by 0x4A8249: add_hist_entry (hist.c:431) > ==20512== by 0x4A8249: __hists__add_entry (hist.c:477) > ==20512== by 0x4A8728: iter_add_next_cumulative_entry (hist.c:803) > ==20512== by 0x4A8AA5: hist_entry_iter__add (hist.c:887) > ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) > ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) > ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) > ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) > ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) > ==20512== by 0x41B702: run_builtin (perf.c:341) > ==20512== by 0x41AE51: handle_internal_command (perf.c:400) > ==20512== by 0x41AE51: run_argv (perf.c:444) > ==20512== by 0x41AE51: main (perf.c:559) > ==20512== > ==20512== 44,939,792 bytes in 3,788 blocks are definitely lost in loss record 300 of 301 > ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) > ==20512== by 0x44F7BF: zalloc (util.h:189) > ==20512== by 0x44F7BF: symbol__alloc_hist (annotate.c:455) > ==20512== by 0x44F7BF: symbol__inc_addr_samples (annotate.c:507) > ==20512== by 0x44F7BF: hist_entry__inc_addr_samples (annotate.c:521) > ==20512== by 0x437A65: perf_top__record_precise_ip (builtin-top.c:195) > ==20512== by 0x437A65: hist_iter__top_callback (builtin-top.c:688) > ==20512== by 0x4A8AC8: hist_entry_iter__add (hist.c:892) > ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) > ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) > ==20512== by 0x4395EE: perf_top__mmap_read (builtin-top.c:871) > ==20512== by 0x4395EE: __cmd_top (builtin-top.c:974) > ==20512== by 0x4395EE: cmd_top (builtin-top.c:1266) > ==20512== by 0x41B702: run_builtin (perf.c:341) > ==20512== by 0x41AE51: handle_internal_command (perf.c:400) > ==20512== by 0x41AE51: run_argv (perf.c:444) > ==20512== by 0x41AE51: main (perf.c:559) > ==20512== > ==20512== 81,248,440 bytes in 1,450,865 blocks are possibly lost in loss record 301 of 301 > ==20512== at 0x402B000: calloc (vg_replace_malloc.c:623) > ==20512== by 0x4821FE: zalloc (util.h:189) > ==20512== by 0x4821FE: fill_node (callchain.c:450) > ==20512== by 0x4821FE: add_child (callchain.c:473) > ==20512== by 0x4821FE: append_chain_children (callchain.c:596) > ==20512== by 0x48514E: callchain_append (callchain.c:672) > ==20512== by 0x4A8768: iter_add_next_cumulative_entry (hist.c:812) > ==20512== by 0x4A8AA5: hist_entry_iter__add (hist.c:887) > ==20512== by 0x43787A: perf_event__process_sample (builtin-top.c:787) > ==20512== by 0x43787A: perf_top__mmap_read_idx (builtin-top.c:854) > ==20512== by 0x43967E: perf_top__mmap_read (builtin-top.c:871) > ==20512== by 0x43967E: __cmd_top (builtin-top.c:996) > ==20512== by 0x43967E: cmd_top (builtin-top.c:1266) > ==20512== by 0x41B702: run_builtin (perf.c:341) > ==20512== by 0x41AE51: handle_internal_command (perf.c:400) > ==20512== by 0x41AE51: run_argv (perf.c:444) > ==20512== by 0x41AE51: main (perf.c:559) > ==20512== > ==20512== LEAK SUMMARY: > ==20512== definitely lost: 56,810,427 bytes in 5,479 blocks > ==20512== indirectly lost: 4,000 bytes in 125 blocks > ==20512== possibly lost: 210,493,048 bytes in 2,552,578 blocks > ==20512== still reachable: 18,258 bytes in 242 blocks > ==20512== suppressed: 0 bytes in 0 blocks > > -- > Markus ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: "perf top -g" leaking ~300MB per second. 2014-12-30 5:38 ` Namhyung Kim 2014-12-30 8:35 ` Markus Trippelsdorf @ 2015-01-08 15:09 ` Arnaldo Carvalho de Melo 2015-01-09 12:31 ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim 2 siblings, 0 replies; 9+ messages in thread From: Arnaldo Carvalho de Melo @ 2015-01-08 15:09 UTC (permalink / raw) To: Namhyung Kim Cc: David Ahern, Markus Trippelsdorf, linux-kernel, Peter Zijlstra, Paul Mackerras, Ingo Molnar Em Tue, Dec 30, 2014 at 02:38:13PM +0900, Namhyung Kim escreveu: > Subject: [PATCH] perf callchain: Free callchains when hist entries are deleted > Markus reported that "perf top -g" can leak ~300MB per second on his > machine. This is partly because it missed to free callchains when > hist entries are deleted. Fix it. Thanks, applied. - Arnaldo ^ permalink raw reply [flat|nested] 9+ messages in thread
* [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted 2014-12-30 5:38 ` Namhyung Kim 2014-12-30 8:35 ` Markus Trippelsdorf 2015-01-08 15:09 ` Arnaldo Carvalho de Melo @ 2015-01-09 12:31 ` tip-bot for Namhyung Kim 2 siblings, 0 replies; 9+ messages in thread From: tip-bot for Namhyung Kim @ 2015-01-09 12:31 UTC (permalink / raw) To: linux-tip-commits Cc: hpa, mingo, linux-kernel, namhyung, acme, fweisbec, dsahern, a.p.zijlstra, markus, tglx, paulus, mingo Commit-ID: d114960c488b5a95705a04bba305f931cef0efd6 Gitweb: http://git.kernel.org/tip/d114960c488b5a95705a04bba305f931cef0efd6 Author: Namhyung Kim <namhyung@kernel.org> AuthorDate: Tue, 30 Dec 2014 14:38:13 +0900 Committer: Arnaldo Carvalho de Melo <acme@redhat.com> CommitDate: Thu, 8 Jan 2015 11:56:35 -0300 perf callchain: Free callchains when hist entries are deleted Markus reported that "perf top -g" can leak ~300MB per second on his machine. This is partly because it missed to free callchains when hist entries are deleted. Fix it. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20141230053813.GD6081@sejong Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/util/callchain.c | 30 ++++++++++++++++++++++++++++++ tools/perf/util/callchain.h | 2 ++ tools/perf/util/hist.c | 1 + 3 files changed, 33 insertions(+) diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c index 64b377e..14e7a12 100644 --- a/tools/perf/util/callchain.c +++ b/tools/perf/util/callchain.c @@ -841,3 +841,33 @@ char *callchain_list__sym_name(struct callchain_list *cl, return bf; } + +static void free_callchain_node(struct callchain_node *node) +{ + struct callchain_list *list, *tmp; + struct callchain_node *child; + struct rb_node *n; + + list_for_each_entry_safe(list, tmp, &node->val, list) { + list_del(&list->list); + free(list); + } + + n = rb_first(&node->rb_root_in); + while (n) { + child = container_of(n, struct callchain_node, rb_node_in); + n = rb_next(n); + rb_erase(&child->rb_node_in, &node->rb_root_in); + + free_callchain_node(child); + free(child); + } +} + +void free_callchain(struct callchain_root *root) +{ + if (!symbol_conf.use_callchain) + return; + + free_callchain_node(&root->node); +} diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h index dbc08cf..c0ec1ac 100644 --- a/tools/perf/util/callchain.h +++ b/tools/perf/util/callchain.h @@ -198,4 +198,6 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused, char *callchain_list__sym_name(struct callchain_list *cl, char *bf, size_t bfsize, bool show_dso); +void free_callchain(struct callchain_root *root); + #endif /* __PERF_CALLCHAIN_H */ diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index 0ced178..1823955 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -947,6 +947,7 @@ void hist_entry__free(struct hist_entry *he) zfree(&he->mem_info); zfree(&he->stat_acc); free_srcline(he->srcline); + free_callchain(he->callchain); free(he); } ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2015-01-09 12:31 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-12-13 8:48 "perf top -g" leaking ~300MB per second Markus Trippelsdorf 2014-12-13 9:03 ` Markus Trippelsdorf 2014-12-13 15:26 ` Arnaldo Carvalho de Melo 2014-12-13 18:16 ` David Ahern 2014-12-30 5:38 ` Namhyung Kim 2014-12-30 8:35 ` Markus Trippelsdorf 2014-12-30 23:04 ` Arnaldo Carvalho de Melo 2015-01-08 15:09 ` Arnaldo Carvalho de Melo 2015-01-09 12:31 ` [tip:perf/urgent] perf callchain: Free callchains when hist entries are deleted tip-bot for Namhyung Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox