From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Arnaldo Carvalho de Melo" <acme@infradead.org>,
	"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>
Subject: Re: [GIT PULL] perf changes for v3.12
Date: Thu, 5 Sep 2013 12:56:39 +0200
Message-ID: <20130905105639.GB21407@gmail.com>
In-Reply-To: <CA+55aFyyJpCC0EFfPpGYi+6goWbA+LZbrkFcLNW3x8xYHfKvdQ@mail.gmail.com>


(Cc:-ed Frederic and Namhyung as well, it's about bad overhead in 
tools/perf/util/hist.c.)

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Sep 3, 2013 at 6:29 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > Please pull the latest perf-core-for-linus git tree from:
> 
> I don't think this is new at all, but I just tried to do a perf
> record/report of "make -j64 test" on git:
> 
> It's a big perf.data file (1.6G), but after it has done the
> "processing time ordered events" thing it results in:
> 
> ┌─Warning:───────────────────────────────────┐
> │Processed 8672030 events and lost 71 chunks!│
> │Check IO/CPU overload!                      │
> │                                            │
> │                                            │
> │Press any key...                            │
> └────────────────────────────────────────────┘
> 
> and then it just hangs using 100% CPU time. Pressing any key doesn't
> do anything.
> 
> It may well still be *doing* something, and maybe it will come back
> some day with results. But it sure doesn't show any indication that it
> will.
> 
> Try this (in a current git source tree: note, by "git" I actually mean
> git itself, not some random git repository)::
> 
>     perf record -g -e cycles:pp make -j64 test >& out
>     perf report
> 
> maybe you can reproduce it.

I managed to reproduce it on a 32-way box via:

     perf record -g make -j64 bzImage >/dev/null 2>&1

It's easier to debug it without the TUI:

     perf --no-pager report --stdio

It turns out that even with a 400 MB perf.data the 'perf report' call will 
eventually finish - here it ran for almost half an hour(!) on a fast box.

Arnaldo, the large overhead is in hists__collapse_resort(), in particular 
it's doing append_chain_children() 99% of the time:

-  99.74%  perf  perf               [.] append_chain_children
   - append_chain_children
      - 99.76% merge_chain_branch
         - merge_chain_branch
            + 98.04% hists__collapse_resort
            + 1.96% merge_chain_branch
+   0.05%  perf  perf               [.] merge_chain_branch
+   0.03%  perf  libc-2.17.so       [.] _int_free
+   0.03%  perf  libc-2.17.so       [.] __libc_calloc
+   0.02%  perf  [kernel.kallsyms]  [k] account_user_time
+   0.02%  perf  libc-2.17.so       [.] _int_malloc

It seems to be stuck in hists__collapse_resort().

In particular, the overhead arises because the following loop in 
append_chain_children():

        /* lookup in childrens */
        chain_for_each_child(rnode, root) {
                unsigned int ret = append_chain(rnode, cursor, period);

reaches very high iteration counts, and the algorithm becomes quadratic (at 
least). The child count reaches over 100,000 entries in the end (!).

I don't think the high child count in itself is anomalous: a kernel build 
generates thousands of processes, tons of symbol ranges and tens of 
millions of call chain entries.

So I think what we need here is to speed up the lookup: put children into 
a secondary, ->pos,len indexed range-rbtree and do a binary search instead 
of a linear search over 100,000 child entries ... or something like that.
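
To illustrate the difference in lookup cost, here is a minimal standalone 
sketch, not perf code. The struct, its single 'ip' key and both function 
names are hypothetical; the real children would be keyed by a range, and a 
sorted array stands in for the rbtree only to keep the example short:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a callchain child node, keyed by one address. */
struct child {
	uint64_t ip;      /* lookup key (assumption: a single address) */
	uint64_t period;  /* accumulated sample weight */
};

/* Linear lookup: O(n) per call, the same shape as the loop quoted above. */
static struct child *find_linear(struct child *v, size_t n, uint64_t ip)
{
	for (size_t i = 0; i < n; i++)
		if (v[i].ip == ip)
			return &v[i];
	return NULL;
}

/* bsearch() comparator: first argument is the key, second an array element. */
static int cmp_child(const void *key, const void *elem)
{
	uint64_t k = *(const uint64_t *)key;
	uint64_t e = ((const struct child *)elem)->ip;

	return (k > e) - (k < e);
}

/* Binary search over children kept sorted by ip: O(log n) per call. */
static struct child *find_sorted(struct child *v, size_t n, uint64_t ip)
{
	return bsearch(&ip, v, n, sizeof(*v), cmp_child);
}
```

With 100,000 children the sorted lookup needs about 17 comparisons per 
call instead of up to 100,000. A sorted array still pays O(n) per 
insertion, which is why a balanced tree is the more likely fit here.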

Btw., a side note, append_chain() is a rather confusing function in 
itself, with logic-inversion gems like:

                if (!found)
                        found = true;

All that should be cleaned up as well I guess.
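
Taken in isolation (the surrounding append_chain() code is not shown here, 
so this is only a sketch of the two-line pattern itself), the guarded 
assignment behaves the same as a plain one. Both helper names below are 
made up for the illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* The quoted form: set the flag only if it is not already set. */
static bool mark_found_guarded(bool found)
{
	if (!found)
		found = true;
	return found;
}

/* Simpler equivalent: the flag ends up true either way. */
static bool mark_found_plain(bool found)
{
	(void)found;
	return true;
}
```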

The 'IO overload' message appears to be a separate, unrelated bug, it just 
annoyingly does not get refreshed away in the TUI before 
hists__collapse_resort() is called, and there's also no progress bar for 
the hists__collapse_resort() pass, so to the user it all looks like a 
deadlock.

So there are at least two bugs here:

  - the bad overhead in hists__collapse_resort()

  - bad usability if hists__collapse_resort() takes more than 1 second to finish
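
For the usability part, one possible shape is a periodic progress callback 
from inside the long pass. This is a rough sketch only: progress_fn, 
long_pass() and count_calls() are hypothetical names, not existing perf 
interfaces, and the per-entry work is elided:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives entries done and total. */
typedef void (*progress_fn)(size_t done, size_t total, void *ctx);

/*
 * Walk 'total' entries, calling 'report' after every 'step' entries and
 * exactly once more at the end, so the UI can always draw 100%.
 */
static void long_pass(size_t total, size_t step, progress_fn report, void *ctx)
{
	assert(step > 0);

	for (size_t i = 0; i < total; i++) {
		/* ... per-entry work would go here ... */
		if ((i + 1) % step == 0 && i + 1 < total)
			report(i + 1, total, ctx);
	}
	report(total, total, ctx);
}

/* Example callback: only counts how often it was invoked. */
static void count_calls(size_t done, size_t total, void *ctx)
{
	(void)done;
	(void)total;
	++*(size_t *)ctx;
}
```

A real callback would redraw a progress bar instead of counting, and the 
step would be chosen so that updates stay cheap relative to the work.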

Thanks,

	Ingo
