From: Mark Hatle <mark.hatle@windriver.com>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: poky <poky@yoctoproject.org>
Subject: Re: Quick hack for profiling tasks
Date: Mon, 31 Jan 2011 19:43:29 -0600
Message-ID: <4D476541.7040703@windriver.com>
In-Reply-To: <1296520850.13501.16052.camel@rex>

On 1/31/11 6:40 PM, Richard Purdie wrote:
> On Tue, 2011-02-01 at 00:28 +0000, Richard Purdie wrote:
>> One thing that is bugging me whilst I've been debugging some issues
>> we're having with the libc/libgcc package dependency issue is how long
>> do_package takes for libc. The question is where does it spend the time?
>> Answer, I have no idea.
>>
>> I hacked together the patch below to find out. It's ugly and uses the
>> boilerplate profiling code from cooker, cut and pasted here to profile
>> the actual tasks that run.
>>
>> I've yet to look at the results but it should allow us to optimise the
>> python tasks a bit if we can see where they spend time. I'm hoping this
>> lets others look at that too and also it gives us some hints as to how we
>> might improve the core when turning on profiling in bitbake.
>
> For eglibc this worked out as:
>
> Tue Feb 1 00:33:21 2011 profile-eglibc_2.12.bb-do_package.log
>
> 8339733 function calls (8001600 primitive calls) in 877.972 CPU seconds
>
> Ordered by: internal time
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>      3206  321.887    0.100  322.422    0.101 package_do_filedeps:12(process_deps)
>       403  311.208    0.772  311.208    0.772 {posix.waitpid}
>    134054   69.860    0.001   69.860    0.001 {method 'read' of 'file' objects}
>    225554   23.367    0.000   23.367    0.000 {posix.stat}
>       866   20.279    0.023   20.279    0.023 {posix.system}
>     85562   19.406    0.000   19.406    0.000 {posix.chmod}
>    168083   16.691    0.000   16.691    0.000 {posix.lstat}
>     25824   14.399    0.001   14.399    0.001 {posix.rename}
>     55391   13.731    0.000   13.731    0.000 {open}
>      5325    9.019    0.002    9.019    0.002 {posix.popen}
>      2279    5.490    0.002    5.490    0.002 {method 'readlines' of 'file' objects}
>      6403    5.187    0.001    6.346    0.001 insane.bbclass:1(package_qa_hash_style)
>     19214    5.046    0.000    5.046    0.000 {posix.mkdir}
>
> so it's spending a third of the time in package_do_filedeps(), a lot of
> which is in waitpid waiting for the process that was spawned.
>
> Mark: Is there a way we could batch up the information rather than go
> file by file? I'm going to look at this for other areas to improve too
> but that's obviously one worthy of attention.

The way the routine works today is via a script called perfile_rpmdeps.sh. The
package.bbclass calls this script twice for each package-split.

I.e. if we have base, base-dbg and base-libs, it will run a total of 6
times. In each pair of calls it is simply passed the path to the packages-split
directory.

Within the script itself, it is doing a find operation:

  find "$@" | process $process_type

The output of the find is passed to "process" which is just a wrapper that calls
the rpmdeps program with the correct parameters. We could optimize this a bit
by ignoring directories and symlinks. But we still want to process all of the
files in the system.

Another optimization (that we do NOT have) that is done by default in RPM, is to
only process files that are +x. We have chosen not to do this as most of our
libraries are not set +x. An alternative is to use 'file' and check the type of
each file; however, identifying the file type is likely to take longer than
simply running the per-file deps commands. Another possible optimization is to
only scan certain directories (or the opposite, skip certain directories...). The
only issue here is missing files that may be dlopened or loaded via RPATH
because they are in a non-standard location.

So what I'd recommend is we start by adding "-type f" to the find. That is
likely to help some.
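A rough sketch of that first step, run against a throwaway tree rather than a
real packages-split directory. list_regular_files is a made-up name standing in
for the find step of the script, and the tree layout is invented purely for
illustration. It only shows that "-type f" stops directories and symlinks from
being handed to the per-file scan:

```shell
#!/bin/sh
# Sketch only: list_regular_files is a hypothetical stand-in for the find step
# in perfile_rpmdeps.sh; the real script pipes find's output to "process".
list_regular_files() {
    find "$@" -type f
}

# Throwaway tree (assumed layout): one library, one symlink to it, plus the
# directories that contain them.
demo=$(mktemp -d)
mkdir -p "$demo/usr/lib"
: > "$demo/usr/lib/libfoo.so.1"
ln -s libfoo.so.1 "$demo/usr/lib/libfoo.so"

# Count what each variant would hand to the per-file dependency scan.
all_count=$(find "$demo" | wc -l | tr -d ' ')
file_count=$(list_regular_files "$demo" | wc -l | tr -d ' ')
echo "plain find: $all_count entries; with -type f: $file_count entries"

rm -rf "$demo"
```

The symlink is dropped because find does not follow symlinks by default, so the
library it points at is still scanned once through its real name.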
Maybe then add a check that the file either sits in /lib or /usr/lib _or_ has mode +x?
We would likely need to audit the system somehow and tag ELF files that are
neither...
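Something along these lines is what I have in mind, purely as a sketch.
select_candidates is a hypothetical helper, not something in perfile_rpmdeps.sh
today, and it assumes it is handed a single package root so that lib/ and
usr/lib/ can be matched relative to it. The demo tree is invented; the
usr/libexec plugin is there to show the kind of file this check would miss:

```shell
#!/bin/sh
# Hypothetical helper: print regular files under one package root that either
# live below lib/ or usr/lib/, or have any execute bit set.
# Assumes the root path itself contains no glob characters.
select_candidates() {
    root=$1
    find "$root" -type f \( \
        -path "$root/lib/*" -o \
        -path "$root/usr/lib/*" -o \
        -perm -100 -o -perm -010 -o -perm -001 \
    \)
}

# Throwaway package root (assumed layout, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/usr/lib" "$demo/usr/bin" "$demo/usr/share/doc" "$demo/usr/libexec/foo"
: > "$demo/usr/lib/libfoo.so.1";        chmod 644 "$demo/usr/lib/libfoo.so.1"
: > "$demo/usr/bin/foo";                chmod 755 "$demo/usr/bin/foo"
: > "$demo/usr/share/doc/README";       chmod 644 "$demo/usr/share/doc/README"
: > "$demo/usr/libexec/foo/plugin.so";  chmod 644 "$demo/usr/libexec/foo/plugin.so"

# Paths relative to the package root, sorted for stable output.
selected=$(select_candidates "$demo" | sed "s|^$demo/||" | sort)
printf '%s\n' "$selected"

rm -rf "$demo"
```

The non-executable library under usr/lib and the executable under usr/bin are
kept, the README is skipped, and so is plugin.so under usr/libexec, which is
exactly the "neither" case that would need auditing and tagging.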
--Mark
> Cheers,
>
> Richard
>