* Performance regression in bitbake and exec() vs fork()
From: Richard Purdie @ 2010-12-09 12:56 UTC
To: poky
I'd like to update people on where this is at. Just to be clear this has
largely been done by the Pseudo maintainer and Mark Hatle, I just agreed
to summarise things.
A branch of pseudo with the needed updates to be able to enable/disable
upon fork/exec/clone is available in poky-contrib on branch
mhatle/pseudo_git. To build this you need to do something like:
./configure --prefix=<poky dir>/pseudo --bits=64
make
make install
(make test)
go to the poky dir
PSEUDO_DISABLED=1 ./pseudo/bin/pseudo
. ./poky-init-env .....
run bitbake commands as normal
Mark's version of a patch I quickly hacked together to change bitbake to
use fork() again rather than exec is:
http://git.yoctoproject.org/cgit/cgit.cgi/poky-contrib/commit/?h=mhatle/pseudo_fork&id=aa64a80540ea236d8c1e8439dd236badc068749c
The changes in lib/bb/siggen.py are known to be a hack to work around
certain problems and this patch is a work in progress.
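For anyone wanting to see the per-task cost difference outside bitbake, here is a minimal sketch rather than the actual patch. It times an empty task run in a forked child against the same fork followed by an exec of a fresh Python interpreter. The names run_forked, run_execed and empty_task are made up for illustration and nothing here is bitbake code.

```python
import os
import sys
import time


def run_forked(task):
    """Run task() in a forked child and return the wall-clock seconds taken."""
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        # Child: inherits the parent's already-initialised interpreter state.
        try:
            task()
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    return time.perf_counter() - start


def run_execed(argv):
    """Fork, exec argv in the child, and return the wall-clock seconds taken."""
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        # Child: replaces itself, so the new program starts from scratch.
        try:
            os.execv(argv[0], argv)
        finally:
            # Only reached if execv itself failed.
            os._exit(127)
    os.waitpid(pid, 0)
    return time.perf_counter() - start


def empty_task():
    """Stand-in for a task that does almost no work."""
    pass


if __name__ == "__main__":
    runs = 20
    forked = sum(run_forked(empty_task) for _ in range(runs)) / runs
    execed = sum(
        run_execed([sys.executable, "-c", "pass"]) for _ in range(runs)
    ) / runs
    print(f"fork only:   {forked * 1000:.2f} ms per task")
    print(f"fork + exec: {execed * 1000:.2f} ms per task")
```

The absolute numbers will vary by machine. The point is only that the exec path pays interpreter start-up again for every task, which the fork path avoids.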
Also, I'd like to make it clear, some of the pseudo changes have not yet
been reviewed/accepted by the pseudo maintainer so everything is subject
to change. This email is just to summarise where we're at with this and
bring everyone up to speed and invite others to look at the changes.
The good news is that it appears to roughly work and we get a suitable
speedup back.
Cheers,
Richard
* Re: Performance regression in bitbake and exec() vs fork()
From: Richard Purdie @ 2010-12-17 22:44 UTC
To: poky
To update on the current status of bitbake exec() vs fork(), we now have
the following branch:
http://git.pokylinux.org/cgit.cgi/poky-contrib/log/?h=rpurdie/newpseudo
which changes Poky to use the new version of pseudo, adds in an
appropriate wrapper script for bitbake and starts enabling/disabling
pseudo as appropriate as well as switching back to fork() instead of
exec().
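As a rough illustration of the enabling/disabling idea, and not the code on the branch, the sketch below forks a child per task and sets PSEUDO_DISABLED in that child only, leaving the parent's environment alone. It assumes that "1" means disabled and "0" means enabled. run_task and its use_pseudo argument are hypothetical names, and nothing here actually loads pseudo.

```python
import os


def run_task(task, use_pseudo):
    """Fork a child, set PSEUDO_DISABLED for it, run task(), return exit status."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            # Assumption: "1" disables pseudo and "0" enables it for this child.
            os.environ["PSEUDO_DISABLED"] = "0" if use_pseudo else "1"
            task()
            status = 0
        finally:
            # Leave without running the parent's cleanup handlers.
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


if __name__ == "__main__":
    def show_setting():
        print("child sees PSEUDO_DISABLED =", os.environ["PSEUDO_DISABLED"],
              flush=True)

    run_task(show_setting, use_pseudo=True)
    run_task(show_setting, use_pseudo=False)
```

Because the variable is changed after the fork, each task gets its own setting without the parent having to exec a new process to pick it up.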
We're still trying to work out exactly what this means for performance.
Things 'feel' very much faster and I have at least one use case where
fairly empty tasks run about twice as fast, which shows the increase in
task execution speed. My test was:
rm 'sstate-*qemu-config*'
MACHINE=qemux86 bitbake qemu-config -c clean
time MACHINE=qemux86 bitbake qemu-config
On the current master branch the timings were:
real 1m8.529s
user 0m58.870s
sys 0m4.690s
Whilst with newpseudo:
real 0m34.264s
user 0m26.200s
sys 0m2.340s
This is good as it gets us back to a much snappier feeling bitbake.
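To put a number on the difference, the small sketch below converts the two sets of timings above to seconds and prints the ratio for each row. The helper name to_seconds is made up for illustration.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '1m8.529s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Figures copied from the two runs above.
master = {"real": "1m8.529s", "user": "0m58.870s", "sys": "0m4.690s"}
newpseudo = {"real": "0m34.264s", "user": "0m26.200s", "sys": "0m2.340s"}

for key in ("real", "user", "sys"):
    ratio = to_seconds(master[key]) / to_seconds(newpseudo[key])
    print(f"{key}: {ratio:.2f}x")
```

That works out to roughly 2.0x for real, 2.2x for user and 2.0x for sys on this one test.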
Mark has some numbers which don't quite add up, with improvements in real
and sys but an increase in user too; we're still looking to understand
them. I'd like to give the autobuilder a pass over these changes when we
have the opportunity and see what the real world performance looks
like.
It's likely that the speedups will be greatest on machines with small
numbers of cores which are primarily CPU bound. The benefits will
decrease on disk IO bound systems with large numbers of cores.
Cheers,
Richard
* Re: Performance regression in bitbake and exec() vs fork()
From: Tian, Kevin @ 2010-12-18 0:46 UTC
To: Richard Purdie, poky
>From: Richard Purdie
>Sent: Saturday, December 18, 2010 6:44 AM
>
>To update on the current status of bitbake exec() vs fork(), we now have
>the following branch:
>
>http://git.pokylinux.org/cgit.cgi/poky-contrib/log/?h=rpurdie/newpseudo
>
>which changes Poky to use the new version of pseudo, adds in an
>appropriate wrapper script for bitbake and starts enabling/disabling
>pseudo as appropriate as well as switching back to fork() instead of
>exec().
>
>We've still trying to work out exactly what this means for performance.
>Things 'feel' very much faster and I've at least one use case showing a
>double in speed showing the increase in task execution speed of fairly
>empty tasks. My test was:
>
>rm 'sstate-*qemu-config*'
>MACHINE=qemux86 bitbake qemu-config -c clean
>time MACHINE=qemux86 bitbake qemu-config
>
>On the current master branch the timings were:
>
>real 1m8.529s
>user 0m58.870s
>sys 0m4.690s
>
>Whilst with newpseudo:
>
>real 0m34.264s
>user 0m26.200s
>sys 0m2.340s
>
>This is good as it gets us back to a much snappier feeling bitbake.
That's a really good improvement, and I'm glad we're close to being back at the original baseline. :-)
>
>Mark has some numbers which don't quiet add up with improvements in read
>and sys but an increase in user too, we're still looking to understand
>them. I'd like to give the autobuilder a pass over these changes when we
>have the opportunity and see what that real world performance looks
>like.
>
>Its likely that the speedups will be greatest on machines with small
>numbers of cores which are primarily cpu bound. The benefits will
>decrease on disk IO bound systems with large number of cores.
>
any elaboration on this difference?
Thanks
Kevin
* Re: Performance regression in bitbake and exec() vs fork()
From: Richard Purdie @ 2010-12-20 13:09 UTC
To: Tian, Kevin; +Cc: poky
On Sat, 2010-12-18 at 08:46 +0800, Tian, Kevin wrote:
> >Mark has some numbers which don't quiet add up with improvements in read
> >and sys but an increase in user too, we're still looking to understand
> >them. I'd like to give the autobuilder a pass over these changes when we
> >have the opportunity and see what that real world performance looks
> >like.
> >
> >Its likely that the speedups will be greatest on machines with small
> >numbers of cores which are primarily cpu bound. The benefits will
> >decrease on disk IO bound systems with large number of cores.
> >
>
> any elaboration on this difference?
The exec overhead occurs in the bitbake worker processes. The more
threads that are available, the more work happens in parallel and the
less this overhead can be "seen" in the overall time profile.
Secondly, the fork overhead is 'CPU' based. The more the overall build
time is IO bound rather than CPU bound, the less you'll see this overhead
on a time profile.
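A toy model may make both points concrete. It assumes CPU work spreads perfectly across cores and that the build can never finish before its disk IO does. Both are simplifications, and the function names and all the numbers are invented for illustration.

```python
def wall_clock(tasks, per_task_cpu, per_task_overhead, cores, io_seconds):
    """Toy estimate of build wall-clock time in seconds.

    CPU work is assumed to spread perfectly across `cores`, and the build
    cannot finish before its disk IO (`io_seconds`) does.
    """
    cpu_seconds = tasks * (per_task_cpu + per_task_overhead) / cores
    return max(cpu_seconds, io_seconds)


def overhead_seen(tasks, per_task_cpu, per_task_overhead, cores, io_seconds):
    """Wall-clock seconds attributable to the per-task overhead."""
    with_overhead = wall_clock(
        tasks, per_task_cpu, per_task_overhead, cores, io_seconds
    )
    without = wall_clock(tasks, per_task_cpu, 0.0, cores, io_seconds)
    return with_overhead - without


if __name__ == "__main__":
    # All numbers below are made up purely for illustration.
    for cores, io_seconds in [(1, 0), (8, 0), (8, 900)]:
        seen = overhead_seen(
            tasks=3000,
            per_task_cpu=1.0,
            per_task_overhead=0.2,
            cores=cores,
            io_seconds=io_seconds,
        )
        print(f"cores={cores} io={io_seconds}s -> overhead visible: {seen:.0f}s")
```

With these invented inputs the same per-task overhead costs 600s of wall-clock time on one core, 75s on eight cores, and nothing visible once the eight-core build is held up by 900s of IO.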
Cheers,
Richard.
* Re: Performance regression in bitbake and exec() vs fork()
From: Richard Purdie @ 2010-12-20 17:11 UTC
To: Tian, Kevin; +Cc: poky
Some further numbers:
with exec():
real 71m56.105s
user 209m28.120s
sys 26m16.440s
with fork():
real 66m57.744s
user 183m29.730s
sys 20m37.100s
real 69m48.796s
user 180m57.800s
sys 20m39.620s
for the same workload.
So it looks like some speedup but not as significant as we'd perhaps
have hoped. Statistical fluctuations mean that the real gain is going to
be hard to measure. I could make more timing runs but I'm not sure that
would be that useful. Some of the speedup above could also be down to
the improvements made to pseudo.
Bottom line is that task execution is demonstrably faster though, as my
other tests showed :)
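For reference, the full-build figures above can be turned into percentage savings with a few lines. This sketch assumes that both timing blocks listed under "with fork():" are separate fork() runs of the same workload. The helper names are made up.

```python
# (minutes, seconds) pairs copied from the figures above.
exec_run = {"real": (71, 56.105), "user": (209, 28.120), "sys": (26, 16.440)}
# Assumption: both blocks under "with fork():" are separate fork() runs.
fork_runs = [
    {"real": (66, 57.744), "user": (183, 29.730), "sys": (20, 37.100)},
    {"real": (69, 48.796), "user": (180, 57.800), "sys": (20, 39.620)},
]


def seconds(pair):
    """Convert a (minutes, seconds) pair to seconds."""
    minutes, secs = pair
    return minutes * 60 + secs


def saving_percent(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (before - after) / before


for key in ("real", "user", "sys"):
    before = seconds(exec_run[key])
    savings = [saving_percent(before, seconds(run[key])) for run in fork_runs]
    print(key, ", ".join(f"{s:.1f}%" for s in savings))
```

On these figures the real time saving is roughly 3% to 7% depending on the run, while user drops by about 12% to 14% and sys by about 21%. The spread between the two fork() runs is as large as the real time gain itself.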
Cheers,
Richard