From: Martin Jansa <martin.jansa@gmail.com>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: OE-core <openembedded-core@lists.openembedded.org>,
Peter Kjellerstedt <peter.kjellerstedt@axis.com>
Subject: Re: Long delays with latest bitbake (was: [PATCH 1/7] insane.bbclass: in file-rdeps do not look into RDEPENDS recursively)
Date: Fri, 16 Aug 2019 12:24:41 +0200 [thread overview]
Message-ID: <20190816102441.GB1515@jama> (raw)
In-Reply-To: <20190815150548.GA1515@jama>
[-- Attachment #1: Type: text/plain, Size: 7801 bytes --]
On Thu, Aug 15, 2019 at 05:05:48PM +0200, Martin Jansa wrote:
> On Tue, Aug 13, 2019 at 10:04:08AM +0100, Richard Purdie wrote:
> > On Mon, 2019-08-12 at 20:26 +0000, Peter Kjellerstedt wrote:
> > > Comparing that build to a corresponding do-nothing build with Thud,
> > > the time difference matches those three minutes where I have no idea
> > > what bitbake is doing now that it didn’t need to do before…
> > >
> > > Hopefully these time degradations can be solved, because the current
> > > state of bitbake is barely usable. We also need to look into possible
> > > ways to improve the cooker output when it is running the setscene
> > > tasks so it makes some kind of sense again.
> >
> > We talked on irc and you pointed at the commit things started to go
> > wrong. Just to summarise things for the benefit of the list, this is
> > some quick testing I did:
> >
> > "bitbake -p; time bitbake core-image-minimal -n"
> >
> > 30.0s 6c7c0cefd34067311144a1d4c01986fe0a4aef26
> > 30.6s a0d941c787cf3ef030d190903279d311bc05d752
> > 40.3s 7df31ff36892c2f9c65326b06b4c70
> > 42.2s a0542ed3ff700eca35f9195f743c9e28bcd50f3e
> > 45.4s 9983b07fffd19082abded7c3f15cc77d306dd69c
> > 76.9s master-next
>
> I know I'm late to the party, but it took really long for the test to finish..
>
> I'm not using poky, so reproduce this testing on our builds I've used
> bitbake revisions matching with poky revision RP was testing:
>
> + oe-core 18cdc087fd5da30e2b31f3d4e81b153cd36ca844 for older bitbake, because current master isn't compatible with old bitbake (bb.data_smart.ExpansionError: Failure expanding variable OE_IMPORTED[:=], expression was ${@oe_import(d)} which triggered exception AttributeError: module 'bb.siggen' has no attribute 'SignatureGeneratorUniHashMixIn')
> and oe-core 23db236a054ee7a989cdbbcb42ad5c6eefd4a6ae for newer bitbake (to possibly eliminate impact of metadata changes)
>
> tested on 72core (Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz) with 380GB RAM
>
> nothing in TMPDIR/SSTATE_DIR, no SSTATE_MIRRO, no Hash Equivalence Server configured
>
> on a build with few more layers:
> Parsing of 3238 .bb files complete (0 cached, 3238 parsed). 7632 targets, 1927 skipped, 54 masked, 0 errors.
>
> but first doing just core-image-minimal like RP did:
>
> time poky bitbake oe-core
> 2m8.191s 6c7c0cefd34067311144a1d4c01986fe0a4aef26 18d4a31fdcec1f0e5d2199d6142f0ce833fca1a7 18cdc087fd5da30e2b31f3d4e81b153cd36ca844
> N/A a0d941c787cf3ef030d190903279d311bc05d752 doesn't exist in poky/poky-contrib 18cdc087fd5da30e2b31f3d4e81b153cd36ca844
> 2m17.053s 7df31ff36892c2f9c65326b06b4c70 1f630fdf0260db08541d3ca9f25f852931c19905 18cdc087fd5da30e2b31f3d4e81b153cd36ca844
> 2m18.515s a0542ed3ff700eca35f9195f743c9e28bcd50f3e f43778c2d19e70d4befd483b06aaf247fc65c799 23db236a054ee7a989cdbbcb42ad5c6eefd4a6ae
> 2m22.220s 9983b07fffd19082abded7c3f15cc77d306dd69c 8c26b451f22193ef1c544e2017cc84515566c1b8
> 2m38.185s master-next fbcb89dc3dbabfc80310e9a4ebe72d919300a32e
> cache.py:446: ResourceWarning: unclosed <socket.socket fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 41117)>
> value = pickled.load()
> started showing with this revision
> 2m17.991s master-next + fixups f7cd14f90463957b3e4be6d3876def98b78f1424
> 2m9.651s master-next + "Holding off tasks %s" out
>
> now world
> 253m36.637s 7df31ff36892c2f9c65326b06b4c70 1f630fdf0260db08541d3ca9f25f852931c19905 18cdc087fd5da30e2b31f3d4e81b153cd36ca844
> 174m13.324s a0542ed3ff700eca35f9195f743c9e28bcd50f3e f43778c2d19e70d4befd483b06aaf247fc65c799 23db236a054ee7a989cdbbcb42ad5c6eefd4a6ae
> this is time when killed at "NOTE: Executing Tasks" without showing any progress on the tasks (unlike other tests) and only one bitbake process is running
> 633m27.475s master-next fbcb89dc3dbabfc80310e9a4ebe72d919300a32e
> this is time when killed at (1417 from 71749) - running very slowly unlike other bitbake revisions where the task number changes relatively quickly - about 10 tasks per second
> 89m13.992s master-next + "Holding off tasks %s" out
> 89m59.201s master-next updated today 85fe627fdb6510f0942917964386fad9d8c479c8
>
> Interestingly old 1f630fdf0260db08541d3ca9f25f852931c19905 is 3 times
> slower than recent master-next.
>
> I'll send -P output next.
So 1f630fdf0260db08541d3ca9f25f852931c19905 was already "bad", because
this commit introduced that
logger.debug(2, "Holding off tasks %s" % pprint.pformat(self.holdoff_tasks))
Here is -P output from it:
5min (til "Executing Tasks" message):
profile.log.processed.1f630fdf0260db08541d3ca9f25f852931c19905.return.at.Executing.Tasks
http://paste.ubuntu.com/p/Nw6n5hTPmF/
5hours (whole dry run):
profile.log.processed.1f630fdf0260db08541d3ca9f25f852931c19905.whole.dry.run
http://paste.ubuntu.com/p/sydFGvYTxQ/
profile.log.processed.1f630fdf0260db08541d3ca9f25f852931c19905.worker.log.processed
http://paste.ubuntu.com/p/RMGbm9DyZz/
So for good old baseline I need to go few commits back, to e.g. 18d4a31f RP was testing before:
1f630fdf runqueue: Enable dynamic task adjustment to hash equivalency
a0d941c7 siggen: Convert to use self.unitaskhashes
1f326f2c siggen: Add new unitaskhashes data variable which is cached
b707d0cb runqueue: Improve scenequeue processing logic
19a6e356 cache: Add SimpleCache class
7bb79099 siggen: Import unihash code from OE-Core
7aec8632 hashserv: SQL Optimizations
18d4a31f fetch2/wget: avoid 'maximum recursion depth' RuntimeErrors when handling 403 codes
94m19.081s
profile.log.processed.18d4a31fdcec1f0e5d2199d6142f0ce833fca1a7.whole.dry.run
http://paste.ubuntu.com/p/Jnvv8GJNKB/
And here is profile with latest master from yesterday:
90mins:
profile.log.processed.a5e084a266f63c2fd370122327615e49beaeb94e.whole.dry.run
http://paste.ubuntu.com/p/YhMCHPmg56/
So good news is that current master is a bit faster than "good old" 18d4a31fdcec1f0e5d2199d6142f0ce833fca1a7.
But still significantly slower on my fast builder compared to world tests on poky + meta-oe,
not sure if it's all just because the size of metadata in my builds (with 71K tasks).
Even the dry running the tasks was about 10 tasks/second while RP was seeing around 200 tasks/second.
Will try to bump BB_NUMBER_THREADS from 8 to 72.
I've already disabled rm_work and PRserv, but that doesn't seem to have any significant impact on performance here. Other than resolving the PRserv terminating and then warnings about lost connection.
Cheers,
>
> >
> > So basically the original changes showed a 25% hit but the performance
> > of -next is dire.
> >
> > This is with no hash equiv server configured.
> >
> > It will vary depending on the target used (numbers with -sato for the
> > above would be interesting for comparision) and how much was or is in
> > sstate, they type of sstate mirror configured and so on.
> >
> > I really need to focus on getting the new code functioning correctly
> > before we attempt to optimise but if nobody tests the new code due to
> > performance problems we have a different issue. We also have a scaling
> > problem with the hash server itself I need to fix to stop the
> > autobuilder throwing weird errors. I'm therefore a bit challenged on
> > where to start with it all :/.
> >
> > Cheers,
> >
> > Richard
> >
> >
> >
> > --
> > _______________________________________________
> > Openembedded-core mailing list
> > Openembedded-core@lists.openembedded.org
> > http://lists.openembedded.org/mailman/listinfo/openembedded-core
>
> --
> Martin 'JaMa' Jansa jabber: Martin.Jansa@gmail.com
--
Martin 'JaMa' Jansa jabber: Martin.Jansa@gmail.com
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 201 bytes --]
next prev parent reply other threads:[~2019-08-16 10:24 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-12 20:26 Long delays with latest bitbake (was: [PATCH 1/7] insane.bbclass: in file-rdeps do not look into RDEPENDS recursively) Peter Kjellerstedt
2019-08-13 9:04 ` Richard Purdie
2019-08-13 13:20 ` Alexander Kanavin
2019-08-13 19:18 ` Richard Purdie
2019-08-14 4:06 ` Khem Raj
2019-08-14 11:25 ` Alexander Kanavin
2019-08-14 11:36 ` richard.purdie
2019-08-14 12:08 ` Alexander Kanavin
2019-08-14 12:43 ` Alexander Kanavin
2019-08-14 12:50 ` Mikko.Rapeli
2019-08-14 12:55 ` richard.purdie
2019-08-14 14:57 ` Peter Kjellerstedt
2019-08-14 15:19 ` Khem Raj
2019-08-14 20:27 ` Alexander Kanavin
2019-08-14 21:25 ` richard.purdie
2019-08-15 12:56 ` Alexander Kanavin
2019-08-15 13:56 ` richard.purdie
2019-08-15 22:44 ` Richard Purdie
2019-08-15 15:05 ` Martin Jansa
2019-08-16 10:24 ` Martin Jansa [this message]
2019-08-16 15:00 ` Martin Jansa
2019-08-16 15:54 ` richard.purdie
2019-08-16 17:22 ` Martin Jansa
2019-08-19 21:23 ` Martin Jansa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190816102441.GB1515@jama \
--to=martin.jansa@gmail.com \
--cc=openembedded-core@lists.openembedded.org \
--cc=peter.kjellerstedt@axis.com \
--cc=richard.purdie@linuxfoundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox