From: Peter Korsgaard <peter@korsgaard.com>
To: buildroot@busybox.net
Subject: [Buildroot] [PATCH 2/2] core/instrumentation: shave minutes off the build time
Date: Sun, 18 Mar 2018 15:14:40 +0100
Message-ID: <87muz5fp2n.fsf@dell.be.48ers.dk>
In-Reply-To: <6a793a6dba4f052ca8bbc35edd63df601f46478b.1521146096.git.yann.morin.1998@free.fr> (Yann E. MORIN's message of "Thu, 15 Mar 2018 21:35:08 +0100")
>>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:
Hi,
> As part of the build, we run some instrumentation hooks to gather
> statistics about the usage of the target/, staging/ and host/
> directories, so that we can generate reports for the user, that
> shows:
> - for each file, what package installed it,
> - for each package, the size that it installed.
> In so doing, we run a double md5 pass on all files of the affected
> directories. These passes were mostly invisible when we were only
> scanning target/, but have greatly increased in time now that we also
> scan staging/ and host/ (but only in the corresponding _CMDS, of
> course).
> This md5 was mostly aimed at catching packages that would "cheat" with
> mtime/atime/ctime somehow. They can't really cheat on md5, though [0].
> Timings however speak for themselves, with this defconfig (slightly
> biggish-but-still-manageable build) [1].
> host/ 20965 files 1.2GiB
> staging/ 4715 files 333MiB
> target/ 1801 files 44MiB
> All instrumentation steps, using md5: 19min 27s
> All instrumentation steps, using mtime: 14min 45s
> No instrumentation step at all: 14min 31s
> So, using mtime is an almost-5min improvement, i.e. about 25% faster,
> while removing all instrumentation steps does not gain that much more...
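For reference, the percentages follow directly from the three timings quoted above. A small sketch of the arithmetic (the variable names are only illustrative):

```shell
#!/bin/sh
# Wall-clock timings from the table above, converted to seconds.
md5=$((19 * 60 + 27))      # all instrumentation steps, using md5
mtime=$((14 * 60 + 45))    # all instrumentation steps, using mtime
none=$((14 * 60 + 31))     # no instrumentation step at all

saved=$((md5 - mtime))     # seconds gained by switching to mtime
pct=$((saved * 100 / md5)) # integer share of the md5 build time
left=$((mtime - none))     # instrumentation cost that remains with mtime

echo "saved=${saved}s (${pct}%), remaining overhead=${left}s"
```

This prints saved=282s (24%), remaining overhead=14s, which matches the "almost-5min" and "about 25%" figures.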
> So, we switch to using mtime, because in the end that's still good-enough
> for our use-case: generating some graphs. It is not mission-critical, and
> if a graph is slightly off, that's no biggie. It can anyway be attributed
> to a broken package's buildsystem, which should get fixed.
> However, we lose the ability to track directories. Non-empty directories
> can be tracked back by a bit of scripting, but empty directories are
> simply not caught. If we were to also look for directories using mtime,
> we would catch parents of installed files:
> - /foo/bar/ exists
> - a package installs /foo/bar/buz
> - mtime of /foo/bar/ is changed to account for the new file in it.
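The parent-directory effect described above can be reproduced in a scratch directory. This is only a sketch: the stamp file stands in for .stamp_built, the tree layout mirrors the /foo/bar/buz example, and the dates are arbitrary.

```shell
#!/bin/sh
# Sketch: creating a file bumps the mtime of its parent directory, so a
# directory scan based on -newer would report the pre-existing parent.
set -eu
work=$(mktemp -d)
tree="$work/tree"

# Pre-existing directories, dated before the stamp.
mkdir -p "$tree/foo/bar"
touch -t 201801010000 "$tree" "$tree/foo" "$tree/foo/bar"

# Stand-in for .stamp_built, dated after the directories.
touch -t 201803010000 "$work/stamp"

# "Install" a new file; only its direct parent gets a fresh mtime.
: > "$tree/foo/bar/buz"

dirs=$(cd "$tree" && find . -type d -newer "$work/stamp")
echo "$dirs"
rm -rf "$work"
```

Only ./foo/bar is listed, even though the package did not create that directory.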
Playing around with this, I noticed two other issues:
- It doesn't work for packages using rsync to install,
e.g. skeleton-init-common, as rsync also sets the mtime to match the
source files
- It breaks for <pkg>-reinstall
I don't think either of those is a really big issue compared to the huge
slowdown, but it is worth noting.
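To illustrate the rsync case, here is a minimal sketch in a scratch directory. It uses cp -p as a stand-in for rsync -a (both preserve the source mtime), and the stamp file and all paths are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: a file installed with its source mtime preserved is missed by
# a `find -newer` scan, while a normally copied file is caught.
set -eu
work=$(mktemp -d)
mkdir "$work/src" "$work/target"

# Source file with an old mtime, like a skeleton file kept in the tree.
echo data > "$work/src/old-file"
touch -t 201801010000 "$work/src/old-file"

# Stand-in for .stamp_built, dated after the source file.
touch -t 201803010000 "$work/stamp"

# cp -p keeps the source mtime (stand-in for rsync -a).
cp -p "$work/src/old-file" "$work/target/preserved"
# Plain cp gives the copy a fresh mtime.
cp "$work/src/old-file" "$work/target/fresh"

found=$(cd "$work/target" && find . -type f -newer "$work/stamp")
echo "$found"
rm -rf "$work"
```

Only ./fresh is reported; the copy whose mtime was preserved is not attributed to any package.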
> +define step_pkg_size_inner
> + cd $(2); \
> + find . \( -type f -o -type L \) \
> + -newer $($(PKG)_DIR)/.stamp_built \
> + -exec printf '$(1),%s\n' {} + \
> + >> $(BUILD_DIR)/packages-file-list$(3).txt
What find version are you using? My findutils find (and the busybox
applet) use 'l' for symlinks, so I've changed it to that.
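A quick way to check the lowercase form, sketched in a scratch directory with made-up names:

```shell
#!/bin/sh
# Sketch: `-type l` matches symbolic links; combined with `-type f` it
# lists regular files and symlinks but skips directories.
set -eu
work=$(mktemp -d)
echo data > "$work/file"
ln -s file "$work/link"
mkdir "$work/dir"

listed=$(cd "$work" && find . \( -type f -o -type l \) | LC_ALL=C sort)
echo "$listed"
rm -rf "$work"
```

This lists ./file and ./link, and leaves out ./dir.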
Committed with that fixed (and a few tweaks to the commit message),
thanks.
--
Bye, Peter Korsgaard