Buildroot Archive on lore.kernel.org
From: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
To: buildroot@busybox.net
Subject: [Buildroot] [PATCH 2/2] core/instrumentation: shave minutes off the build time
Date: Thu, 22 Mar 2018 18:11:59 +0100
Message-ID: <20180322171159.GZ14461@australia>
In-Reply-To: <20180322175035.2410072a@windsurf>

On Thu, Mar 22, 2018 at 05:50:35PM +0100, Thomas Petazzoni wrote:
> Hello,
> 
> On Thu, 22 Mar 2018 17:41:44 +0100, Thomas De Schampheleire wrote:
> 
> > It really depends on what you use these files for.
> > 
> > The original use case for the target list was rootfs size analysis. In the
> > discussion I have seen comments like missing a few files is not that important
> > here, but I disagree: if the missing file is 2MB large, it is a big problem.
> > 
> > Another use in-tree is to check for check-uniq-files. While this is a
> > non-critical feature, it's a pity if it would not detect problems because the
> > lists are inaccurate.
> > 
> > But there are out-of-tree uses too.  The most obvious usage is simply to
> > understand which package was responsible for a given file, even separate from
> > size analysis.
> > 
> > But there are also derived use cases. For example we are using the target list
> > in order to extract some packages from the root filesystem. For example, instead
> > of on the root filesystem (initramfs or NOR flash), they should end up on the
> > NAND flash. A script gets as input the list of packages to extract this way, and
> > uses the list to get the right associated files.
> > 
> > I'm sure there are other use cases.
> > 
> > The current timestamp-based approach not guaranteeing an accurate list is
> > problematic for many such uses. And as you already mentioned, since we don't have
> > full control over the build steps done in any given package, we don't know which
> > timestamps they will use. There may be very good reasons to install certain
> > files with their original timestamp and not the one from the build.
> 
> These are all valid concerns, but what do you suggest ?
> 
> The current approach of hashing all files clearly doesn't scale, as a
> significant amount of build time is now spent on hashing files.
> 

I can only observe that previously, when we listed only the target files, the
impact did not seem to be that bad, and the concerns about the impact on build
time arose with the creation of the staging and host lists.
(I hope I understood this correctly from the discussions; I have not yet done
any measurements myself. I just saw several differences in the list files when
applying this patch on top of 2018.02.)

So one possible alternative is to go back to a situation where only target files
are listed, or to make the different lists optional. Users who want the lists
and are ready to accept the build time impact can enable them. Those who don't
care about the lists and just want a fast build can disable them.
We'd lose the check-uniq-files feature whenever a list is not present, of
course.
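
To make concrete what would be lost, here is a rough sketch of the kind of
check that depends on an accurate list: report every file claimed by more than
one package. It is an illustration only, not the in-tree check-uniq-files
script. The "package,path" line format, the function name find_shared_files and
the package names pkg-a/pkg-b/pkg-c are assumptions made up for this example.

```python
from collections import defaultdict


def find_shared_files(lines):
    """Map each path to its owning packages, keeping only paths with >1 owner.

    Each input line is assumed to look like "package,path" (hypothetical format).
    """
    owners = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        pkg, path = line.split(",", 1)  # split on the first comma only
        owners[path].add(pkg)
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}


# Made-up sample list: two packages both claim ./bin/tool.
sample = [
    "pkg-a,./bin/tool",
    "pkg-b,./bin/tool",
    "pkg-c,./usr/lib/libexample.so.1",
]
shared = find_shared_files(sample)
for path, pkgs in shared.items():
    print("%s is touched by more than one package: %s" % (path, ", ".join(pkgs)))
```

If a list is missing or incomplete, such a check simply has nothing (or too
little) to compare, so conflicts go unreported.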

Yet another alternative could be to use a different method depending on the
list, although I personally think that all lists should be accurate if they are
created.

Another approach: just do a find without md5 (possibly depending on some
option). If all you care about is an accurate list of which package created a
file, and you don't care that much about other packages possibly overwriting
it, then a simple find is enough and normally quite fast.
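
A minimal sketch of that idea, assuming the list is built by comparing plain
path listings taken before and after a package's install step, with no hashing
at all. This is not the existing instrumentation code; the helper names
(list_files, files_added_by, write_file, demo) and the two fake install steps
are hypothetical and only serve the illustration.

```python
import os
import tempfile


def list_files(root):
    """Return the set of file paths below root, relative to root.

    Only names are collected, roughly like a plain find; no content is read.
    Directories themselves are not listed.
    """
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def files_added_by(root, install_step):
    """Run install_step(root) and return the paths that did not exist before."""
    before = list_files(root)
    install_step(root)
    return list_files(root) - before


def write_file(root, relpath, content):
    """Helper for the demo: create relpath below root with the given content."""
    path = os.path.join(root, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(content)


def demo():
    with tempfile.TemporaryDirectory() as target:
        # First fake package creates usr/bin/foo.
        added_a = files_added_by(
            target, lambda r: write_file(r, "usr/bin/foo", "from pkg-a"))

        # Second fake package overwrites usr/bin/foo and adds etc/bar.conf.
        def install_b(r):
            write_file(r, "usr/bin/foo", "from pkg-b")
            write_file(r, "etc/bar.conf", "setting=1")

        added_b = files_added_by(target, install_b)
        return added_a, added_b


added_a, added_b = demo()
print("first package created: ", sorted(added_a))
print("second package created:", sorted(added_b))
```

The second package's overwrite of usr/bin/foo does not show up in its list,
which is exactly the trade-off described above: creation is tracked accurately
and cheaply, while overwrites go unnoticed without comparing file contents.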

Best regards,
Thomas
