* [bitbake][kirkstone][2.0][PATCH 0/4] Patch review
@ 2023-02-14 15:28 Steve Sakoman
2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation Steve Sakoman
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw)
To: bitbake-devel
Please review this set of patches for 2.0/kirkstone and have comments back by
end of day Thursday.
Passed a-full on autobuilder:
https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/4921
The following changes since commit 86f2fa5261da959cda706c794a0047e5e89d4d6b:
fetch2/git: Clarify the meaning of namespace (2023-02-01 09:03:36 -1000)
are available in the Git repository at:
https://git.openembedded.org/bitbake-contrib stable/2.0-nut
http://cgit.openembedded.org/bitbake-contrib/log/?h=stable/2.0-nut
Etienne Cordonnier (1):
siggen: Fix inefficient string concatenation
Marius Kriegerowski (1):
bitbake-diffsigs: Make PEP8 compliant
Mark Hatle (1):
utils/ply: Update md5 to better report errors with hashlib
Schmidt, Adriaan (1):
bitbake-diffsigs: break on first dependent task difference
bin/bitbake-diffsigs | 49 ++++++++++++++++++++++++--------------------
lib/bb/siggen.py | 11 +++++-----
lib/bb/utils.py | 7 ++++++-
lib/ply/yacc.py | 7 +++++++
4 files changed, 46 insertions(+), 28 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread* [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation 2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman @ 2023-02-14 15:28 ` Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 2/4] utils/ply: Update md5 to better report errors with hashlib Steve Sakoman ` (2 subsequent siblings) 3 siblings, 0 replies; 6+ messages in thread From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw) To: bitbake-devel From: Etienne Cordonnier <ecordonnier@snap.com> As discussed in https://stackoverflow.com/a/4435752/1710392 , CPython has an optimization for statements in the form "a = a + b" or "a += b". It seems that this line does not get optimized, because it has a form a = a + b + c: data = data + "./" + f.split("/./")[1] For that reason, it does a copy of data for each iteration, potentially copying megabytes of data for each iteration. Changing this line causes SignatureGeneratorBasic::get_taskhash to take 0.06 seconds instead of 45 seconds on my test setup where SRC_URI points to a big directory. Note that PEP8 recommends explicitely not to use this optimization which is specific to CPython: "do not rely on CPython’s efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b" However, the PEP8 recommended form using "join()" also does not avoid the copy and takes 45 seconds in my test setup: data = ''.join((data, "./", f.split("/./")[1])) I have changed the other lines to also use += for consistency only, however those were in the form a = a + b and were optimized already. Co-authored-by: JJ Robertson <jrobertson@snap.com> Signed-off-by: Etienne Cordonnier <ecordonnier@snap.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> (cherry picked from commit 195750f2ca355e29d51219c58ecb2c1d83692717) Signed-off-by: Steve Sakoman <steve@sakoman.com> --- lib/bb/siggen.py | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py index 9a20fc8e..cea3a538 100644 --- a/lib/bb/siggen.py +++ b/lib/bb/siggen.py @@ -329,19 +329,19 @@ class SignatureGeneratorBasic(SignatureGenerator): data = self.basehash[tid] for dep in self.runtaskdeps[tid]: - data = data + self.get_unihash(dep) + data += self.get_unihash(dep) for (f, cs) in self.file_checksum_values[tid]: if cs: if "/./" in f: - data = data + "./" + f.split("/./")[1] - data = data + cs + data += "./" + f.split("/./")[1] + data += cs if tid in self.taints: if self.taints[tid].startswith("nostamp:"): - data = data + self.taints[tid][8:] + data += self.taints[tid][8:] else: - data = data + self.taints[tid] + data += self.taints[tid] h = hashlib.sha256(data.encode("utf-8")).hexdigest() self.taskhash[tid] = h -- 2.34.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [bitbake][kirkstone][2.0][PATCH 2/4] utils/ply: Update md5 to better report errors with hashlib 2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation Steve Sakoman @ 2023-02-14 15:28 ` Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 3/4] bitbake-diffsigs: Make PEP8 compliant Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 4/4] bitbake-diffsigs: break on first dependent task difference Steve Sakoman 3 siblings, 0 replies; 6+ messages in thread From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw) To: bitbake-devel From: Mark Hatle <mark.hatle@kernel.crashing.org> In the case where hashlib is not available, the try would fail and fall through resulting in a backtrace on the usage of the 'sig'. The backtrace itself was confusing and made it difficult to determine what went wrong. Update the import to be in it's own try block with an appropriate message to indicate what went wrong. Note, the current version of ply all of this code has been restructured so this is not applicable upstream. Additionally, some versions of hashlib don't appear to implement the second FIPS related argument. Detect this and support both versions. Signed-off-by: Mark Hatle <mark.hatle@amd.com> Signed-off-by: Mark Hatle <mark.hatle@kernel.crashing.org> Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> (cherry picked from commit 484ab42f440070c0369b81f5c69da860fa47a798) Signed-off-by: Steve Sakoman <steve@sakoman.com> --- lib/bb/utils.py | 7 ++++++- lib/ply/yacc.py | 7 +++++++ 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/lib/bb/utils.py b/lib/bb/utils.py index 66a8a08c..bca4830f 100644 --- a/lib/bb/utils.py +++ b/lib/bb/utils.py @@ -545,7 +545,12 @@ def md5_file(filename): Return the hex string representation of the MD5 checksum of filename. """ import hashlib - return _hasher(hashlib.new('MD5', usedforsecurity=False), filename) + try: + sig = hashlib.new('MD5', usedforsecurity=False) + except TypeError: + # Some configurations don't appear to support two arguments + sig = hashlib.new('MD5') + return _hasher(sig, filename) def sha256_file(filename): """ diff --git a/lib/ply/yacc.py b/lib/ply/yacc.py index 767c4e46..381b50cf 100644 --- a/lib/ply/yacc.py +++ b/lib/ply/yacc.py @@ -2798,7 +2798,14 @@ class ParserReflect(object): def signature(self): try: import hashlib + except ImportError: + raise RuntimeError("Unable to import hashlib") + try: sig = hashlib.new('MD5', usedforsecurity=False) + except TypeError: + # Some configurations don't appear to support two arguments + sig = hashlib.new('MD5') + try: if self.start: sig.update(self.start.encode('latin-1')) if self.prec: -- 2.34.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [bitbake][kirkstone][2.0][PATCH 3/4] bitbake-diffsigs: Make PEP8 compliant 2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 2/4] utils/ply: Update md5 to better report errors with hashlib Steve Sakoman @ 2023-02-14 15:28 ` Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 4/4] bitbake-diffsigs: break on first dependent task difference Steve Sakoman 3 siblings, 0 replies; 6+ messages in thread From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw) To: bitbake-devel From: Marius Kriegerowski <marius.kriegerowski@gmail.com> This ignores flake8 rules: * E402 module level import not at top of file * E501 line too long Signed-off-by: Marius Kriegerowski <marius.kriegerowski@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> (cherry picked from commit e8b176de448dc387c7a578c92b52aef28591038f) Signed-off-by: Steve Sakoman <steve@sakoman.com> --- bin/bitbake-diffsigs | 49 ++++++++++++++++++++++++-------------------- 1 file changed, 27 insertions(+), 22 deletions(-) diff --git a/bin/bitbake-diffsigs b/bin/bitbake-diffsigs index cf4cc706..fe0f33ee 100755 --- a/bin/bitbake-diffsigs +++ b/bin/bitbake-diffsigs @@ -11,6 +11,7 @@ import os import sys import warnings + warnings.simplefilter("default") import argparse import logging @@ -27,6 +28,7 @@ logger = bb.msg.logger_create(myname) is_dump = myname == 'bitbake-dumpsig' + def find_siginfo(tinfoil, pn, taskname, sigs=None): result = None tinfoil.set_event_mask(['bb.event.FindSigInfoResult', @@ -52,6 +54,7 @@ def find_siginfo(tinfoil, pn, taskname, sigs=None): sys.exit(2) return result + def find_siginfo_task(bbhandler, pn, taskname, sig1=None, sig2=None): """ Find the most recent signature files for the specified PN/task """ @@ -63,10 +66,10 @@ def find_siginfo_task(bbhandler, pn, taskname, sig1=None, sig2=None): if not sigfiles: logger.error('No sigdata files found matching %s %s matching either %s or %s' % (pn, taskname, sig1, sig2)) sys.exit(1) - elif not sig1 in sigfiles: + elif sig1 not in sigfiles: logger.error('No sigdata files found matching %s %s with signature %s' % (pn, taskname, sig1)) sys.exit(1) - elif not sig2 in sigfiles: + elif sig2 not in sigfiles: logger.error('No sigdata files found matching %s %s with signature %s' % (pn, taskname, sig2)) sys.exit(1) latestfiles = [sigfiles[sig1], sigfiles[sig2]] @@ -88,9 +91,9 @@ def recursecb(key, hash1, hash2): recout = [] if not hashfiles: recout.append("Unable to find matching sigdata for %s with hashes %s or %s" % (key, hash1, hash2)) - elif not hash1 in hashfiles: + elif hash1 not in hashfiles: recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash1)) - elif not hash2 in hashfiles: + elif hash2 not in hashfiles: recout.append("Unable to find matching sigdata for %s with hash %s" % (key, hash2)) else: out2 = bb.siggen.compare_sigfiles(hashfiles[hash1], hashfiles[hash2], recursecb, color=color) @@ -110,36 +113,36 @@ parser.add_argument('-D', '--debug', if is_dump: parser.add_argument("-t", "--task", - help="find the signature data file for the last run of the specified task", - action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname')) + help="find the signature data file for the last run of the specified task", + action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname')) parser.add_argument("sigdatafile1", - help="Signature file to dump. Not used when using -t/--task.", - action="store", nargs='?', metavar="sigdatafile") + help="Signature file to dump. Not used when using -t/--task.", + action="store", nargs='?', metavar="sigdatafile") else: parser.add_argument('-c', '--color', - help='Colorize the output (where %(metavar)s is %(choices)s)', - choices=['auto', 'always', 'never'], default='auto', metavar='color') + help='Colorize the output (where %(metavar)s is %(choices)s)', + choices=['auto', 'always', 'never'], default='auto', metavar='color') parser.add_argument('-d', '--dump', - help='Dump the last signature data instead of comparing (equivalent to using bitbake-dumpsig)', - action='store_true') + help='Dump the last signature data instead of comparing (equivalent to using bitbake-dumpsig)', + action='store_true') parser.add_argument("-t", "--task", - help="find the signature data files for the last two runs of the specified task and compare them", - action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname')) + help="find the signature data files for the last two runs of the specified task and compare them", + action="store", dest="taskargs", nargs=2, metavar=('recipename', 'taskname')) parser.add_argument("-s", "--signature", - help="With -t/--task, specify the signatures to look for instead of taking the last two", - action="store", dest="sigargs", nargs=2, metavar=('fromsig', 'tosig')) + help="With -t/--task, specify the signatures to look for instead of taking the last two", + action="store", dest="sigargs", nargs=2, metavar=('fromsig', 'tosig')) parser.add_argument("sigdatafile1", - help="First signature file to compare (or signature file to dump, if second not specified). Not used when using -t/--task.", - action="store", nargs='?') + help="First signature file to compare (or signature file to dump, if second not specified). Not used when using -t/--task.", + action="store", nargs='?') parser.add_argument("sigdatafile2", - help="Second signature file to compare", - action="store", nargs='?') + help="Second signature file to compare", + action="store", nargs='?') options = parser.parse_args() if is_dump: @@ -157,7 +160,8 @@ if options.taskargs: with bb.tinfoil.Tinfoil() as tinfoil: tinfoil.prepare(config_only=True) if not options.dump and options.sigargs: - files = find_siginfo_task(tinfoil, options.taskargs[0], options.taskargs[1], options.sigargs[0], options.sigargs[1]) + files = find_siginfo_task(tinfoil, options.taskargs[0], options.taskargs[1], options.sigargs[0], + options.sigargs[1]) else: files = find_siginfo_task(tinfoil, options.taskargs[0], options.taskargs[1]) @@ -166,7 +170,8 @@ if options.taskargs: output = bb.siggen.dump_sigfile(files[-1]) else: if len(files) < 2: - logger.error('Only one matching sigdata file found for the specified task (%s %s)' % (options.taskargs[0], options.taskargs[1])) + logger.error('Only one matching sigdata file found for the specified task (%s %s)' % ( + options.taskargs[0], options.taskargs[1])) sys.exit(1) # Recurse into signature comparison -- 2.34.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [bitbake][kirkstone][2.0][PATCH 4/4] bitbake-diffsigs: break on first dependent task difference 2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman ` (2 preceding siblings ...) 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 3/4] bitbake-diffsigs: Make PEP8 compliant Steve Sakoman @ 2023-02-14 15:28 ` Steve Sakoman 3 siblings, 0 replies; 6+ messages in thread From: Steve Sakoman @ 2023-02-14 15:28 UTC (permalink / raw) To: bitbake-devel From: "Schmidt, Adriaan" <adriaan.schmidt@siemens.com> compare_sigfiles() recursively calculates differences on all dependent tasks with changed hashes. This is done in arbitrary/alphabetical order, and only the last of those results is returned, while everything else is discarded. This changes the behavior to instead return the first difference and not calculate any more, which significantly speeds up diffs of tasks with many dependencies. Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> (cherry picked from commit ea6a676c9aa2864c2eff40eea41ba09ce903a651) Signed-off-by: Steve Sakoman <steve@sakoman.com> --- lib/bb/siggen.py | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/bb/siggen.py b/lib/bb/siggen.py index cea3a538..0a9ce0ed 100644 --- a/lib/bb/siggen.py +++ b/lib/bb/siggen.py @@ -1028,6 +1028,7 @@ def compare_sigfiles(a, b, recursecb=None, color=False, collapsed=False): # If a dependent hash changed, might as well print the line above and then defer to the changes in # that hash since in all likelyhood, they're the same changes this task also saw. output = [output[-1]] + recout + break a_taint = a_data.get('taint', None) b_taint = b_data.get('taint', None) -- 2.34.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [bitbake][kirkstone][2.0][PATCH 0/4] Patch review @ 2024-10-05 13:44 Steve Sakoman 0 siblings, 0 replies; 6+ messages in thread From: Steve Sakoman @ 2024-10-05 13:44 UTC (permalink / raw) To: bitbake-devel Please review this set of changes for 2.0/kirkstone and have comments back by end of day Tuesday, October 8 Passed a-full on autobuilder: https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/7372 The following changes since commit ec2a99a077da9aa0e99e8b05e0c65dcbd45864b1: data_smart: Improve performance for VariableHistory (2024-08-08 09:19:30 -0700) are available in the Git repository at: https://git.openembedded.org/bitbake-contrib stable/2.0-nut https://git.openembedded.org/bitbake-contrib/log/?h=stable/2.0-nut Richard Purdie (2): tests/fetch: Tweak to work on Fedora40 fetch/wget: Move files into place atomically Rob Woolley (1): wget: Make wget --passive-ftp option conditional on ftp/ftps Rudolf J Streif (1): fetch2/wget: Canonicalize DL_DIR paths for wget2 compatibility lib/bb/fetch2/wget.py | 21 ++++++++++++++------- lib/bb/tests/fetch.py | 6 +++--- 2 files changed, 17 insertions(+), 10 deletions(-) -- 2.34.1 ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-10-05 13:45 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-02-14 15:28 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 1/4] siggen: Fix inefficient string concatenation Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 2/4] utils/ply: Update md5 to better report errors with hashlib Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 3/4] bitbake-diffsigs: Make PEP8 compliant Steve Sakoman 2023-02-14 15:28 ` [bitbake][kirkstone][2.0][PATCH 4/4] bitbake-diffsigs: break on first dependent task difference Steve Sakoman -- strict thread matches above, loose matches on Subject: below -- 2024-10-05 13:44 [bitbake][kirkstone][2.0][PATCH 0/4] Patch review Steve Sakoman
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.