* [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing
@ 2019-08-06 18:12 Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 2/6] autobuild-run: make prepare_build() clean the output directory used for reproducibility testing Atharva Lele
` (5 more replies)
0 siblings, 6 replies; 14+ messages in thread
From: Atharva Lele @ 2019-08-06 18:12 UTC (permalink / raw)
To: buildroot
Having different output directories increases variation in the build.
This commit also changes the directory name of the single output directory to
"output-1", as having varied length names for output directories introduces a
lot of reproducibility issues.
Official Debian efforts also currently use same length names for different
directories: "/build/1st/$pkg-$ver" and "/build/$pkg-$ver/2nd"
Cfr: https://tests.reproducible-builds.org/debian/index_variations.html
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
scripts/autobuild-run | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index 2afa865..45b7d2b 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -275,7 +275,10 @@ class Builder:
# path to the output directory here is not relative to the
# Buildroot sources, but to the location of the autobuilder
# script.
- self.outputdir = os.path.abspath(os.path.join(self.idir, "output"))
+ self.outputdir = os.path.abspath(os.path.join(self.idir, "output-1"))
+ # We use a different output directory for more variance in reproducibility
+ # testing.
+ self.outputdir_2 = os.path.abspath(os.path.join(self.idir, "output-2"))
self.resultdir = os.path.join(self.outputdir, "results")
# If it doesn't exist, create the instance directory
@@ -434,8 +437,8 @@ class Builder:
reproducible_results = os.path.join(self.outputdir, "results", "reproducible_results")
# Using only tar images for now
- build_1_image = os.path.join(self.outputdir, "images-1", "rootfs.tar")
- build_2_image = os.path.join(self.outputdir, "images", "rootfs.tar")
+ build_1_image = os.path.join(self.outputdir, "images", "rootfs.tar")
+ build_2_image = os.path.join(self.outputdir_2, "images", "rootfs.tar")
with open(reproducible_results, 'w') as diff:
if self.sysinfo.has("diffoscope"):
@@ -522,16 +525,17 @@ class Builder:
log_write(self.log, "INFO: build 1 failed, skipping build 2")
return ret
- # First build has been built, move files and start build 2
- os.rename(os.path.join(self.outputdir, "images"), os.path.join(self.outputdir, "images-1"))
+ # Create the second output directory if it does not exist
+ if not os.path.exists(self.outputdir_2):
+ os.mkdir(self.outputdir_2)
- # Clean up build 1
- f = open(os.path.join(self.outputdir, "logfile"), "w+")
- subprocess.call(["make", "O=%s" % self.outputdir, "-C", self.srcdir, "clean"], stdout=f, stderr=f)
+ # Copy .config to the other output directory
+ shutil.copyfile(os.path.join(self.outputdir, ".config"),
+ os.path.join(self.outputdir_2, ".config"))
# Start the second build
log_write(self.log, "INFO: Reproducible Build Test, starting build 2")
- ret = self.do_build(self.outputdir)
+ ret = self.do_build(self.outputdir_2)
if ret != 0:
log_write(self.log, "INFO: build 2 failed")
return ret
--
2.22.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 2/6] autobuild-run: make prepare_build() clean the output directory used for reproducibility testing
2019-08-06 18:12 [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Atharva Lele
@ 2019-08-06 18:12 ` Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 3/6] autobuild-run: fix cross tools prefix for diffoscope Atharva Lele
` (4 subsequent siblings)
5 siblings, 0 replies; 14+ messages in thread
From: Atharva Lele @ 2019-08-06 18:12 UTC (permalink / raw)
To: buildroot
The previous patch introduced a new output directory for the second build in a
reproducibile build test. This patch will make sure that it is cleaned before
the subsequent build process begins.
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
scripts/autobuild-run | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index 45b7d2b..ab49dcb 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -383,6 +383,10 @@ class Builder:
# shutil.rmtree doesn't remove write-protected files
subprocess.call(["rm", "-rf", self.outputdir])
os.mkdir(self.outputdir)
+
+ # If it exists, remove the other output directory used for reproducibility testing
+ if os.path.exists(self.outputdir_2):
+ subprocess.call(["rm", "-rf", self.outputdir_2])
with open(os.path.join(self.outputdir, "branch"), "w") as branchf:
branchf.write(branch)
--
2.22.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 3/6] autobuild-run: fix cross tools prefix for diffoscope
2019-08-06 18:12 [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 2/6] autobuild-run: make prepare_build() clean the output directory used for reproducibility testing Atharva Lele
@ 2019-08-06 18:12 ` Atharva Lele
2019-08-06 20:18 ` Thomas Petazzoni
2019-08-06 18:12 ` [Buildroot] [PATCH 4/6] autobuild-run: make diffoscope output to a JSON-formatted file Atharva Lele
` (3 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Atharva Lele @ 2019-08-06 18:12 UTC (permalink / raw)
To: buildroot
Make outputs "make: entering directory" when called from a different
directory which is unnecessary and yields a wrong prefix. Adding this
argument supresses unnecessary output and yields the right prefix.
Before and after: https://gitlab.com/snippets/1877119
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
scripts/autobuild-run | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index ab49dcb..69766b2 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -447,7 +447,8 @@ class Builder:
with open(reproducible_results, 'w') as diff:
if self.sysinfo.has("diffoscope"):
# Prefix to point diffoscope towards cross-tools
- prefix = subprocess.check_output(["make", "O=%s" % self.outputdir, "-C", self.srcdir, "printvars", "VARS=TARGET_CROSS"])
+ prefix = subprocess.check_output(["make", "--no-print-directory", "O=%s" % self.outputdir,
+ "-C", self.srcdir, "printvars", "VARS=TARGET_CROSS"])
# Remove TARGET_CROSS= and \n from the string
prefix = prefix[13:-1]
log_write(self.log, "INFO: running diffoscope on images")
--
2.22.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 4/6] autobuild-run: make diffoscope output to a JSON-formatted file
2019-08-06 18:12 [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 2/6] autobuild-run: make prepare_build() clean the output directory used for reproducibility testing Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 3/6] autobuild-run: fix cross tools prefix for diffoscope Atharva Lele
@ 2019-08-06 18:12 ` Atharva Lele
2019-08-06 19:40 ` Thomas Petazzoni
2019-08-06 18:12 ` [Buildroot] [PATCH 5/6] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
` (2 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Atharva Lele @ 2019-08-06 18:12 UTC (permalink / raw)
To: buildroot
Normal diffoscope output is readable by humans but not really
convenient when working with it in code. JSON can be easily
read and extracted information from.
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
scripts/autobuild-run | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index 69766b2..520cfe2 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -453,7 +453,7 @@ class Builder:
prefix = prefix[13:-1]
log_write(self.log, "INFO: running diffoscope on images")
subprocess.call(["diffoscope", build_1_image, build_2_image,
- "--tool-prefix-binutils", prefix], stdout=diff, stderr=self.log)
+ "--tool-prefix-binutils", prefix, "--json", "-"], stdout=diff, stderr=self.log)
else:
log_write(self.log, "INFO: diffoscope not installed, falling back to cmp")
subprocess.call(["cmp", "-b", build_1_image, build_2_image], stdout=diff, stderr=self.log)
--
2.22.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 5/6] autobuild-run: initial implementation of get_reproducibility_failure_reason()
2019-08-06 18:12 [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Atharva Lele
` (2 preceding siblings ...)
2019-08-06 18:12 ` [Buildroot] [PATCH 4/6] autobuild-run: make diffoscope output to a JSON-formatted file Atharva Lele
@ 2019-08-06 18:12 ` Atharva Lele
2019-08-06 20:21 ` Thomas Petazzoni
2019-08-06 18:12 ` [Buildroot] [PATCH 6/6] autobuild-run: account for reproducibility failures in get_failure_reason() Atharva Lele
2019-08-06 19:37 ` [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Thomas Petazzoni
5 siblings, 1 reply; 14+ messages in thread
From: Atharva Lele @ 2019-08-06 18:12 UTC (permalink / raw)
To: buildroot
Analyze the JSON formatted output from diffoscope and check if
the differences are due to a filesystem reproducibility issue
or a package reproducibility issue.
Also, discard the deltas because they might take up too much space.
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
scripts/autobuild-run | 71 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 71 insertions(+)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index 520cfe2..f2f6460 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -131,6 +131,7 @@ import csv
import docopt
import errno
import hashlib
+import json
import mmap
import multiprocessing
import os
@@ -596,6 +597,76 @@ class Builder:
if reject_results():
return
+ def get_reproducibility_failure_reason(reproducible_results):
+ def clean_delta(delta):
+ added = []
+ deleted = []
+ for line in delta:
+ if line.startswith("+"):
+ added.append(line)
+ if line.startswith("-"):
+ deleted.append(line)
+ return added, deleted
+
+ with open(reproducible_results, "r") as reproduciblef:
+ json_data = json.load(reproduciblef)
+
+ packages_file_list = os.path.join(self.outputdir, "build", "packages-file-list.txt")
+
+ if json_data["unified_diff"] == None:
+ if json_data["details"][0]["source1"] == "file list":
+ json_data["details"].pop(0)
+
+ for i in range(0, len(json_data["details"])):
+ diff_source = json_data["details"][i]["source1"]
+ with open(packages_file_list, "r") as packagef:
+ for line in packagef:
+ if diff_source in line:
+ package = line.split(',')[0]
+
+ # Get package version
+ package_info = json.loads(subprocess.check_output(["make", "--no-print-directory",
+ "O=%s" % self.outputdir,
+ "-C", self.srcdir,
+ "%s-show-info" % package]))
+ if "version" in package_info[package]:
+ version = package_info[package]["version"]
+ reason = [package, version]
+ else:
+ reason = [package]
+ json_data["details"][i]["package"] = reason
+ json_data["details"][i].pop("source2")
+ if json_data["details"][i]["unified_diff"] == None:
+ json_data["details"][i].pop("unified_diff")
+ for j in range(0, len(json_data["details"][i]["details"])):
+ delta = json_data["details"][i]["details"][j]["unified_diff"].split("\n")
+ deltas = clean_delta(delta)
+ json_data["details"][i]["details"][j]["added"] = deltas[0][:100]
+ json_data["details"][i]["details"][j]["deleted"] = deltas[1][:100]
+ try:
+ json_data["details"][i]["details"][j].pop("unified_diff")
+ json_data["details"][i]["details"][j].pop("source2")
+ except KeyError as e:
+ log_write(self.log, "KeyError: %s not found in JSON details[%d][%d]" % (e, i, j))
+ else:
+ delta = json_data["details"][i]["unified_diff"]
+ deltas = clean_delta(delta)
+ json_data["details"][i]["added"] = deltas[0][:100]
+ json_data["details"][i]["deleted"] = deltas[1][:100]
+ try:
+ json_data["details"][i].pop("unified_diff")
+ json_data["details"][i].pop("source2")
+ except KeyError as e:
+ log_write(self.log, "KeyError: %s not found in JSON details[%d]" % (e, i))
+ reason = json_data["details"][0]["package"]
+ else:
+ reason = "filesystem"
+
+ with open(reproducible_results, "w") as reproduciblef:
+ json.dump(json_data, reproduciblef, sort_keys=True, indent=4)
+
+ return reason
+
def get_failure_reason():
# Output is a tuple (package, version), or None.
lastlines = decode_bytes(subprocess.Popen(
--
2.22.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 6/6] autobuild-run: account for reproducibility failures in get_failure_reason()
2019-08-06 18:12 [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Atharva Lele
` (3 preceding siblings ...)
2019-08-06 18:12 ` [Buildroot] [PATCH 5/6] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
@ 2019-08-06 18:12 ` Atharva Lele
2019-08-06 19:37 ` [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Thomas Petazzoni
5 siblings, 0 replies; 14+ messages in thread
From: Atharva Lele @ 2019-08-06 18:12 UTC (permalink / raw)
To: buildroot
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
scripts/autobuild-run | 29 ++++++++++++++++++++---------
1 file changed, 20 insertions(+), 9 deletions(-)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index f2f6460..1f12355 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -669,15 +669,26 @@ class Builder:
def get_failure_reason():
# Output is a tuple (package, version), or None.
- lastlines = decode_bytes(subprocess.Popen(
- ["tail", "-n", "3", os.path.join(self.outputdir, "logfile")],
- stdout=subprocess.PIPE).communicate()[0]).splitlines()
-
- regexp = re.compile(r'make: \*\*\* .*/(?:build|toolchain)/([^/]*)/')
- for line in lastlines:
- m = regexp.search(line)
- if m:
- return m.group(1).rsplit('-', 1)
+ # Output is "package-reproducible" in case of reproducibility failure.
+
+ reproducible_results = os.path.join(self.resultdir, "reproducible_results")
+ if os.path.exists(reproducible_results) and os.stat(reproducible_results).st_size > 0:
+ if self.sysinfo.has("diffoscope"):
+ reason = get_reproducibility_failure_reason(reproducible_results)
+ reason.append("nonreproducible")
+ return reason
+ else:
+ return ["nonreproducible"]
+ else:
+ lastlines = decode_bytes(subprocess.Popen(
+ ["tail", "-n", "3", os.path.join(self.outputdir, "logfile")],
+ stdout=subprocess.PIPE).communicate()[0]).splitlines()
+
+ regexp = re.compile(r'make: \*\*\* .*/(?:build|toolchain)/([^/]*)/')
+ for line in lastlines:
+ m = regexp.search(line)
+ if m:
+ return m.group(1).rsplit('-', 1)
# not found
return None
--
2.22.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing
2019-08-06 18:12 [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Atharva Lele
` (4 preceding siblings ...)
2019-08-06 18:12 ` [Buildroot] [PATCH 6/6] autobuild-run: account for reproducibility failures in get_failure_reason() Atharva Lele
@ 2019-08-06 19:37 ` Thomas Petazzoni
2019-08-07 6:02 ` Atharva Lele
5 siblings, 1 reply; 14+ messages in thread
From: Thomas Petazzoni @ 2019-08-06 19:37 UTC (permalink / raw)
To: buildroot
Hello,
On Tue, 6 Aug 2019 23:42:46 +0530
Atharva Lele <itsatharva@gmail.com> wrote:
> Having different output directories increases variation in the build.
>
> This commit also changes the directory name of the single output directory to
> "output-1", as having varied length names for output directories introduces a
> lot of reproducibility issues.
>
> Official Debian efforts also currently use same length names for different
> directories: "/build/1st/$pkg-$ver" and "/build/$pkg-$ver/2nd"
> Cfr: https://tests.reproducible-builds.org/debian/index_variations.html
Is this an intermediate step, with the long term goal of supporting
output directories of different length, or is this a limitation that we
will have "forever" ?
Thanks!
Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 4/6] autobuild-run: make diffoscope output to a JSON-formatted file
2019-08-06 18:12 ` [Buildroot] [PATCH 4/6] autobuild-run: make diffoscope output to a JSON-formatted file Atharva Lele
@ 2019-08-06 19:40 ` Thomas Petazzoni
2019-08-07 6:12 ` Atharva Lele
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Petazzoni @ 2019-08-06 19:40 UTC (permalink / raw)
To: buildroot
Hello Atharva,
On Tue, 6 Aug 2019 23:42:49 +0530
Atharva Lele <itsatharva@gmail.com> wrote:
> Normal diffoscope output is readable by humans but not really
> convenient when working with it in code. JSON can be easily
> read and extracted information from.
>
> Signed-off-by: Atharva Lele <itsatharva@gmail.com>
Peter and I had a look at the diffoscope output, and while the JSON
version is definitely good for human parsing, having the text version
in addition for humans to read would also be nice.
Could you instead store both the JSON and text versions ?
Just run this:
$ diffoscope --text differences.txt --json differences.json rootfs.tar rootfs.tar.before
And you'll get both the JSON and text version in files.
Best regards,
Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 3/6] autobuild-run: fix cross tools prefix for diffoscope
2019-08-06 18:12 ` [Buildroot] [PATCH 3/6] autobuild-run: fix cross tools prefix for diffoscope Atharva Lele
@ 2019-08-06 20:18 ` Thomas Petazzoni
0 siblings, 0 replies; 14+ messages in thread
From: Thomas Petazzoni @ 2019-08-06 20:18 UTC (permalink / raw)
To: buildroot
On Tue, 6 Aug 2019 23:42:48 +0530
Atharva Lele <itsatharva@gmail.com> wrote:
> Make outputs "make: entering directory" when called from a different
> directory which is unnecessary and yields a wrong prefix. Adding this
> argument supresses unnecessary output and yields the right prefix.
>
> Before and after: https://gitlab.com/snippets/1877119
>
> Signed-off-by: Atharva Lele <itsatharva@gmail.com>
> ---
> scripts/autobuild-run | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
Applied to buildroot-test, thanks!
Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 5/6] autobuild-run: initial implementation of get_reproducibility_failure_reason()
2019-08-06 18:12 ` [Buildroot] [PATCH 5/6] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
@ 2019-08-06 20:21 ` Thomas Petazzoni
2019-08-07 6:07 ` Atharva Lele
0 siblings, 1 reply; 14+ messages in thread
From: Thomas Petazzoni @ 2019-08-06 20:21 UTC (permalink / raw)
To: buildroot
Hello Atharva,
On Tue, 6 Aug 2019 23:42:50 +0530
Atharva Lele <itsatharva@gmail.com> wrote:
> + if json_data["unified_diff"] == None:
> + if json_data["details"][0]["source1"] == "file list":
> + json_data["details"].pop(0)
> +
> + for i in range(0, len(json_data["details"])):
> + diff_source = json_data["details"][i]["source1"]
> + with open(packages_file_list, "r") as packagef:
> + for line in packagef:
> + if diff_source in line:
> + package = line.split(',')[0]
> +
> + # Get package version
> + package_info = json.loads(subprocess.check_output(["make", "--no-print-directory",
> + "O=%s" % self.outputdir,
> + "-C", self.srcdir,
> + "%s-show-info" % package]))
> + if "version" in package_info[package]:
> + version = package_info[package]["version"]
> + reason = [package, version]
> + else:
> + reason = [package]
> + json_data["details"][i]["package"] = reason
> + json_data["details"][i].pop("source2")
> + if json_data["details"][i]["unified_diff"] == None:
> + json_data["details"][i].pop("unified_diff")
> + for j in range(0, len(json_data["details"][i]["details"])):
> + delta = json_data["details"][i]["details"][j]["unified_diff"].split("\n")
> + deltas = clean_delta(delta)
> + json_data["details"][i]["details"][j]["added"] = deltas[0][:100]
> + json_data["details"][i]["details"][j]["deleted"] = deltas[1][:100]
> + try:
> + json_data["details"][i]["details"][j].pop("unified_diff")
> + json_data["details"][i]["details"][j].pop("source2")
> + except KeyError as e:
> + log_write(self.log, "KeyError: %s not found in JSON details[%d][%d]" % (e, i, j))
> + else:
> + delta = json_data["details"][i]["unified_diff"]
> + deltas = clean_delta(delta)
> + json_data["details"][i]["added"] = deltas[0][:100]
> + json_data["details"][i]["deleted"] = deltas[1][:100]
> + try:
> + json_data["details"][i].pop("unified_diff")
> + json_data["details"][i].pop("source2")
> + except KeyError as e:
> + log_write(self.log, "KeyError: %s not found in JSON details[%d]" % (e, i))
> + reason = json_data["details"][0]["package"]
Wow. This is way too complicated, and needs to be split into
sub-functions, and local variables should be used to make this clearer.
Perhaps a few comments in the code would also help.
Also, I believe doing loops with range(0, len(a list)) is not very
pythonic, a more pythonic way is:
for item in a_list:
... do something with item ...
But clearly, having sub-functions each doing one part of the job would
help a lot.
Thanks,
Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing
2019-08-06 19:37 ` [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Thomas Petazzoni
@ 2019-08-07 6:02 ` Atharva Lele
2019-08-07 8:33 ` Thomas Petazzoni
0 siblings, 1 reply; 14+ messages in thread
From: Atharva Lele @ 2019-08-07 6:02 UTC (permalink / raw)
To: buildroot
On Wednesday, August 7, 2019 1:07:24 AM IST Thomas Petazzoni wrote:
> Hello,
>
> On Tue, 6 Aug 2019 23:42:46 +0530
>
> Atharva Lele <itsatharva@gmail.com> wrote:
> > Having different output directories increases variation in the build.
> >
> > This commit also changes the directory name of the single output directory
> > to "output-1", as having varied length names for output directories
> > introduces a lot of reproducibility issues.
> >
> > Official Debian efforts also currently use same length names for different
> > directories: "/build/1st/$pkg-$ver" and "/build/$pkg-$ver/2nd"
> > Cfr: https://tests.reproducible-builds.org/debian/index_variations.html
>
> Is this an intermediate step, with the long term goal of supporting
> output directories of different length, or is this a limitation that we
> will have "forever" ?
Reproducible-Builds has proposed a spec called BUILD_PATH_PREFIX_MAP to help
with this. However, that is still in the works and not fully solved yet. (As
told to Yann and I on the #reproducible-builds IRC)
See: https://wiki.debian.org/ReproducibleBuilds/GCC-build-path and https://
reproducible-builds.org/specs/build-path-prefix-map/
So I think until further work is done on BUILD_PATH_PREFIX_MAP, we will have
to stay with output directories of same length.
> Thanks!
>
> Thomas
Regards,
Atharva Lele
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 5/6] autobuild-run: initial implementation of get_reproducibility_failure_reason()
2019-08-06 20:21 ` Thomas Petazzoni
@ 2019-08-07 6:07 ` Atharva Lele
0 siblings, 0 replies; 14+ messages in thread
From: Atharva Lele @ 2019-08-07 6:07 UTC (permalink / raw)
To: buildroot
On Wednesday, August 7, 2019 1:51:18 AM IST Thomas Petazzoni wrote:
> Hello Atharva,
>
> On Tue, 6 Aug 2019 23:42:50 +0530
>
> Atharva Lele <itsatharva@gmail.com> wrote:
> > + if json_data["unified_diff"] == None:
> > + if json_data["details"][0]["source1"] == "file list":
> > + json_data["details"].pop(0)
> > +
> > + for i in range(0, len(json_data["details"])):
> > + diff_source = json_data["details"][i]["source1"]
> > + with open(packages_file_list, "r") as packagef:
> > + for line in packagef:
> > + if diff_source in line:
> > + package = line.split(',')[0]
> > +
> > + # Get package version
> > + package_info =
> > json.loads(subprocess.check_output(["make", "--no-print-directory", +
> > "O=%s"
> > % self.outputdir, +
> > "-C", self.srcdir, +
> > "%s-show-info" % package])) +
> > if "version" in package_info[package]:
> > + version = package_info[package]["version"]
> > + reason = [package, version]
> > + else:
> > + reason = [package]
> > + json_data["details"][i]["package"] = reason
> > + json_data["details"][i].pop("source2")
> > + if json_data["details"][i]["unified_diff"] == None:
> > + json_data["details"][i].pop("unified_diff")
> > + for j in range(0,
> > len(json_data["details"][i]["details"])): +
> > delta = json_data["details"][i]["details"][j]["unified_diff"].split("\n")
> > + deltas = clean_delta(delta)
> > +
> > json_data["details"][i]["details"][j]["added"] = deltas[0][:100] +
> > json_data["details"][i]["details"][j]["deleted"] =
> > deltas[1][:100] + try:
> > +
> > json_data["details"][i]["details"][j].pop("unified_diff") +
> > json_data["details"][i]["details"][j].pop("source2") +
> > except KeyError as e:
> > + log_write(self.log, "KeyError: %s not
> > found in JSON details[%d][%d]" % (e, i, j)) + else:
> > + delta = json_data["details"][i]["unified_diff"]
> > + deltas = clean_delta(delta)
> > + json_data["details"][i]["added"] =
> > deltas[0][:100]
> > + json_data["details"][i]["deleted"] =
> > deltas[1][:100] + try:
> > + json_data["details"][i].pop("unified_diff")
> > + json_data["details"][i].pop("source2")
> > + except KeyError as e:
> > + log_write(self.log, "KeyError: %s not found
> > in JSON details[%d]" % (e, i)) + reason =
> > json_data["details"][0]["package"]
>
> Wow. This is way too complicated, and needs to be split into
> sub-functions, and local variables should be used to make this clearer.
> Perhaps a few comments in the code would also help.
>
Noted.
> Also, I believe doing loops with range(0, len(a list)) is not very
> pythonic, a more pythonic way is:
>
> for item in a_list:
> ... do something with item ...
>
Noted.
> But clearly, having sub-functions each doing one part of the job would
> help a lot.
>
> Thanks,
>
> Thomas
I'll refactor it with sub-functions and the other suggestions and resend the
patches.
Thanks for the review!
Regards,
Atharva Lele
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 4/6] autobuild-run: make diffoscope output to a JSON-formatted file
2019-08-06 19:40 ` Thomas Petazzoni
@ 2019-08-07 6:12 ` Atharva Lele
0 siblings, 0 replies; 14+ messages in thread
From: Atharva Lele @ 2019-08-07 6:12 UTC (permalink / raw)
To: buildroot
On Wednesday, August 7, 2019 1:10:14 AM IST Thomas Petazzoni wrote:
> Hello Atharva,
>
> On Tue, 6 Aug 2019 23:42:49 +0530
>
> Atharva Lele <itsatharva@gmail.com> wrote:
> > Normal diffoscope output is readable by humans but not really
> > convenient when working with it in code. JSON can be easily
> > read and extracted information from.
> >
> > Signed-off-by: Atharva Lele <itsatharva@gmail.com>
>
> Peter and I had a look at the diffoscope output, and while the JSON
> version is definitely good for human parsing, having the text version
> in addition for humans to read would also be nice.
>
> Could you instead store both the JSON and text versions ?
>
Right yes, Arnout and I discussed this in the meeting yesterday. We also
decided to limit the size of the text version's output to about 40KB to save
on server space.
I'll resend with the changes. Thanks for the review!
> Best regards,
>
> Thomas
Regards,
Atharva Lele
^ permalink raw reply [flat|nested] 14+ messages in thread
* [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing
2019-08-07 6:02 ` Atharva Lele
@ 2019-08-07 8:33 ` Thomas Petazzoni
0 siblings, 0 replies; 14+ messages in thread
From: Thomas Petazzoni @ 2019-08-07 8:33 UTC (permalink / raw)
To: buildroot
Hello,
On Wed, 07 Aug 2019 11:32:52 +0530
Atharva Lele <itsatharva@gmail.com> wrote:
> > Is this an intermediate step, with the long term goal of supporting
> > output directories of different length, or is this a limitation that we
> > will have "forever" ?
>
> Reproducible-Builds has proposed a spec called BUILD_PATH_PREFIX_MAP to help
> with this. However, that is still in the works and not fully solved yet. (As
> told to Yann and I on the #reproducible-builds IRC)
>
> See: https://wiki.debian.org/ReproducibleBuilds/GCC-build-path and https://
> reproducible-builds.org/specs/build-path-prefix-map/
>
> So I think until further work is done on BUILD_PATH_PREFIX_MAP, we will have
> to stay with output directories of same length.
OK, thanks for the additional explanation. These details would have
been useful to have in the commit log I believe.
Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2019-08-07 8:33 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-06 18:12 [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 2/6] autobuild-run: make prepare_build() clean the output directory used for reproducibility testing Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 3/6] autobuild-run: fix cross tools prefix for diffoscope Atharva Lele
2019-08-06 20:18 ` Thomas Petazzoni
2019-08-06 18:12 ` [Buildroot] [PATCH 4/6] autobuild-run: make diffoscope output to a JSON-formatted file Atharva Lele
2019-08-06 19:40 ` Thomas Petazzoni
2019-08-07 6:12 ` Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 5/6] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
2019-08-06 20:21 ` Thomas Petazzoni
2019-08-07 6:07 ` Atharva Lele
2019-08-06 18:12 ` [Buildroot] [PATCH 6/6] autobuild-run: account for reproducibility failures in get_failure_reason() Atharva Lele
2019-08-06 19:37 ` [Buildroot] [PATCH 1/6] autobuild-run: use different output directories for reproducible builds testing Thomas Petazzoni
2019-08-07 6:02 ` Atharva Lele
2019-08-07 8:33 ` Thomas Petazzoni
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox