* [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command
@ 2019-08-13 4:12 Atharva Lele
2019-08-13 4:12 ` [Buildroot] [PATCH v3 2/3] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Atharva Lele @ 2019-08-13 4:12 UTC (permalink / raw)
To: buildroot
Since we are outputting to two file types, we can't just redirect stdout.
Instead, we use diffoscope's ability to output to a file to maintain
consistency in the command.
This patch also fixes a typo in the command. For text output,
it should be reproducibility_results_txt instead of
reproducibility_results_text.
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
scripts/autobuild-run | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index dbb497c..ead81a0 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -454,9 +454,9 @@ class Builder:
prefix = prefix[13:-1]
log_write(self.log, "INFO: running diffoscope on images")
subprocess.call(["diffoscope", build_1_image, build_2_image,
- "--tool-prefix-binutils", prefix, "--json", "-",
- "--text", reproducible_results_text, "--max-text-report-size", "40000"],
- stdout=diff, stderr=self.log)
+ "--tool-prefix-binutils", prefix, "--json", reproducible_results,
+ "--text", reproducible_results_txt, "--max-text-report-size", "40000"],
+ stderr=self.log)
else:
log_write(self.log, "INFO: diffoscope not installed, falling back to cmp")
subprocess.call(["cmp", "-b", build_1_image, build_2_image], stdout=diff, stderr=self.log)
--
2.22.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Buildroot] [PATCH v3 2/3] autobuild-run: initial implementation of get_reproducibility_failure_reason()
2019-08-13 4:12 [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command Atharva Lele
@ 2019-08-13 4:12 ` Atharva Lele
2019-08-13 4:12 ` [Buildroot] [PATCH v3 3/3] autobuild-run: account for reproducibility failures in get_failure_reason() Atharva Lele
2019-08-13 14:19 ` [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command Thomas Petazzoni
2 siblings, 0 replies; 5+ messages in thread
From: Atharva Lele @ 2019-08-13 4:12 UTC (permalink / raw)
To: buildroot
Analyze the JSON formatted output from diffoscope and check if
the differences are due to a filesystem reproducibility issue
or a package reproducibility issue.
Also, discard the deltas because they might take up too much space.
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
Changes v1 -> v2:
- Refactor using subfunctions and local variables (suggested by Thomas)
- Added comments (suggested by Thomas)
- Use more pythonic loops (suggested by Thomas)
---
scripts/autobuild-run | 89 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 89 insertions(+)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index ead81a0..9b8983f 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -131,6 +131,7 @@ import csv
import docopt
import errno
import hashlib
+import json
import mmap
import multiprocessing
import os
@@ -599,6 +600,94 @@ class Builder:
if reject_results():
return
+ def get_reproducibility_failure_reason(reproducible_results):
+ def split_delta(delta):
+ # Take a delta and split it into added, deleted lines.
+ added = []
+ deleted = []
+ for line in delta:
+ if line.startswith("+"):
+ added.append(line)
+ if line.startswith("-"):
+ deleted.append(line)
+ return added, deleted
+
+ def get_package(sourcef):
+ # Returns which package the source file belongs to.
+ with open(packages_file_list, "r") as packagef:
+ for line in packagef:
+ if sourcef in line:
+ package = line.split(',')[0]
+
+ if package:
+ # Get package version
+ package_info = json.loads(subprocess.check_output(["make", "--no-print-directory",
+ "O=%s" % self.outputdir,
+ "-C", self.srcdir,
+ "%s-show-info" % package]))
+ if "version" in package_info[package]:
+ version = package_info[package]["version"]
+ return [package, version]
+ else:
+ return [package]
+ else:
+ return ["not found"]
+
+ def cleanup(l):
+ # Takes a list and removes data which is redundant (source2) or data
+ # that might take up too much space (like huge diffs).
+ if "unified_diff" in l:
+ l.pop("unified_diff")
+ if "source2" in l:
+ l.pop("source2")
+
+
+ packages_file_list = os.path.join(self.outputdir, "build", "packages-file-list.txt")
+
+ with open(reproducible_results, "r") as reproduciblef:
+ json_data = json.load(reproduciblef)
+
+ if json_data["unified_diff"] == None:
+ # Remove the file list because it is not useful, i.e. it only shows
+ # which files vary, and nothing more.
+ if json_data["details"][0]["source1"] == "file list":
+ json_data["details"].pop(0)
+
+ # Iterate over details in the diffoscope output.
+ for item in json_data["details"]:
+ diff_src = item["source1"]
+ item["package"] = get_package(diff_src)
+
+ # In some cases, diffoscope uses multiple commands to get various
+ # diffs. Due to this, it generates a "details" key for those files
+ # instead of just storing the diff in the "unified_diff" key.
+ if item["unified_diff"] == None:
+ for item_details in item["details"]:
+ diff = item_details["unified_diff"].split("\n")
+ split_deltas = split_delta(diff)
+ item_details["added"] = split_deltas[0][:100]
+ item_details["deleted"] = split_deltas[1][:100]
+ cleanup(item_details)
+ else:
+ diff = item["unified_diff"].split("\n")
+ split_deltas = split_delta(diff)
+ item["added"] = split_deltas[0][:100]
+ item["deleted"] = split_deltas[1][:100]
+ cleanup(item)
+ # We currently just set the reason from first non-reproducible package in the
+ # dictionary.
+ reason = json_data["details"][0]["package"]
+
+ # If there does exist a unified_diff directly for the .tar images, it is probably
+ # a filesystem reproducibility issue.
+ else:
+ reason = ["filesystem"]
+
+ with open(reproducible_results, "w") as reproduciblef:
+ json.dump(json_data, reproduciblef, sort_keys=True, indent=4)
+
+ return reason
+
def get_failure_reason():
# Output is a tuple (package, version), or None.
lastlines = decode_bytes(subprocess.Popen(
--
2.22.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Buildroot] [PATCH v3 3/3] autobuild-run: account for reproducibility failures in get_failure_reason()
2019-08-13 4:12 [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command Atharva Lele
2019-08-13 4:12 ` [Buildroot] [PATCH v3 2/3] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
@ 2019-08-13 4:12 ` Atharva Lele
2019-08-13 14:19 ` [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command Thomas Petazzoni
2 siblings, 0 replies; 5+ messages in thread
From: Atharva Lele @ 2019-08-13 4:12 UTC (permalink / raw)
To: buildroot
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
Changes v1 -> v3:
- Account for changed name of diffoscope output
---
scripts/autobuild-run | 29 ++++++++++++++++++++---------
1 file changed, 20 insertions(+), 9 deletions(-)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index 9b8983f..eba16b2 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -690,15 +690,26 @@ class Builder:
def get_failure_reason():
# Output is a tuple (package, version), or None.
- lastlines = decode_bytes(subprocess.Popen(
- ["tail", "-n", "3", os.path.join(self.outputdir, "logfile")],
- stdout=subprocess.PIPE).communicate()[0]).splitlines()
-
- regexp = re.compile(r'make: \*\*\* .*/(?:build|toolchain)/([^/]*)/')
- for line in lastlines:
- m = regexp.search(line)
- if m:
- return m.group(1).rsplit('-', 1)
+ # Output is "package-reproducible" in case of reproducibility failure.
+
+ reproducible_results = os.path.join(self.resultdir, "diffoscope-results.json")
+ if os.path.exists(reproducible_results) and os.stat(reproducible_results).st_size > 0:
+ if self.sysinfo.has("diffoscope"):
+ reason = get_reproducibility_failure_reason(reproducible_results)
+ reason.append("nonreproducible")
+ return reason
+ else:
+ return ["nonreproducible"]
+ else:
+ lastlines = decode_bytes(subprocess.Popen(
+ ["tail", "-n", "3", os.path.join(self.outputdir, "logfile")],
+ stdout=subprocess.PIPE).communicate()[0]).splitlines()
+
+ regexp = re.compile(r'make: \*\*\* .*/(?:build|toolchain)/([^/]*)/')
+ for line in lastlines:
+ m = regexp.search(line)
+ if m:
+ return m.group(1).rsplit('-', 1)
# not found
return None
--
2.22.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command
2019-08-13 4:12 [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command Atharva Lele
2019-08-13 4:12 ` [Buildroot] [PATCH v3 2/3] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
2019-08-13 4:12 ` [Buildroot] [PATCH v3 3/3] autobuild-run: account for reproducibility failures in get_failure_reason() Atharva Lele
@ 2019-08-13 14:19 ` Thomas Petazzoni
2019-08-16 13:56 ` Atharva Lele
2 siblings, 1 reply; 5+ messages in thread
From: Thomas Petazzoni @ 2019-08-13 14:19 UTC (permalink / raw)
To: buildroot
Hello Atharva,
On Tue, 13 Aug 2019 09:42:04 +0530
Atharva Lele <itsatharva@gmail.com> wrote:
> Since we are outputting to two file types, we can't just redirect stdout.
> Instead, we use diffoscope's ability to output to a file to maintain
> consistency in the command.
>
> This patch also fixes a typo in the command. For text output,
> it should be reproducibility_results_txt instead of
> reproducibility_results_text.
>
> Signed-off-by: Atharva Lele <itsatharva@gmail.com>
> ---
> scripts/autobuild-run | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
Applied to buildroot-test, thanks. However, the code running diffoscope
is still executed inside a with open(reproducible_results, 'w') as
diff, which is no longer needed, as diffoscope outputs directly to the
reproducible_results file.
You should move that this file opening inside the else condition which
is used when diffoscope is not available.
Best regards,
Thomas
--
Thomas Petazzoni, CTO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command
2019-08-13 14:19 ` [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command Thomas Petazzoni
@ 2019-08-16 13:56 ` Atharva Lele
0 siblings, 0 replies; 5+ messages in thread
From: Atharva Lele @ 2019-08-16 13:56 UTC (permalink / raw)
To: buildroot
On Tue, Aug 13, 2019 at 7:49 PM Thomas Petazzoni
<thomas.petazzoni@bootlin.com> wrote:
>
> Hello Atharva,
>
> On Tue, 13 Aug 2019 09:42:04 +0530
> Atharva Lele <itsatharva@gmail.com> wrote:
>
> > Since we are outputting to two file types, we can't just redirect stdout.
> > Instead, we use diffoscope's ability to output to a file to maintain
> > consistency in the command.
> >
> > This patch also fixes a typo in the command. For text output,
> > it should be reproducibility_results_txt instead of
> > reproducibility_results_text.
> >
> > Signed-off-by: Atharva Lele <itsatharva@gmail.com>
> > ---
> > scripts/autobuild-run | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
>
> Applied to buildroot-test, thanks. However, the code running diffoscope
> is still executed inside a with open(reproducible_results, 'w') as
> diff, which is no longer needed, as diffoscope outputs directly to the
> reproducible_results file.
>
> You should move that this file opening inside the else condition which
> is used when diffoscope is not available.
>
Oh right, slipped my mind. I'll send a patch to fix it!
Apologies for the late reply, but your email got buried between
message delivery failures since emails to Yann are bouncing.
> Best regards,
>
> Thomas
> --
> Thomas Petazzoni, CTO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com
Thank you for the review!
--
Regards,
Atharva Lele
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-08-16 13:56 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-13 4:12 [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command Atharva Lele
2019-08-13 4:12 ` [Buildroot] [PATCH v3 2/3] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
2019-08-13 4:12 ` [Buildroot] [PATCH v3 3/3] autobuild-run: account for reproducibility failures in get_failure_reason() Atharva Lele
2019-08-13 14:19 ` [Buildroot] [PATCH v3 1/3] autobuild-run: maintain consistency in diffoscope command Thomas Petazzoni
2019-08-16 13:56 ` Atharva Lele
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox