From: Atharva Lele <itsatharva@gmail.com>
To: buildroot@busybox.net
Subject: [Buildroot] [PATCH v4 5/5] autobuild-run: initial implementation of categorization() of nonreproducibility
Date: Tue, 20 Aug 2019 20:22:31 +0530
Message-ID: <20190820145231.15507-5-itsatharva@gmail.com>
In-Reply-To: <20190820145231.15507-1-itsatharva@gmail.com>
Build ID and build path reproducibility issues are easy to identify, so we
start categorization with these two classes of issues.
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
scripts/autobuild-run | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index c25413b..83acaad 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -131,6 +131,7 @@ import csv
import docopt
import errno
import hashlib
+from itertools import izip
import json
import mmap
import multiprocessing
@@ -641,6 +642,26 @@ class Builder:
if "source2" in l:
l.pop("source2")
+ def categorize(added, deleted):
+ # In some deltas, only part of the output directory name is captured,
+ # e.g. "put-1" or "tput-2", so we must check all such substrings.
+ # Use substrings of at least 3 characters to avoid false positives.
+ path_1 = "output-1"
+ path_2 = "output-2"
+ paths = [path_1[i:j] for i in range(len(path_1)) for j in range(i+3, len(path_1)+1)]
+ paths_2 = [path_2[i:j] for i in range(len(path_2)) for j in range(i+3, len(path_2)+1)]
+ paths = paths + paths_2
+ # We need to iterate over the deltas simultaneously.
+ for a, d in izip(added, deleted):
+ for p in paths:
+ if p in a or p in d:
+ return "Embedded Path"
+ if "Build ID" in a or "Build ID" in d:
+ return "Build ID variation"
+ else:
+ continue
+ return "not found"
+
packages_file_list = os.path.join(self.outputdir, "build", "packages-file-list.txt")
with open(reproducible_results, "r") as reproduciblef:
@@ -667,12 +688,18 @@ class Builder:
item_details["added"] = split_deltas[0][:100]
item_details["deleted"] = split_deltas[1][:100]
cleanup(item_details)
+ category = categorize(item_details["added"], item_details["deleted"])
+ if category != "not found":
+ item["category"] = category
+ break
else:
diff = item["unified_diff"].split("\n")
split_deltas = split_delta(diff)
item["added"] = split_deltas[0][:100]
item["deleted"] = split_deltas[1][:100]
cleanup(item)
+ if "added" in item or "deleted" in item:
+ item["category"] = categorize(item["added"], item["deleted"])
# We currently just set the reason from first non-reproducible package in the
# dictionary.
reason = json_data["details"][0]["package"]
--
2.22.0
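
For readers who want to try the matching logic outside autobuild-run, here is
a minimal standalone sketch of categorize() in Python 3. It is not the code
from the patch: the helper name output_dir_fragments() and the PATHS constant
are hypothetical, zip() stands in for itertools.izip, and the nesting (all
path fragments are tested before the Build ID check) is an assumption about
the intended control flow.

```python
def output_dir_fragments(name, min_len=3):
    """Return every substring of `name` that is at least `min_len` characters long."""
    return [name[i:j]
            for i in range(len(name))
            for j in range(i + min_len, len(name) + 1)]


# Fragments of both output directory names, as in the patch.
PATHS = output_dir_fragments("output-1") + output_dir_fragments("output-2")


def categorize(added, deleted):
    """Classify a pair of delta lists; return "not found" if nothing matches."""
    # zip() is the Python 3 counterpart of itertools.izip; like izip, it
    # stops at the end of the shorter list.
    for a, d in zip(added, deleted):
        # Assumption: all path fragments are checked before the Build ID test.
        if any(p in a or p in d for p in PATHS):
            return "Embedded Path"
        if "Build ID" in a or "Build ID" in d:
            return "Build ID variation"
    return "not found"
```

Because the deltas are paired with zip(), any extra lines in the longer of
the two lists are never inspected.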