From: Atharva Lele <itsatharva@gmail.com>
To: buildroot@busybox.net
Subject: [Buildroot] [PATCH v4 5/5] autobuild-run: initial implementation of categorization() of nonreproducibility
Date: Tue, 20 Aug 2019 20:22:31 +0530
Message-ID: <20190820145231.15507-5-itsatharva@gmail.com>
In-Reply-To: <20190820145231.15507-1-itsatharva@gmail.com>
Build ID and build path reproducibility issues are easy to identify, so we
start categorization with them.
Signed-off-by: Atharva Lele <itsatharva@gmail.com>
---
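Note for reviewers: the substring matching used by categorize() can be tried outside autobuild-run with the standalone sketch below. It is an illustration under stated assumptions, not the patch itself. It uses the built-in zip() so that it runs on Python 3 (izip exists only in Python 2's itertools). It also tests every path fragment before the Build ID check, whereas the patch interleaves the two checks inside the inner loop. The helper name substrings(), the PATHS constant, and the sample deltas are made up for the example.

```python
def substrings(name, min_len=3):
    """Return every contiguous substring of `name` with at least min_len characters."""
    return [name[i:j]
            for i in range(len(name))
            for j in range(i + min_len, len(name) + 1)]


# Output directory names of the two builds, as used in the patch.
PATHS = substrings("output-1") + substrings("output-2")


def categorize(added, deleted):
    """Classify a delta as an embedded path or a Build ID variation."""
    # Walk added and deleted lines pairwise; zip() stops at the shorter list.
    for a, d in zip(added, deleted):
        if any(p in a or p in d for p in PATHS):
            return "Embedded Path"
        if "Build ID" in a or "Build ID" in d:
            return "Build ID variation"
    return "not found"


# Hypothetical deltas, for illustration only.
print(categorize(["/home/user/output-1/build/foo"],
                 ["/home/user/output-2/build/foo"]))   # Embedded Path
print(categorize(["Build ID: 0123abcd"],
                 ["Build ID: 4567ef01"]))              # Build ID variation
print(categorize(["timestamp 1"], ["timestamp 2"]))    # not found
```

Each 8-character directory name yields 21 fragments. Short ones such as "put" can also match unrelated words, which is why the patch does not go below three characters.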
scripts/autobuild-run | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/scripts/autobuild-run b/scripts/autobuild-run
index c25413b..83acaad 100755
--- a/scripts/autobuild-run
+++ b/scripts/autobuild-run
@@ -131,6 +131,7 @@ import csv
import docopt
import errno
import hashlib
+from itertools import izip
import json
import mmap
import multiprocessing
@@ -641,6 +642,26 @@ class Builder:
if "source2" in l:
l.pop("source2")
+ def categorize(added, deleted):
+ # In some deltas, only part of the output directory name is captured,
+ # e.g. "put-1" or "tput-2", so we must check all such substrings.
+ # Start with 3-letter combinations to avoid false positives.
+ path_1 = "output-1"
+ path_2 = "output-2"
+ paths = [path_1[i:j] for i in range(len(path_1)) for j in range(i+3, len(path_1)+1)]
+ paths_2 = [path_2[i:j] for i in range(len(path_2)) for j in range(i+3, len(path_2)+1)]
+ paths = paths + paths_2
+ # We need to iterate over the deltas simultaneously.
+ for a, d in izip(added, deleted):
+ for p in paths:
+ if p in a or p in d:
+ return "Embedded Path"
+ if "Build ID" in a or "Build ID" in d:
+ return "Build ID variation"
+ else:
+ continue
+ return "not found"
+
packages_file_list = os.path.join(self.outputdir, "build", "packages-file-list.txt")
with open(reproducible_results, "r") as reproduciblef:
@@ -667,12 +688,18 @@ class Builder:
item_details["added"] = split_deltas[0][:100]
item_details["deleted"] = split_deltas[1][:100]
cleanup(item_details)
+ category = categorize(item_details["added"], item_details["deleted"])
+ if category != "not found":
+ item["category"] = category
+ break
else:
diff = item["unified_diff"].split("\n")
split_deltas = split_delta(diff)
item["added"] = split_deltas[0][:100]
item["deleted"] = split_deltas[1][:100]
cleanup(item)
+ if "added" in item and "deleted" in item:
+ item["category"] = categorize(item["added"], item["deleted"])
# We currently just set the reason from first non-reproducible package in the
# dictionary.
reason = json_data["details"][0]["package"]
--
2.22.0
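The second hunk attaches the category to each item of the parsed diffoscope output. The sketch below shows that flow in isolation, assuming an item is either a dictionary with a "details" list of sub-items or a leaf that carries its own "added" and "deleted" lists. The function name attach_category() and the sample items are hypothetical, and categorize() is reduced to a stand-in that only knows the Build ID rule.

```python
def categorize(added, deleted):
    # Stand-in for the patch's categorize(): only the Build ID rule is kept.
    for a, d in zip(added, deleted):
        if "Build ID" in a or "Build ID" in d:
            return "Build ID variation"
    return "not found"


def attach_category(item):
    """Set item["category"] from the first sub-item that can be categorized."""
    if "details" in item:
        for sub in item["details"]:
            category = categorize(sub.get("added", []), sub.get("deleted", []))
            # Compare strings by value; "is not" tests object identity and
            # only appears to work when the interpreter interns the literal.
            if category != "not found":
                item["category"] = category
                break
    else:
        # Leaf items always get a category, even if it is "not found".
        item["category"] = categorize(item.get("added", []),
                                      item.get("deleted", []))
    return item


# Hypothetical items, for illustration only.
nested = {"details": [
    {"added": ["x"], "deleted": ["y"]},
    {"added": ["Build ID: aa"], "deleted": ["Build ID: bb"]},
]}
print(attach_category(nested)["category"])                              # Build ID variation
print(attach_category({"added": ["x"], "deleted": ["y"]})["category"])  # not found
```

A nested item whose sub-items all come back as "not found" is left without a "category" key, which mirrors the first branch of the hunk.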
Thread overview: 16+ messages
2019-08-20 14:52 [Buildroot] [PATCH v4 1/5] autobuild-run: check if reproducibile_results exists before checking its size Atharva Lele
2019-08-20 14:52 ` [Buildroot] [PATCH v4 2/5] autobuild-run: initial implementation of get_reproducibility_failure_reason() Atharva Lele
2019-09-08 17:06 ` Arnout Vandecappelle
2019-09-08 22:42 ` Thomas Petazzoni
2019-09-09 7:35 ` Arnout Vandecappelle
2019-09-09 7:45 ` Thomas Petazzoni
2019-09-12 12:47 ` Atharva Lele
2019-09-14 17:27 ` Arnout Vandecappelle
2019-08-20 14:52 ` [Buildroot] [PATCH v4 3/5] autobuild-run: account for reproducibility failures in get_failure_reason() Atharva Lele
2019-09-08 17:13 ` Arnout Vandecappelle
2019-09-12 12:59 ` Atharva Lele
2019-09-14 17:33 ` Arnout Vandecappelle
2019-08-20 14:52 ` [Buildroot] [PATCH v4 4/5] autobuild-run: move with open to appropriate place in check_reproducibility() Atharva Lele
2019-08-20 14:52 ` Atharva Lele [this message]
2019-09-08 16:43 ` [Buildroot] [PATCH v4 1/5] autobuild-run: check if reproducibile_results exists before checking its size Arnout Vandecappelle
2019-09-12 12:00 ` Atharva Lele