From: Jacob Kroon <jacob.kroon@gmail.com>
To: Joshua Watt <jpewhacker@gmail.com>,
	openembedded-core@lists.openembedded.org
Cc: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
Subject: Re: [PATCH] classes/sstate: Update output hash
Date: Tue, 15 Jan 2019 21:16:57 +0100
Message-ID: <272c34a9-5806-eafa-7d2a-b44ef25d63cf@gmail.com>
In-Reply-To: <20190115193950.25538-1-JPEWhacker@gmail.com>
On 1/15/19 8:39 PM, Joshua Watt wrote:
> Updates the output hash calculation for determining if tasks are
> equivalent. The new algorithm does the following based on feedback:
> 1) All files are printed in a single line tabular format
> 2) Prints the file type and mode in a user-friendly ls-like format
> 3) Includes the file owner and group (by name, not ID). These are only
> included if the task is run under pseudo, since that is the only
> time they can be consistently determined.
> 4) File size is included for regular files
>
> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> ---
> meta/classes/sstate.bbclass | 91 +++++++++++++++++++++++++++++++------
> 1 file changed, 76 insertions(+), 15 deletions(-)
>
> diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
> index 482ffa83f98..a103a759825 100644
> --- a/meta/classes/sstate.bbclass
> +++ b/meta/classes/sstate.bbclass
> @@ -784,6 +784,8 @@ python sstate_sign_package () {
>  def OEOuthashBasic(path, sigfile, task, d):
>      import hashlib
>      import stat
> +    import pwd
> +    import grp
>
>      def update_hash(s):
>          s = s.encode('utf-8')
> @@ -793,6 +795,7 @@ def OEOuthashBasic(path, sigfile, task, d):
>
>      h = hashlib.sha256()
>      prev_dir = os.getcwd()
> +    include_owners = os.environ.get('PSEUDO_DISABLED') == '0'
>
>      try:
>          os.chdir(path)
> @@ -807,34 +810,92 @@ def OEOuthashBasic(path, sigfile, task, d):
>          update_hash("task=%s\n" % task)
>
>          for root, dirs, files in os.walk('.', topdown=True):
> -            # Sort directories and files to ensure consistent ordering
> +            # Sort directories to ensure consistent ordering when recursing
>              dirs.sort()
>              files.sort()
>
> -            for f in files:
> -                path = os.path.join(root, f)
> +            def process(path):
>                  s = os.lstat(path)
>
> -                # Hash file path
> -                update_hash(path + '\n')
> +                if stat.S_ISDIR(s.st_mode):
> +                    update_hash('d')
> +                elif stat.S_ISCHR(s.st_mode):
> +                    update_hash('c')
> +                elif stat.S_ISBLK(s.st_mode):
> +                    update_hash('b')
> +                elif stat.S_ISSOCK(s.st_mode):
> +                    update_hash('s')
> +                elif stat.S_ISLNK(s.st_mode):
> +                    update_hash('l')
> +                elif stat.S_ISFIFO(s.st_mode):
> +                    update_hash('p')
> +                else:
> +                    update_hash('-')
> +
> +                def add_perm(mask, on, off='-'):
> +                    if mask & s.st_mode:
> +                        update_hash(on)
> +                    else:
> +                        update_hash(off)
> +
> +                add_perm(stat.S_IRUSR, 'r')
> +                add_perm(stat.S_IWUSR, 'w')
> +                if stat.S_ISUID & s.st_mode:
> +                    add_perm(stat.S_IXUSR, 's', 'S')
> +                else:
> +                    add_perm(stat.S_IXUSR, 'x')
>
> -                # Hash file mode
> -                update_hash("\tmode=0x%x\n" % stat.S_IMODE(s.st_mode))
> -                update_hash("\ttype=0x%x\n" % stat.S_IFMT(s.st_mode))
> +                add_perm(stat.S_IRGRP, 'r')
> +                add_perm(stat.S_IWGRP, 'w')
> +                if stat.S_ISGID & s.st_mode:
> +                    add_perm(stat.S_IXGRP, 's', 'S')
> +                else:
> +                    add_perm(stat.S_IXGRP, 'x')
>
> -                if stat.S_ISBLK(s.st_mode) or stat.S_ISBLK(s.st_mode):
> -                    # Hash device major and minor
> -                    update_hash("\tdev=%d,%d\n" % (os.major(s.st_rdev), os.minor(s.st_rdev)))
> -                elif stat.S_ISLNK(s.st_mode):
> -                    # Hash symbolic link
> -                    update_hash("\tsymlink=%s\n" % os.readlink(path))
> +                add_perm(stat.S_IROTH, 'r')
> +                add_perm(stat.S_IWOTH, 'w')
> +                if stat.S_ISVTX & s.st_mode:
> +                    update_hash('t')
> +                else:
> +                    add_perm(stat.S_IXOTH, 'x')
> +
> +                if include_owners:
> +                    #update_hash(" %5d" % s.st_uid)
> +                    #update_hash(" %5d" % s.st_gid)
> +                    update_hash(" %10s" % pwd.getpwuid(s.st_uid).pw_name)
> +                    update_hash(" %10s" % grp.getgrgid(s.st_gid).gr_name)
> +
> +                if stat.S_ISBLK(s.st_mode) or stat.S_ISCHR(s.st_mode):
> +                    update_hash(" %9s" % ("%d.%d" % (os.major(s.st_rdev), os.minor(s.st_rdev))))
>                  else:
> +                    update_hash(" " * 10)
> +
> +                if stat.S_ISREG(s.st_mode):
> +                    update_hash(" %10d" % s.st_size)
> +                else:
> +                    update_hash(" " * 11)
> +
> +                update_hash(" %s" % path)
> +
> +                if stat.S_ISLNK(s.st_mode):
> +                    update_hash(" -> %s" % os.readlink(path))
> +
> +                if stat.S_ISREG(s.st_mode):
>                      fh = hashlib.sha256()
>                      # Hash file contents
>                      with open(path, 'rb') as d:
>                          for chunk in iter(lambda: d.read(4096), b""):
>                              fh.update(chunk)
> -                    update_hash("\tdigest=%s\n" % fh.hexdigest())
> +                    update_hash(" %s" % fh.hexdigest())
> +
> +                update_hash("\n")
> +
> +            # Process this directory and all its child files
> +            process(root)
> +            for f in files:
> +                if f == 'fixmepath':
> +                    continue
> +                process(os.path.join(root, f))
>      finally:
>          os.chdir(prev_dir)
>
>
Thanks for working on this, Joshua.
It looks really nice. This is an example of the busybox depsig.do_package I get with this patch applied:
drwxrwxr-x       root       root                      .
drwxr-xr-x       root       root                      ./package
drwxr-xr-x       root       root                      ./package/bin
lrwxrwxrwx       root       root                      ./package/bin/busybox -> busybox.nosuid
-rwxr-xr-x       root       root               551388 ./package/bin/busybox.nosuid b50144c6a810bf92cbd442fd6f55794b6cdc8625a46f55c2e9d86ad22d75134a
-rwsr-xr-x       root       root                50860 ./package/bin/busybox.suid eb7af7e8f9e4a5bf6be7fb5ac16064ccd8f35e9890661134c5d73efbeb6e1d44
lrwxrwxrwx       root       root                      ./package/bin/sh -> busybox.nosuid
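As an aside, the type-and-mode column in this listing looks like what Python's stat.filemode() (available since Python 3.3) produces. A minimal sketch, assuming the sample modes below are representative. It is not part of the patch, and the two can differ in corner cases: filemode() prints 'T' for a sticky bit without execute permission, while the patch always prints 't'.

```python
import stat

# Sample modes chosen for illustration only (they mirror the entry
# types in the listing above, not values read from a real sysroot).
modes = [
    stat.S_IFDIR | 0o775,                 # directory
    stat.S_IFLNK | 0o777,                 # symbolic link
    stat.S_IFREG | 0o755,                 # regular executable
    stat.S_IFREG | stat.S_ISUID | 0o755,  # setuid executable
]

# stat.filemode() renders the file type and permission bits ls-style.
rendered = [stat.filemode(m) for m in modes]

for r in rendered:
    print(r)
```

With these inputs the sketch should print drwxrwxr-x, lrwxrwxrwx, -rwxr-xr-x and -rwsr-xr-x, one per line.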
How about putting the hashes first, and printing just spaces (or maybe '000...') for symlinks and directories?
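A rough sketch of one possible hash-first layout, just to illustrate the idea. The helper name format_line, its parameters, and the choice of a space-filled placeholder are all assumptions made for this example and do not come from the patch:

```python
import hashlib

# Width of a hex-encoded SHA-256 digest (64 characters).
DIGEST_WIDTH = hashlib.sha256().digest_size * 2


def format_line(digest, mode, owner, group, size, path, target=None):
    """Hypothetical hash-first line layout.

    digest and size are None for entries that are not regular files;
    target is the link destination for symlinks.
    """
    # Pad with spaces when there is no content digest, so that every
    # following column starts at the same offset on every line.
    hash_col = digest if digest is not None else " " * DIGEST_WIDTH
    size_col = "%10d" % size if size is not None else " " * 10
    line = "%s %s %10s %10s %s %s" % (hash_col, mode, owner, group, size_col, path)
    if target is not None:
        line += " -> %s" % target
    return line


lines = [
    format_line(None, "drwxr-xr-x", "root", "root", None, "./package/bin"),
    format_line(
        "b50144c6a810bf92cbd442fd6f55794b6cdc8625a46f55c2e9d86ad22d75134a",
        "-rwxr-xr-x", "root", "root", 551388, "./package/bin/busybox.nosuid",
    ),
    format_line(None, "lrwxrwxrwx", "root", "root", None,
                "./package/bin/sh", "busybox.nosuid"),
]

print("\n".join(lines))
```

With the fixed-width digest in front, the variable-length path (and any symlink target) ends up last on the line, so all the other columns stay aligned.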
My 2 cents.
/Jacob