Openembedded Core Discussions
 help / color / mirror / Atom feed
From: Jacob Kroon <jacob.kroon@gmail.com>
To: Joshua Watt <jpewhacker@gmail.com>,
	openembedded-core@lists.openembedded.org
Cc: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
Subject: Re: [PATCH] classes/sstate: Update output hash
Date: Tue, 15 Jan 2019 21:16:57 +0100	[thread overview]
Message-ID: <272c34a9-5806-eafa-7d2a-b44ef25d63cf@gmail.com> (raw)
In-Reply-To: <20190115193950.25538-1-JPEWhacker@gmail.com>



On 1/15/19 8:39 PM, Joshua Watt wrote:
> Updates the output hash calculation for determining if tasks are
> equivalent. The new algorithm does the following based on feedback:
>   1) All files are printed in a single line tabular format
>   2) Prints the file type and mode in a user-friendly ls-like format
>   3) Includes the file owner and group (by name, not ID). These are only
>      included if the task is run under pseudo, since that is the only
>      time they can be consistently determined.
>   4) File size is included for regular files
> 
> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> ---
>   meta/classes/sstate.bbclass | 91 +++++++++++++++++++++++++++++++------
>   1 file changed, 76 insertions(+), 15 deletions(-)
> 
> diff --git a/meta/classes/sstate.bbclass b/meta/classes/sstate.bbclass
> index 482ffa83f98..a103a759825 100644
> --- a/meta/classes/sstate.bbclass
> +++ b/meta/classes/sstate.bbclass
> @@ -784,6 +784,8 @@ python sstate_sign_package () {
>   def OEOuthashBasic(path, sigfile, task, d):
>       import hashlib
>       import stat
> +    import pwd
> +    import grp
>   
>       def update_hash(s):
>           s = s.encode('utf-8')
> @@ -793,6 +795,7 @@ def OEOuthashBasic(path, sigfile, task, d):
>   
>       h = hashlib.sha256()
>       prev_dir = os.getcwd()
> +    include_owners = os.environ.get('PSEUDO_DISABLED') == '0'
>   
>       try:
>           os.chdir(path)
> @@ -807,34 +810,92 @@ def OEOuthashBasic(path, sigfile, task, d):
>           update_hash("task=%s\n" % task)
>   
>           for root, dirs, files in os.walk('.', topdown=True):
> -            # Sort directories and files to ensure consistent ordering
> +            # Sort directories to ensure consistent ordering when recursing
>               dirs.sort()
>               files.sort()
>   
> -            for f in files:
> -                path = os.path.join(root, f)
> +            def process(path):
>                   s = os.lstat(path)
>   
> -                # Hash file path
> -                update_hash(path + '\n')
> +                if stat.S_ISDIR(s.st_mode):
> +                    update_hash('d')
> +                elif stat.S_ISCHR(s.st_mode):
> +                    update_hash('c')
> +                elif stat.S_ISBLK(s.st_mode):
> +                    update_hash('b')
> +                elif stat.S_ISSOCK(s.st_mode):
> +                    update_hash('s')
> +                elif stat.S_ISLNK(s.st_mode):
> +                    update_hash('l')
> +                elif stat.S_ISFIFO(s.st_mode):
> +                    update_hash('p')
> +                else:
> +                    update_hash('-')
> +
> +                def add_perm(mask, on, off='-'):
> +                    if mask & s.st_mode:
> +                        update_hash(on)
> +                    else:
> +                        update_hash(off)
> +
> +                add_perm(stat.S_IRUSR, 'r')
> +                add_perm(stat.S_IWUSR, 'w')
> +                if stat.S_ISUID & s.st_mode:
> +                    add_perm(stat.S_IXUSR, 's', 'S')
> +                else:
> +                    add_perm(stat.S_IXUSR, 'x')
>   
> -                # Hash file mode
> -                update_hash("\tmode=0x%x\n" % stat.S_IMODE(s.st_mode))
> -                update_hash("\ttype=0x%x\n" % stat.S_IFMT(s.st_mode))
> +                add_perm(stat.S_IRGRP, 'r')
> +                add_perm(stat.S_IWGRP, 'w')
> +                if stat.S_ISGID & s.st_mode:
> +                    add_perm(stat.S_IXGRP, 's', 'S')
> +                else:
> +                    add_perm(stat.S_IXGRP, 'x')
>   
> -                if stat.S_ISBLK(s.st_mode) or stat.S_ISBLK(s.st_mode):
> -                    # Hash device major and minor
> -                    update_hash("\tdev=%d,%d\n" % (os.major(s.st_rdev), os.minor(s.st_rdev)))
> -                elif stat.S_ISLNK(s.st_mode):
> -                    # Hash symbolic link
> -                    update_hash("\tsymlink=%s\n" % os.readlink(path))
> +                add_perm(stat.S_IROTH, 'r')
> +                add_perm(stat.S_IWOTH, 'w')
> +                if stat.S_ISVTX & s.st_mode:
> +                    update_hash('t')
> +                else:
> +                    add_perm(stat.S_IXOTH, 'x')
> +
> +                if include_owners:
> +                    #update_hash(" %5d" % s.st_uid)
> +                    #update_hash(" %5d" % s.st_gid)
> +                    update_hash(" %10s" % pwd.getpwuid(s.st_uid).pw_name)
> +                    update_hash(" %10s" % grp.getgrgid(s.st_gid).gr_name)
> +
> +                if stat.S_ISBLK(s.st_mode) or stat.S_ISCHR(s.st_mode):
> +                    update_hash(" %9s" % ("%d.%d" % (os.major(s.st_rdev), os.minor(s.st_rdev))))
>                   else:
> +                    update_hash(" " * 10)
> +
> +                if stat.S_ISREG(s.st_mode):
> +                    update_hash(" %10d" % s.st_size)
> +                else:
> +                    update_hash(" " * 11)
> +
> +                update_hash(" %s" % path)
> +
> +                if stat.S_ISLNK(s.st_mode):
> +                    update_hash(" -> %s" % os.readlink(path))
> +
> +                if stat.S_ISREG(s.st_mode):
>                       fh = hashlib.sha256()
>                       # Hash file contents
>                       with open(path, 'rb') as d:
>                           for chunk in iter(lambda: d.read(4096), b""):
>                               fh.update(chunk)
> -                    update_hash("\tdigest=%s\n" % fh.hexdigest())
> +                    update_hash(" %s" % fh.hexdigest())
> +
> +                update_hash("\n")
> +
> +            # Process this directory and all its child files
> +            process(root)
> +            for f in files:
> +                if f == 'fixmepath':
> +                    continue
> +                process(os.path.join(root, f))
>       finally:
>           os.chdir(prev_dir)
>   
> 

Thanks for working on this Joshua.

It looks really nice, this is an example of the busybox depsig.do_package I get with this patch applied:

drwxrwxr-x       root       root                      .
drwxr-xr-x       root       root                      ./package
drwxr-xr-x       root       root                      ./package/bin
lrwxrwxrwx       root       root                      ./package/bin/busybox -> busybox.nosuid
-rwxr-xr-x       root       root               551388 ./package/bin/busybox.nosuid b50144c6a810bf92cbd442fd6f55794b6cdc8625a46f55c2e9d86ad22d75134a
-rwsr-xr-x       root       root                50860 ./package/bin/busybox.suid eb7af7e8f9e4a5bf6be7fb5ac16064ccd8f35e9890661134c5d73efbeb6e1d44
lrwxrwxrwx       root       root                      ./package/bin/sh -> busybox.nosuid

How about prepending the hashes first, printing just spaces or maybe doing '000...' for symlinks and directories ?

My 2 cents..
/Jacob


  reply	other threads:[~2019-01-15 20:16 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-15 19:39 [PATCH] classes/sstate: Update output hash Joshua Watt
2019-01-15 20:16 ` Jacob Kroon [this message]
2019-01-15 20:49 ` Jacob Kroon
2019-01-15 22:00 ` Richard Purdie
2019-01-21 22:39 ` [PATCH v2] " Joshua Watt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=272c34a9-5806-eafa-7d2a-b44ef25d63cf@gmail.com \
    --to=jacob.kroon@gmail.com \
    --cc=jpewhacker@gmail.com \
    --cc=openembedded-core@lists.openembedded.org \
    --cc=peter.kjellerstedt@axis.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox