public inbox for openembedded-core@lists.openembedded.org
From: Randolph Sapp <rs@ti.com>
To: <rs@ti.com>, <raj.khem@gmail.com>,
	<richard.purdie@linuxfoundation.org>,
	<mathieu.dubois-briand@bootlin.com>, <alex@linutronix.de>,
	<otavio@ossystems.com.br>, <kexin.hao@windriver.com>
Cc: <afd@ti.com>, <detheridge@ti.com>, <denis@denix.org>,
	<reatmon@ti.com>, <openembedded-core@lists.openembedded.org>,
	<vijayp@ti.com>
Subject: Re: [oe-core][PATCH] reproducible: fix git SOURCE_DATE_EPOCH randomness
Date: Thu, 12 Feb 2026 19:36:59 -0600
Message-ID: <DGDG6LFLTR7X.2TCK86RQ93ME@ti.com>
In-Reply-To: <1893A7AF46371281.653184@lists.openembedded.org>

On Thu Feb 12, 2026 at 6:42 PM CST, Randolph Sapp via lists.openembedded.org wrote:
> From: Randolph Sapp <rs@ti.com>
>
> When a recipe defines multiple git sources, the SOURCE_DATE_EPOCH for
> the package should be the largest commit timestamp among them.
>
> The previous implementation introduced some randomness: it stopped at
> the first git repository reported by os.walk, which does not guarantee
> any particular ordering.
>
> Signed-off-by: Randolph Sapp <rs@ti.com>
> ---
>
> To address issue reported here:
> https://lists.openembedded.org/g/openembedded-core/message/231076
>
>  meta/lib/oe/reproducible.py | 64 ++++++++++++++++---------------------
>  1 file changed, 28 insertions(+), 36 deletions(-)
>
> diff --git a/meta/lib/oe/reproducible.py b/meta/lib/oe/reproducible.py
> index 0270024a83..06ceda8d7f 100644
> --- a/meta/lib/oe/reproducible.py
> +++ b/meta/lib/oe/reproducible.py
> @@ -3,6 +3,7 @@
>  #
>  # SPDX-License-Identifier: GPL-2.0-only
>  #
> +import glob
>  import os
>  import subprocess
>  import bb
> @@ -74,52 +75,43 @@ def get_source_date_epoch_from_known_files(d, sourcedir):
>          bb.debug(1, "SOURCE_DATE_EPOCH taken from: %s" % newest_file)
>      return source_date_epoch
>
> -def find_git_folder(d, sourcedir):
> -    # First guess: UNPACKDIR/BB_GIT_DEFAULT_DESTSUFFIX
> -    # This is the default git fetcher unpack path
> +def find_git_folders(d, sourcedir):
>      unpackdir = d.getVar('UNPACKDIR')
> -    default_destsuffix = d.getVar('BB_GIT_DEFAULT_DESTSUFFIX')
> -    gitpath = os.path.join(unpackdir, default_destsuffix, ".git")
> -    if os.path.isdir(gitpath):
> -        return gitpath
> -
> -    # Second guess: ${S}
> -    gitpath = os.path.join(sourcedir, ".git")
> -    if os.path.isdir(gitpath):
> -        return gitpath
> -
> -    # Perhaps there was a subpath or destsuffix specified.
> -    # Go looking in the UNPACKDIR
> -    for root, dirs, files in os.walk(unpackdir, topdown=True):
> -        if '.git' in dirs:
> -            return os.path.join(root, ".git")
> +    git_folders = []
>
> -    for root, dirs, files in os.walk(sourcedir, topdown=True):
> -        if '.git' in dirs:
> -            return os.path.join(root, ".git")
> +    for mainpath in (sourcedir, unpackdir):
> +        gitpath_glob = os.path.join(mainpath, "**/.git/")
> +        for gitpath in glob.glob(gitpath_glob, recursive=True):
> +            git_folders.append(gitpath)

I honestly don't know whether recursively searching S and UNPACKDIR is any
worse than instantiating a new fetcher and walking the SRC_URI and destsuffix
values directly. Both feel a little heavy-handed.
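
For reference, here is a minimal standalone sketch of the recursive-search
side of that trade-off. The helper name find_git_dirs and its arguments are
hypothetical and not part of oe.reproducible; it assumes plain directory paths
standing in for S and UNPACKDIR. Unlike the loop quoted above, it de-duplicates
hits (when S is inside UNPACKDIR the same repository is found twice) and sorts
them so the result does not depend on filesystem ordering:

```python
import glob
import os


def find_git_dirs(*roots):
    """Return a sorted, de-duplicated list of .git directories under roots.

    Hypothetical helper for illustration; not part of oe.reproducible.
    """
    found = set()
    for root in roots:
        # "**" matches zero or more nested (non-hidden) directories, so this
        # also catches root/.git itself.
        pattern = os.path.join(glob.escape(root), "**", ".git")
        for path in glob.glob(pattern, recursive=True):
            # Skip ".git" files, e.g. the gitdir pointers used by submodules.
            if os.path.isdir(path):
                found.add(os.path.realpath(path))
    # Sorting makes the order independent of how the filesystem lists entries.
    return sorted(found)
```

Called as find_git_dirs(sourcedir, unpackdir), this still walks both trees in
full, so it does not settle whether the search is cheaper than asking the
fetcher; it only shows that the search itself can be made deterministic.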

> -    bb.warn("Failed to find a git repository in UNPACKDIR: %s" % unpackdir)
> -    return None
> +    if not git_folders:
> +        bb.warn("Failed to find any git repository in UNPACKDIR or S")
> +
> +    return git_folders
>
>  def get_source_date_epoch_from_git(d, sourcedir):
>      if not "git://" in d.getVar('SRC_URI') and not "gitsm://" in d.getVar('SRC_URI'):
>          return None
>
> -    gitpath = find_git_folder(d, sourcedir)
> -    if not gitpath:
> -        return None
> +    # Get an epoch from all valid git repositories
> +    sources_dates = []
> +    for gitpath in find_git_folders(d, sourcedir):
> +        # Check that the repository has a valid HEAD; it may not if subdir is used
> +        # in SRC_URI
> +        p = subprocess.run(['git', '--git-dir', gitpath, 'rev-parse', 'HEAD'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
> +        if p.returncode != 0:
> +            bb.debug(1, "%s does not have a valid HEAD: %s" % (gitpath, p.stdout.decode('utf-8')))
> +            continue
>
> -    # Check that the repository has a valid HEAD; it may not if subdir is used
> -    # in SRC_URI
> -    p = subprocess.run(['git', '--git-dir', gitpath, 'rev-parse', 'HEAD'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
> -    if p.returncode != 0:
> -        bb.debug(1, "%s does not have a valid HEAD: %s" % (gitpath, p.stdout.decode('utf-8')))
> -        return None
> +        bb.debug(1, "git repository: %s" % gitpath)
> +        p = subprocess.run(['git', '-c', 'log.showSignature=false', '--git-dir', gitpath, 'log', '-1', '--pretty=%ct'],
> +                           check=True, stdout=subprocess.PIPE)
> +        sources_dates.append(int(p.stdout.decode('utf-8')))
> +
> +    if sources_dates:
> +        return sorted(sources_dates, reverse=True)[0]
>
> -    bb.debug(1, "git repository: %s" % gitpath)
> -    p = subprocess.run(['git', '-c', 'log.showSignature=false', '--git-dir', gitpath, 'log', '-1', '--pretty=%ct'],
> -                       check=True, stdout=subprocess.PIPE)
> -    return int(p.stdout.decode('utf-8'))
> +    return None
>
>  def get_source_date_epoch_from_youngest_file(d, sourcedir):
>      if sourcedir == d.getVar('UNPACKDIR'):
> --
> 2.52.0
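
On the final selection step: the patch sorts the collected timestamps in
reverse and takes the first element, then falls through to None when the list
is empty. A minimal sketch of an equivalent form, assuming the timestamps are
plain integers (the function name newest_epoch is hypothetical and does not
appear in the patch):

```python
from typing import Iterable, Optional


def newest_epoch(epochs: Iterable[int]) -> Optional[int]:
    """Return the largest commit timestamp, or None when there are none.

    Hypothetical helper for illustration; not part of oe.reproducible.
    """
    # max() with default=None covers the "no valid repository" case without
    # a separate emptiness check or a full sort.
    return max(epochs, default=None)
```

The result is the same regardless of the order in which the repositories were
discovered, which is the property the commit message is after.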

