From: Paul Barker <paul@pbarker.dev>
To: rs@ti.com, raj.khem@gmail.com,
richard.purdie@linuxfoundation.org,
mathieu.dubois-briand@bootlin.com, alex@linutronix.de,
otavio@ossystems.com.br, kexin.hao@windriver.com
Cc: afd@ti.com, detheridge@ti.com, denis@denix.org, reatmon@ti.com,
openembedded-core@lists.openembedded.org, vijayp@ti.com
Subject: Re: [oe-core][PATCHv2] reproducible: fix git SOURCE_DATE_EPOCH randomness
Date: Thu, 19 Feb 2026 15:01:38 +0000
Message-ID: <e745402979f61ca6710e44b2116c6485ca0484ff.camel@pbarker.dev>
In-Reply-To: <20260217200105.2234389-2-rs@ti.com>
On Tue, 2026-02-17 at 14:01 -0600, Randolph Sapp via
lists.openembedded.org wrote:
> From: Randolph Sapp <rs@ti.com>
>
> Anything that defines multiple git sources should have the largest value
> taken when calculating the SOURCE_DATE_EPOCH for a package.
>
> The previous iteration actually introduced some degree of randomness, as
> it would stop on the first git repository reported by os.walk, which
> does not assure any specific ordering by default.
>
> Signed-off-by: Randolph Sapp <rs@ti.com>
> ---
>
> v2: Use os.walk method as opposed to glob to avoid infinite recursion when
> navigating symbolic links
>
> meta/lib/oe/reproducible.py | 63 ++++++++++++++++---------------------
> 1 file changed, 27 insertions(+), 36 deletions(-)
>
> diff --git a/meta/lib/oe/reproducible.py b/meta/lib/oe/reproducible.py
> index 0270024a83..c58db48fb1 100644
> --- a/meta/lib/oe/reproducible.py
> +++ b/meta/lib/oe/reproducible.py
> @@ -74,52 +74,43 @@ def get_source_date_epoch_from_known_files(d, sourcedir):
> bb.debug(1, "SOURCE_DATE_EPOCH taken from: %s" % newest_file)
> return source_date_epoch
>
> -def find_git_folder(d, sourcedir):
> - # First guess: UNPACKDIR/BB_GIT_DEFAULT_DESTSUFFIX
> - # This is the default git fetcher unpack path
> +def find_git_folders(d, sourcedir):
> unpackdir = d.getVar('UNPACKDIR')
> - default_destsuffix = d.getVar('BB_GIT_DEFAULT_DESTSUFFIX')
> - gitpath = os.path.join(unpackdir, default_destsuffix, ".git")
> - if os.path.isdir(gitpath):
> - return gitpath
> -
> - # Second guess: ${S}
> - gitpath = os.path.join(sourcedir, ".git")
> - if os.path.isdir(gitpath):
> - return gitpath
> -
> - # Perhaps there was a subpath or destsuffix specified.
> - # Go looking in the UNPACKDIR
> - for root, dirs, files in os.walk(unpackdir, topdown=True):
> - if '.git' in dirs:
> - return os.path.join(root, ".git")
> + git_folders = []
>
> - for root, dirs, files in os.walk(sourcedir, topdown=True):
> - if '.git' in dirs:
> - return os.path.join(root, ".git")
> + for mainpath in (sourcedir, unpackdir):
> + for root, dirs, _ in os.walk(mainpath, topdown=True):
> + if ".git" in dirs:
Do we need to add handling for git submodules? In submodules, '.git' is
a file instead of a directory.
> + git_folders.append(os.path.join(root, ".git"))
We should change this to `git_folders.append(root)` (see below).
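For illustration, a minimal sketch of how these two suggestions might fit
together: it accepts '.git' as either a directory or a file and records the
repository root rather than the '.git' path. The name find_git_roots and its
signature are hypothetical and not part of the patch; it takes plain paths
instead of the datastore 'd' so that it runs on its own.

```python
import os

def find_git_roots(*search_paths):
    """Return the sorted, de-duplicated repository roots under search_paths.

    Hypothetical helper for illustration only; not the patch's API.
    """
    roots = set()
    for mainpath in search_paths:
        for root, dirs, files in os.walk(mainpath, topdown=True):
            # A regular checkout has a '.git' directory; a submodule
            # checkout has a '.git' file that points at the real git dir.
            if ".git" in dirs or ".git" in files:
                roots.add(root)
            # Do not descend into the '.git' directory itself.
            dirs[:] = [name for name in dirs if name != ".git"]
    # Sorting makes the result independent of os.walk ordering and of
    # overlap between the search paths (S usually lives under UNPACKDIR).
    return sorted(roots)
```

Using a set also avoids querying the same repository twice when sourcedir
and unpackdir overlap.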
>
> - bb.warn("Failed to find a git repository in UNPACKDIR: %s" % unpackdir)
> - return None
> + if not git_folders:
> + bb.warn("Failed to find any git repository in UNPACKDIR or S")
> +
> + return git_folders
>
> def get_source_date_epoch_from_git(d, sourcedir):
> if not "git://" in d.getVar('SRC_URI') and not "gitsm://" in d.getVar('SRC_URI'):
> return None
>
> - gitpath = find_git_folder(d, sourcedir)
> - if not gitpath:
> - return None
> + # Get an epoch from all valid git repositoies
> + sources_dates = []
> + for gitpath in find_git_folders(d, sourcedir):
> + # Check that the repository has a valid HEAD; it may not if subdir is used
> + # in SRC_URI
> + p = subprocess.run(['git', '--git-dir', gitpath, 'rev-parse', 'HEAD'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
Using '--git-dir' does not set the path to the worktree correctly: with
only '--git-dir' given, git treats the current working directory as the
worktree. This may work, but it's fragile. While we're modifying things
here, can we change find_git_folders() to return the paths of the
repository roots instead of the .git directories? Then we can use
'git -C path ...' here, which is much less likely to cause issues in the
future.
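A rough sketch of the 'git -C' variant, assuming the caller passes a
repository root rather than a '.git' path. The helper names git_cmd and
head_commit_time are made up for this example, and the bb.debug() calls from
the patch are left out so the snippet stands alone.

```python
import subprocess

def git_cmd(repo_root, *args):
    # '-C' makes git behave as if it had been started in repo_root, so the
    # git dir and the worktree are both discovered the usual way.
    return ["git", "-C", repo_root, "-c", "log.showSignature=false", *args]

def head_commit_time(repo_root):
    """Return the committer timestamp of HEAD, or None without a valid HEAD."""
    p = subprocess.run(git_cmd(repo_root, "rev-parse", "HEAD"),
                       stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    if p.returncode != 0:
        # For example, an empty repository created when subdir is used.
        return None
    p = subprocess.run(git_cmd(repo_root, "log", "-1", "--pretty=%ct"),
                       check=True, stdout=subprocess.PIPE)
    return int(p.stdout.decode("utf-8"))
```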
> + if p.returncode != 0:
> + bb.debug(1, "%s does not have a valid HEAD: %s" % (gitpath, p.stdout.decode('utf-8')))
> + continue
>
> - # Check that the repository has a valid HEAD; it may not if subdir is used
> - # in SRC_URI
> - p = subprocess.run(['git', '--git-dir', gitpath, 'rev-parse', 'HEAD'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
> - if p.returncode != 0:
> - bb.debug(1, "%s does not have a valid HEAD: %s" % (gitpath, p.stdout.decode('utf-8')))
> - return None
> + bb.debug(1, "git repository: %s" % gitpath)
> + p = subprocess.run(['git', '-c', 'log.showSignature=false', '--git-dir', gitpath, 'log', '-1', '--pretty=%ct'],
> + check=True, stdout=subprocess.PIPE)
> + sources_dates.append(int(p.stdout.decode('utf-8')))
> +
> + if sources_dates:
> + return sorted(sources_dates, reverse=True)[0]
Can we use `max(sources_dates)` here?
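As a small sketch of what I mean (newest_epoch is a hypothetical name, and
returning None for an empty list is my assumption, mirroring the other
early returns in this function):

```python
def newest_epoch(sources_dates):
    # max() picks the largest timestamp in one pass without building a
    # sorted copy; default=None covers the case where no repository
    # yielded a timestamp.
    return max(sources_dates, default=None)
```

For a non-empty list this gives the same value as
sorted(sources_dates, reverse=True)[0].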
Best regards,
--
Paul Barker