* [PATCH v1 0/2] oe: Fix build failures with multiple git SRC_URI entries
@ 2026-05-15 9:36 Jamin Lin
2026-05-15 9:36 ` [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple " Jamin Lin
2026-05-15 9:36 ` [PATCH v1 2/2] reproducible: Handle nested git repos in find_git_repositories Jamin Lin
0 siblings, 2 replies; 11+ messages in thread
From: Jamin Lin @ 2026-05-15 9:36 UTC
To: openembedded-core@lists.openembedded.org; +Cc: Troy Lee, Jamin Lin, Vince Chang
Some recipes (e.g. Zephyr-based) use multiple git SRC_URI entries with
different destsuffix values, causing each source to be unpacked into a
separate subdirectory of EXTERNALSRC that retains its own .git directory.
These nested git repositories trigger two independent failures:
1. externalsrc.bbclass: 'git add -A .' exits with code 128 during
srctree_hash_files(), halting the bitbake parse phase.
2. oe/reproducible.py: 'git log -1' exits with code 128 inside a nested
repo found by find_git_repositories(), aborting do_unpack.
Jamin Lin (2):
externalsrc: Handle nested git repos from multiple SRC_URI entries
reproducible: Handle nested git repos in find_git_repositories
meta/classes/externalsrc.bbclass | 37 +++++++++++++++++++++++++++++++-
meta/lib/oe/reproducible.py | 6 +++++-
2 files changed, 41 insertions(+), 2 deletions(-)
--
2.43.0
* [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple SRC_URI entries
2026-05-15 9:36 [PATCH v1 0/2] oe: Fix build failures with multiple git SRC_URI entries Jamin Lin
@ 2026-05-15 9:36 ` Jamin Lin
2026-05-18 19:16 ` Paul Barker
2026-05-15 9:36 ` [PATCH v1 2/2] reproducible: Handle nested git repos in find_git_repositories Jamin Lin
1 sibling, 1 reply; 11+ messages in thread
From: Jamin Lin @ 2026-05-15 9:36 UTC
To: openembedded-core@lists.openembedded.org; +Cc: Troy Lee, Jamin Lin, Vince Chang
When a recipe uses multiple git SRC_URI entries with different
destsuffix values (e.g. Zephyr-based recipes with separate repos for
the kernel, modules, and application), each source is unpacked into a
subdirectory of EXTERNALSRC that retains its own .git directory.
srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
which fails with exit code 128 when git encounters these unregistered
nested git repositories, halting the bitbake parse phase.
Fix by scanning for nested git repos before the add. If any are found,
exclude them from the top-level 'git add' using pathspec magic
':(exclude)<path>' and hash each nested repo independently using a
temporary index. This ensures changes in any nested repo still trigger
do_compile/do_configure to re-run.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
---
meta/classes/externalsrc.bbclass | 37 +++++++++++++++++++++++++++++++-
1 file changed, 36 insertions(+), 1 deletion(-)
diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
index 902ff2604f..0dd57af668 100644
--- a/meta/classes/externalsrc.bbclass
+++ b/meta/classes/externalsrc.bbclass
@@ -234,8 +234,43 @@ def srctree_hash_files(d, srcdir=None):
# Update our custom index
env = os.environ.copy()
env['GIT_INDEX_FILE'] = tmp_index.name
- subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
+ # Find nested git repos created by multiple SRC_URI git entries with
+ # different destsuffix values. git add -A . exits 128 when it encounters
+ # these unregistered nested repos.
+ nested_git_dirs = []
+ for root, dirs, files in os.walk(s_dir):
+ if root == s_dir:
+ continue
+ if '.git' in dirs or '.git' in files:
+ nested_git_dirs.append(root)
+ dirs[:] = [] # don't recurse into nested repos
+ if nested_git_dirs:
+ excludes = [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
+ subprocess.check_output(['git', 'add', '-A', '.'] + excludes, cwd=s_dir, env=env)
+ else:
+ subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
git_sha1 = subprocess.check_output(['git', 'write-tree'], cwd=s_dir, env=env).decode("utf-8")
+ # Hash each nested git repo separately so source changes there still
+ # trigger do_compile/do_configure to re-run.
+ for nested in nested_git_dirs:
+ nested_git = os.path.join(nested, '.git')
+ if not os.path.isdir(nested_git):
+ continue
+ with tempfile.NamedTemporaryFile(prefix='oe-devtool-nested-index') as nested_tmp:
+ nested_index = os.path.join(nested_git, 'index')
+ if os.path.exists(nested_index):
+ shutil.copyfile(nested_index, nested_tmp.name)
+ nested_env = os.environ.copy()
+ nested_env['GIT_INDEX_FILE'] = nested_tmp.name
+ proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=nested,
+ env=nested_env, stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL)
+ proc.communicate()
+ proc = subprocess.Popen(['git', 'write-tree'], cwd=nested,
+ env=nested_env, stdout=subprocess.PIPE,
+ stderr=subprocess.DEVNULL)
+ stdout, _ = proc.communicate()
+ git_sha1 += stdout.decode("utf-8")
if os.path.exists(os.path.join(s_dir, ".gitmodules")) and os.path.getsize(os.path.join(s_dir, ".gitmodules")) > 0:
submodule_helper = subprocess.check_output(["git", "config", "--file", ".gitmodules", "--get-regexp", "path"], cwd=s_dir, env=env).decode("utf-8")
for line in submodule_helper.splitlines():
--
2.43.0
* [PATCH v1 2/2] reproducible: Handle nested git repos in find_git_repositories
2026-05-15 9:36 [PATCH v1 0/2] oe: Fix build failures with multiple git SRC_URI entries Jamin Lin
2026-05-15 9:36 ` [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple " Jamin Lin
@ 2026-05-15 9:36 ` Jamin Lin
2026-05-18 19:22 ` Paul Barker
1 sibling, 1 reply; 11+ messages in thread
From: Jamin Lin @ 2026-05-15 9:36 UTC
To: openembedded-core@lists.openembedded.org; +Cc: Troy Lee, Jamin Lin, Vince Chang
When EXTERNALSRC contains multiple nested git repositories (from
multiple SRC_URI git entries with different destsuffix values),
find_git_repositories() walks into sub-repos and
get_source_date_epoch_from_git() subsequently fails with exit code 128
when running 'git log -1' inside them.
Two fixes:
- Stop os.walk recursion when a .git entry is found (dirs[:] = []) to
avoid descending into nested repos.
- Change 'git log -1' from check=True to check=False with explicit
error handling, so a failing nested repo is skipped gracefully
instead of raising CalledProcessError and aborting do_unpack.
Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
---
meta/lib/oe/reproducible.py | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/meta/lib/oe/reproducible.py b/meta/lib/oe/reproducible.py
index a80376010a..6bb25da55a 100644
--- a/meta/lib/oe/reproducible.py
+++ b/meta/lib/oe/reproducible.py
@@ -82,6 +82,7 @@ def find_git_repositories(d, sourcedir):
for root, dirs, files in os.walk(mainpath, topdown=True):
if '.git' in dirs or '.git' in files:
git_repositories.append(root)
+ dirs[:] = [] # don't recurse into nested git repos (multiple SRC_URI destsuffix)
if not git_repositories:
bb.warn('Failed to find any git repositories in UNPACKDIR or S')
@@ -105,7 +106,10 @@ def get_source_date_epoch_from_git(d, sourcedir):
bb.debug(1, "git repository: %s" % repo_path)
p = subprocess.run(['git', '-C', repo_path, 'log', '-1', '--pretty=%ct'],
- check=True, stdout=subprocess.PIPE)
+ check=False, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+ if p.returncode != 0:
+ bb.debug(1, "git log failed for %s (exit %d): %s" % (repo_path, p.returncode, p.stdout.decode('utf-8')))
+ continue
source_dates.append(int(p.stdout.decode('utf-8')))
if source_dates:
--
2.43.0
* Re: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple SRC_URI entries
2026-05-15 9:36 ` [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple " Jamin Lin
@ 2026-05-18 19:16 ` Paul Barker
2026-05-19 6:28 ` Jamin Lin
2026-05-19 7:07 ` Jamin Lin
0 siblings, 2 replies; 11+ messages in thread
From: Paul Barker @ 2026-05-18 19:16 UTC
To: Jamin Lin, openembedded-core@lists.openembedded.org; +Cc: Troy Lee, Vince Chang
On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> When a recipe uses multiple git SRC_URI entries with different
> destsuffix values (e.g. Zephyr-based recipes with separate repos for
> the kernel, modules, and application), each source is unpacked into a
> subdirectory of EXTERNALSRC that retains its own .git directory.
>
> srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
> which fails with exit code 128 when git encounters these unregistered
> nested git repositories, halting the bitbake parse phase.
Is this true? The documentation for `git add` [1] talks about issuing a
warning when this occurs, not an error, and in some quick local testing
I get a successful exit (exit code 0) when I try this.
> Fix by scanning for nested git repos before the add. If any are found,
> exclude them from the top-level 'git add' using pathspec magic
> ':(exclude)<path>' and hash each nested repo independently using a
> temporary index. This ensures changes in any nested repo still trigger
> do_compile/do_configure to re-run.
>
> Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
> ---
> meta/classes/externalsrc.bbclass | 37 +++++++++++++++++++++++++++++++-
> 1 file changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
> index 902ff2604f..0dd57af668 100644
> --- a/meta/classes/externalsrc.bbclass
> +++ b/meta/classes/externalsrc.bbclass
> @@ -234,8 +234,43 @@ def srctree_hash_files(d, srcdir=None):
> # Update our custom index
> env = os.environ.copy()
> env['GIT_INDEX_FILE'] = tmp_index.name
> - subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
> + # Find nested git repos created by multiple SRC_URI git entries with
> + # different destsuffix values. git add -A . exits 128 when it encounters
> + # these unregistered nested repos.
> + nested_git_dirs = []
> + for root, dirs, files in os.walk(s_dir):
> + if root == s_dir:
> + continue
> + if '.git' in dirs or '.git' in files:
> + nested_git_dirs.append(root)
> + dirs[:] = [] # don't recurse into nested repos
This os.walk() loop is expensive, is there an alternative way to handle
this?
The code has also become difficult to parse. My rule of thumb is that if
a group of lines needs a leading comment, it also needs an empty line
before the comment to visually separate things.
> + if nested_git_dirs:
> + excludes = [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
> + subprocess.check_output(['git', 'add', '-A', '.'] + excludes, cwd=s_dir, env=env)
> + else:
> + subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
To simplify the code, construct a cmd variable and call
subprocess.check_output(cmd, ...) once.
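
A minimal sketch of that simplification, written as a standalone helper
for illustration only. The function name build_git_add_cmd is
hypothetical and is not part of the patch:

```python
import os

def build_git_add_cmd(s_dir, nested_git_dirs):
    """Return the 'git add' argv, excluding any nested repos.

    Hypothetical helper: name and signature are assumptions made for
    this sketch, not existing OE-core code.
    """
    cmd = ['git', 'add', '-A', '.']
    # ':(exclude)<path>' is git's exclude pathspec magic; the paths are
    # made relative to s_dir, which is the cwd of the git call.
    cmd += [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
    return cmd

# In srctree_hash_files() the if/else would then collapse to one call:
#   subprocess.check_output(build_git_add_cmd(s_dir, nested_git_dirs),
#                           cwd=s_dir, env=env)
```

With an empty nested_git_dirs list the command is unchanged from the
current code, so no separate branch is needed.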
> git_sha1 = subprocess.check_output(['git', 'write-tree'], cwd=s_dir, env=env).decode("utf-8")
> + # Hash each nested git repo separately so source changes there still
> + # trigger do_compile/do_configure to re-run.
> + for nested in nested_git_dirs:
> + nested_git = os.path.join(nested, '.git')
> + if not os.path.isdir(nested_git):
> + continue
> + with tempfile.NamedTemporaryFile(prefix='oe-devtool-nested-index') as nested_tmp:
> + nested_index = os.path.join(nested_git, 'index')
> + if os.path.exists(nested_index):
> + shutil.copyfile(nested_index, nested_tmp.name)
> + nested_env = os.environ.copy()
> + nested_env['GIT_INDEX_FILE'] = nested_tmp.name
> + proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=nested,
> + env=nested_env, stdout=subprocess.DEVNULL,
> + stderr=subprocess.DEVNULL)
> + proc.communicate()
> + proc = subprocess.Popen(['git', 'write-tree'], cwd=nested,
> + env=nested_env, stdout=subprocess.PIPE,
> + stderr=subprocess.DEVNULL)
> + stdout, _ = proc.communicate()
> + git_sha1 += stdout.decode("utf-8")
We should re-use the code from the following block which handles
submodules instead of re-implementing the behaviour. Perhaps the common
code needs to be refactored out.
Best regards,
--
Paul Barker
* Re: [PATCH v1 2/2] reproducible: Handle nested git repos in find_git_repositories
2026-05-15 9:36 ` [PATCH v1 2/2] reproducible: Handle nested git repos in find_git_repositories Jamin Lin
@ 2026-05-18 19:22 ` Paul Barker
0 siblings, 0 replies; 11+ messages in thread
From: Paul Barker @ 2026-05-18 19:22 UTC
To: Jamin Lin, openembedded-core@lists.openembedded.org; +Cc: Troy Lee, Vince Chang
On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> When EXTERNALSRC contains multiple nested git repositories (from
> multiple SRC_URI git entries with different destsuffix values),
> find_git_repositories() walks into sub-repos and
> get_source_date_epoch_from_git() subsequently fails with exit code 128
> when running 'git log -1' inside them.
>
> Two fixes:
> - Stop os.walk recursion when a .git entry is found (dirs[:] = []) to
> avoid descending into nested repos.
> - Change 'git log -1' from check=True to check=False with explicit
> error handling, so a failing nested repo is skipped gracefully
> instead of raising CalledProcessError and aborting do_unpack.
>
> Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
> ---
> meta/lib/oe/reproducible.py | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/meta/lib/oe/reproducible.py b/meta/lib/oe/reproducible.py
> index a80376010a..6bb25da55a 100644
> --- a/meta/lib/oe/reproducible.py
> +++ b/meta/lib/oe/reproducible.py
> @@ -82,6 +82,7 @@ def find_git_repositories(d, sourcedir):
> for root, dirs, files in os.walk(mainpath, topdown=True):
> if '.git' in dirs or '.git' in files:
> git_repositories.append(root)
> + dirs[:] = [] # don't recurse into nested git repos (multiple SRC_URI destsuffix)
>
> if not git_repositories:
> bb.warn('Failed to find any git repositories in UNPACKDIR or S')
> @@ -105,7 +106,10 @@ def get_source_date_epoch_from_git(d, sourcedir):
>
> bb.debug(1, "git repository: %s" % repo_path)
> p = subprocess.run(['git', '-C', repo_path, 'log', '-1', '--pretty=%ct'],
> - check=True, stdout=subprocess.PIPE)
> + check=False, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
> + if p.returncode != 0:
> + bb.debug(1, "git log failed for %s (exit %d): %s" % (repo_path, p.returncode, p.stdout.decode('utf-8')))
> + continue
> source_dates.append(int(p.stdout.decode('utf-8')))
This causes the repository to be silently (in most cases, since
bb.debug() output is not usually printed) dropped from the source_dates
array. Does this invalidate the returned source date epoch?
We may want a warning here instead of a debug print.
Also note that there may be other reasons for the `git log` command to
fail, and the output that identifies why it failed is likely to be in
stderr rather than stdout so we don't want to throw that away.
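
A minimal sketch of how both points might be handled, assuming a
hypothetical helper named commit_time_or_none that receives the command
and a warning callback (bb.warn in the real code). It is an
illustration, not part of the patch:

```python
import subprocess
import sys

def commit_time_or_none(cmd, warn):
    """Run cmd, which should print a Unix timestamp; return it as an int.

    On failure, report stderr through warn() and return None.
    Hypothetical helper: name and signature are assumptions.
    """
    p = subprocess.run(cmd, check=False,
                       stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if p.returncode != 0:
        # Keep stderr separate from stdout so the failure reason is
        # reported and never mixed into the value being parsed.
        err = p.stderr.decode('utf-8', errors='replace').strip()
        warn("%s failed (exit %d): %s" % (" ".join(cmd), p.returncode, err))
        return None
    return int(p.stdout.decode('utf-8').strip())

# Possible call site in get_source_date_epoch_from_git():
#   ts = commit_time_or_none(
#       ['git', '-C', repo_path, 'log', '-1', '--pretty=%ct'], bb.warn)
#   if ts is not None:
#       source_dates.append(ts)
```

Whether a skipped repository should invalidate the resulting source
date epoch is left open here; the sketch only makes the skip visible.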
Best regards,
--
Paul Barker
* RE: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple SRC_URI entries
2026-05-18 19:16 ` Paul Barker
@ 2026-05-19 6:28 ` Jamin Lin
2026-05-19 10:04 ` [OE-core] " Paul Barker
2026-05-19 7:07 ` Jamin Lin
1 sibling, 1 reply; 11+ messages in thread
From: Jamin Lin @ 2026-05-19 6:28 UTC
To: Paul Barker, openembedded-core@lists.openembedded.org
Cc: Troy Lee, Vince Chang
Hi Paul,
> Subject: Re: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple
> SRC_URI entries
>
> On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> > When a recipe uses multiple git SRC_URI entries with different
> > destsuffix values (e.g. Zephyr-based recipes with separate repos for
> > the kernel, modules, and application), each source is unpacked into a
> > subdirectory of EXTERNALSRC that retains its own .git directory.
> >
> > srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
> > which fails with exit code 128 when git encounters these unregistered
> > nested git repositories, halting the bitbake parse phase.
>
> Is this true? The documentation for `git add` [1] talks about issuing a warning
> when this occurs, not an error, and in some quick local testing I get a successful
> exit (exit code 0) when I try this.
>
Please try the "zephyr-helloworld" package.
This recipe uses multiple Git repositories in "SRC_URI", and I can reproduce the issue with this package:
https://git.yoctoproject.org/meta-zephyr/tree/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-kernel-src-4.3.0.inc
Steps to reproduce:
1. git clone https://github.com/openembedded/bitbake.git
2. git clone https://github.com/openembedded/openembedded-core.git
3. git clone https://git.yoctoproject.org/meta-zephyr
4. git clone https://github.com/openembedded/meta-openembedded.git
5. source openembedded-core/oe-init-build-env build
6. Edit build/conf/bblayers.conf and add the following layers:
* meta-openembedded/meta-python
* meta-zephyr/meta-zephyr-core
* meta-openembedded/meta-oe
7. devtool modify zephyr-helloworld
8. bitbake -c cleanall zephyr-helloworld
9. bitbake zephyr-helloworld
After that, the following parser error occurs, and the package can no longer be built:
```text
jamin_lin@aspeed-fw02:~/oe-review/build$ bitbake zephyr-helloworld
Loading cache: 100% |#########################################################################################################################################################################################################| Time: 0:00:01
Loaded 4949 entries from dependency cache.
ERROR: ExpansionError during parsing /home/jamin_lin/oe-review/meta-zephyr/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-helloworld.bb
bb.data_smart.ExpansionError: Failure expanding variable do_compile[file-checksums], expression was ${@srctree_hash_files(d)} which triggered exception CalledProcessError: Command '['git', 'add', '-A', '.']' returned non-zero exit status 128.
The variable dependency chain for the failure is: do_compile[file-checksums]
ERROR: Parsing halted due to errors, see error messages above
Parsing recipes: 0% | | ETA: --:--:--
Summary: There were 2 ERROR messages, returning a non-zero exit code.
```
I can consistently reproduce this issue with the steps above.
Thanks,
Jamin
> > Fix by scanning for nested git repos before the add. If any are found,
> > exclude them from the top-level 'git add' using pathspec magic
> > ':(exclude)<path>' and hash each nested repo independently using a
> > temporary index. This ensures changes in any nested repo still trigger
> > do_compile/do_configure to re-run.
> >
> > Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
> > ---
> > meta/classes/externalsrc.bbclass | 37
> > +++++++++++++++++++++++++++++++-
> > 1 file changed, 36 insertions(+), 1 deletion(-)
> >
> > diff --git a/meta/classes/externalsrc.bbclass
> > b/meta/classes/externalsrc.bbclass
> > index 902ff2604f..0dd57af668 100644
> > --- a/meta/classes/externalsrc.bbclass
> > +++ b/meta/classes/externalsrc.bbclass
> > @@ -234,8 +234,43 @@ def srctree_hash_files(d, srcdir=None):
> > # Update our custom index
> > env = os.environ.copy()
> > env['GIT_INDEX_FILE'] = tmp_index.name
> > - subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir,
> env=env)
> > + # Find nested git repos created by multiple SRC_URI git
> entries with
> > + # different destsuffix values. git add -A . exits 128 when it
> encounters
> > + # these unregistered nested repos.
> > + nested_git_dirs = []
> > + for root, dirs, files in os.walk(s_dir):
> > + if root == s_dir:
> > + continue
> > + if '.git' in dirs or '.git' in files:
> > + nested_git_dirs.append(root)
> > + dirs[:] = [] # don't recurse into nested repos
>
> This os.walk() loop is expensive, is there an alternative way to handle this?
>
> The code has also become difficult to parse. My rule of thumb is that if a group
> of lines needs a leading comment, it also needs an empty line before the
> comment to visually separate things.
>
> > + if nested_git_dirs:
> > + excludes = [':(exclude)' + os.path.relpath(n, s_dir) for n
> in nested_git_dirs]
> > + subprocess.check_output(['git', 'add', '-A', '.'] + excludes,
> cwd=s_dir, env=env)
> > + else:
> > + subprocess.check_output(['git', 'add', '-A', '.'],
> > + cwd=s_dir, env=env)
>
> To simplify the code, construct a cmd variable and call
> subprocess.check_output(cmd, ...) once.
>
> > git_sha1 = subprocess.check_output(['git', 'write-tree'],
> > cwd=s_dir, env=env).decode("utf-8")
> > + # Hash each nested git repo separately so source changes
> there still
> > + # trigger do_compile/do_configure to re-run.
> > + for nested in nested_git_dirs:
> > + nested_git = os.path.join(nested, '.git')
> > + if not os.path.isdir(nested_git):
> > + continue
> > + with
> tempfile.NamedTemporaryFile(prefix='oe-devtool-nested-index') as
> nested_tmp:
> > + nested_index = os.path.join(nested_git, 'index')
> > + if os.path.exists(nested_index):
> > + shutil.copyfile(nested_index,
> nested_tmp.name)
> > + nested_env = os.environ.copy()
> > + nested_env['GIT_INDEX_FILE'] = nested_tmp.name
> > + proc = subprocess.Popen(['git', 'add', '-A', '.'],
> cwd=nested,
> > + env=nested_env,
> stdout=subprocess.DEVNULL,
> > +
> stderr=subprocess.DEVNULL)
> > + proc.communicate()
> > + proc = subprocess.Popen(['git', 'write-tree'],
> cwd=nested,
> > + env=nested_env,
> stdout=subprocess.PIPE,
> > +
> stderr=subprocess.DEVNULL)
> > + stdout, _ = proc.communicate()
> > + git_sha1 += stdout.decode("utf-8")
>
> We should re-use the code from the following block which handles submodules
> instead of re-implementing the behaviour. Perhaps the common code needs to
> be refactored out.
>
> Best regards,
>
> --
> Paul Barker
* RE: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple SRC_URI entries
2026-05-18 19:16 ` Paul Barker
2026-05-19 6:28 ` Jamin Lin
@ 2026-05-19 7:07 ` Jamin Lin
2026-05-19 9:09 ` Jamin Lin
1 sibling, 1 reply; 11+ messages in thread
From: Jamin Lin @ 2026-05-19 7:07 UTC
To: Paul Barker, openembedded-core@lists.openembedded.org
Cc: Troy Lee, Vince Chang
Hi Paul,
> Subject: Re: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple
> SRC_URI entries
>
> On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> > When a recipe uses multiple git SRC_URI entries with different
> > destsuffix values (e.g. Zephyr-based recipes with separate repos for
> > the kernel, modules, and application), each source is unpacked into a
> > subdirectory of EXTERNALSRC that retains its own .git directory.
> >
> > srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
> > which fails with exit code 128 when git encounters these unregistered
> > nested git repositories, halting the bitbake parse phase.
>
> Is this true? The documentation for `git add` [1] talks about issuing a warning
> when this occurs, not an error, and in some quick local testing I get a successful
> exit (exit code 0) when I try this.
>
> > Fix by scanning for nested git repos before the add. If any are found,
> > exclude them from the top-level 'git add' using pathspec magic
> > ':(exclude)<path>' and hash each nested repo independently using a
> > temporary index. This ensures changes in any nested repo still trigger
> > do_compile/do_configure to re-run.
> >
> > Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
> > ---
> > meta/classes/externalsrc.bbclass | 37
> > +++++++++++++++++++++++++++++++-
> > 1 file changed, 36 insertions(+), 1 deletion(-)
> >
> > diff --git a/meta/classes/externalsrc.bbclass
> > b/meta/classes/externalsrc.bbclass
> > index 902ff2604f..0dd57af668 100644
> > --- a/meta/classes/externalsrc.bbclass
> > +++ b/meta/classes/externalsrc.bbclass
> > @@ -234,8 +234,43 @@ def srctree_hash_files(d, srcdir=None):
> > # Update our custom index
> > env = os.environ.copy()
> > env['GIT_INDEX_FILE'] = tmp_index.name
> > - subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir,
> env=env)
> > + # Find nested git repos created by multiple SRC_URI git
> entries with
> > + # different destsuffix values. git add -A . exits 128 when it
> encounters
> > + # these unregistered nested repos.
> > + nested_git_dirs = []
> > + for root, dirs, files in os.walk(s_dir):
> > + if root == s_dir:
> > + continue
> > + if '.git' in dirs or '.git' in files:
> > + nested_git_dirs.append(root)
> > + dirs[:] = [] # don't recurse into nested repos
>
> This os.walk() loop is expensive, is there an alternative way to handle this?
>
os.scandir() was considered but rejected: destsuffix allows arbitrarily
deep paths (e.g. destsuffix=${P}/modules/tee/tf-m/trusted-firmware-m), so
a depth-1 scan would silently miss nested repos and stop hashing their
content. Source changes there would then no longer trigger do_compile to
re-run.
The os.walk() loop uses dirs[:] = [] to stop recursing as soon as a
.git entry is found, so we never descend into the nested repos
themselves (which may contain tens of thousands of files). The walk only
traverses the shallow skeleton of intermediate directories between
EXTERNALSRC and each nested repo.
Please see the use case here:
https://git.yoctoproject.org/meta-zephyr/tree/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-kernel-src-4.3.0.inc
${SRC_URI_ZEPHYR_OPEN_AMP};name=open-amp;nobranch=1;destsuffix=${P}/modules/lib/open-amp \
${SRC_URI_ZEPHYR_TRUSTED_FIRMWARE_M};name=trusted-firmware-m;nobranch=1;destsuffix=${P}/modules/tee/tf-m/trusted-firmware-m \
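
The pruning behaviour described above can be sketched in isolation as
follows. This is a minimal illustration rather than the patch itself:
the helper name find_nested_git_dirs is hypothetical, and skipping the
top-level .git directory is an extra assumption that v1 does not make.

```python
import os

def find_nested_git_dirs(s_dir):
    """Return the directories below s_dir that contain a .git entry."""
    nested = []
    for root, dirs, files in os.walk(s_dir, topdown=True):
        if root == s_dir:
            # Assumption (not in the v1 patch): do not walk the
            # top-level repository's own .git directory.
            dirs[:] = [d for d in dirs if d != '.git']
            continue
        if '.git' in dirs or '.git' in files:
            nested.append(root)
            # With topdown=True, emptying dirs in place stops os.walk
            # from descending any further below this directory.
            dirs[:] = []
    return nested
```

Only the intermediate directories (for example modules/ and
modules/tee/) are listed; the contents of each nested repo are never
visited.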
Thanks,
Jamin
> The code has also become difficult to parse. My rule of thumb is that if a group
> of lines needs a leading comment, it also needs an empty line before the
> comment to visually separate things.
>
> > + if nested_git_dirs:
> > + excludes = [':(exclude)' + os.path.relpath(n, s_dir) for n
> in nested_git_dirs]
> > + subprocess.check_output(['git', 'add', '-A', '.'] + excludes,
> cwd=s_dir, env=env)
> > + else:
> > + subprocess.check_output(['git', 'add', '-A', '.'],
> > + cwd=s_dir, env=env)
>
> To simplify the code, construct a cmd variable and call
> subprocess.check_output(cmd, ...) once.
>
> > git_sha1 = subprocess.check_output(['git', 'write-tree'],
> > cwd=s_dir, env=env).decode("utf-8")
> > + # Hash each nested git repo separately so source changes
> there still
> > + # trigger do_compile/do_configure to re-run.
> > + for nested in nested_git_dirs:
> > + nested_git = os.path.join(nested, '.git')
> > + if not os.path.isdir(nested_git):
> > + continue
> > + with
> tempfile.NamedTemporaryFile(prefix='oe-devtool-nested-index') as
> nested_tmp:
> > + nested_index = os.path.join(nested_git, 'index')
> > + if os.path.exists(nested_index):
> > + shutil.copyfile(nested_index,
> nested_tmp.name)
> > + nested_env = os.environ.copy()
> > + nested_env['GIT_INDEX_FILE'] = nested_tmp.name
> > + proc = subprocess.Popen(['git', 'add', '-A', '.'],
> cwd=nested,
> > + env=nested_env,
> stdout=subprocess.DEVNULL,
> > +
> stderr=subprocess.DEVNULL)
> > + proc.communicate()
> > + proc = subprocess.Popen(['git', 'write-tree'],
> cwd=nested,
> > + env=nested_env,
> stdout=subprocess.PIPE,
> > +
> stderr=subprocess.DEVNULL)
> > + stdout, _ = proc.communicate()
> > + git_sha1 += stdout.decode("utf-8")
>
> We should re-use the code from the following block which handles submodules
> instead of re-implementing the behaviour. Perhaps the common code needs to
> be refactored out.
>
> Best regards,
>
> --
> Paul Barker
* RE: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple SRC_URI entries
2026-05-19 7:07 ` Jamin Lin
@ 2026-05-19 9:09 ` Jamin Lin
0 siblings, 0 replies; 11+ messages in thread
From: Jamin Lin @ 2026-05-19 9:09 UTC
To: Paul Barker, openembedded-core@lists.openembedded.org
Cc: Troy Lee, Vince Chang
Hi Paul,
> Subject: RE: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple
> SRC_URI entries
>
> Hi Paul,
>
> > Subject: Re: [PATCH v1 1/2] externalsrc: Handle nested git repos from
> > multiple SRC_URI entries
> >
> > On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> > > When a recipe uses multiple git SRC_URI entries with different
> > > destsuffix values (e.g. Zephyr-based recipes with separate repos for
> > > the kernel, modules, and application), each source is unpacked into
> > > a subdirectory of EXTERNALSRC that retains its own .git directory.
> > >
> > > srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
> > > which fails with exit code 128 when git encounters these
> > > unregistered nested git repositories, halting the bitbake parse phase.
> >
> > Is this true? The documentation for `git add` [1] talks about issuing
> > a warning when this occurs, not an error, and in some quick local
> > testing I get a successful exit (exit code 0) when I try this.
> >
> > > Fix by scanning for nested git repos before the add. If any are
> > > found, exclude them from the top-level 'git add' using pathspec
> > > magic ':(exclude)<path>' and hash each nested repo independently
> > > using a temporary index. This ensures changes in any nested repo
> > > still trigger do_compile/do_configure to re-run.
> > >
> > > Signed-off-by: Jamin Lin <jamin_lin@aspeedtech.com>
> > > ---
> > > meta/classes/externalsrc.bbclass | 37
> > > +++++++++++++++++++++++++++++++-
> > > 1 file changed, 36 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/meta/classes/externalsrc.bbclass
> > > b/meta/classes/externalsrc.bbclass
> > > index 902ff2604f..0dd57af668 100644
> > > --- a/meta/classes/externalsrc.bbclass
> > > +++ b/meta/classes/externalsrc.bbclass
> > > @@ -234,8 +234,43 @@ def srctree_hash_files(d, srcdir=None):
> > > # Update our custom index
> > > env = os.environ.copy()
> > > env['GIT_INDEX_FILE'] = tmp_index.name
> > > - subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir,
> > env=env)
> > > + # Find nested git repos created by multiple SRC_URI git
> > entries with
> > > + # different destsuffix values. git add -A . exits 128
> > > + when it
> > encounters
> > > + # these unregistered nested repos.
> > > + nested_git_dirs = []
> > > + for root, dirs, files in os.walk(s_dir):
> > > + if root == s_dir:
> > > + continue
> > > + if '.git' in dirs or '.git' in files:
> > > + nested_git_dirs.append(root)
> > > + dirs[:] = [] # don't recurse into nested repos
> >
> > This os.walk() loop is expensive, is there an alternative way to handle this?
> >
> os.scandir() was considered but rejected: destsuffix allows arbitrarily deep
> paths (e.g. destsuffix=${P}/modules/tee/tf-m/trusted-firmware-m), so a
> depth-1 scan would silently miss nested repos and stop hashing their content —
> source changes there would no longer trigger do_compile to re-run.
>
> The os.walk() loop uses dirs[:] = [] to stop recursing as soon as a .git entry is
> found, so we never descend into the nested repos themselves (which may
> contain tens of thousands of files). The walk only traverses the shallow skeleton
> of intermediate directories between EXTERNALSRC and each nested repo.
>
> Please see the use case here:
> https://git.yoctoproject.org/meta-zephyr/tree/meta-zephyr-core/recipes-kernel
> /zephyr-kernel/zephyr-kernel-src-4.3.0.inc
>
> ${SRC_URI_ZEPHYR_OPEN_AMP};name=open-amp;nobranch=1;destsuffix=${P}
> /modules/lib/open-amp \
> ${SRC_URI_ZEPHYR_TRUSTED_FIRMWARE_M};name=trusted-firmware-m;nob
> ranch=1;destsuffix=${P}/modules/tee/tf-m/trusted-firmware-m \
>
> Thanks-Jamin
>
> > The code has also become difficult to parse. My rule of thumb is that
> > if a group of lines needs a leading comment, it also needs an empty
> > line before the comment to visually separate things.
> >
> > > + if nested_git_dirs:
> > > + excludes = [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
> > > + subprocess.check_output(['git', 'add', '-A', '.'] + excludes, cwd=s_dir, env=env)
> > > + else:
> > > + subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
> >
> > To simplify the code, construct a cmd variable and call
> > subprocess.check_output(cmd, ...) once.
> >
> > > git_sha1 = subprocess.check_output(['git', 'write-tree'], cwd=s_dir, env=env).decode("utf-8")
> > > + # Hash each nested git repo separately so source changes there still
> > > + # trigger do_compile/do_configure to re-run.
> > > + for nested in nested_git_dirs:
> > > + nested_git = os.path.join(nested, '.git')
> > > + if not os.path.isdir(nested_git):
> > > + continue
> > > + with tempfile.NamedTemporaryFile(prefix='oe-devtool-nested-index') as nested_tmp:
> > > + nested_index = os.path.join(nested_git, 'index')
> > > + if os.path.exists(nested_index):
> > > + shutil.copyfile(nested_index, nested_tmp.name)
> > > + nested_env = os.environ.copy()
> > > + nested_env['GIT_INDEX_FILE'] = nested_tmp.name
> > > + proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=nested,
> > > + env=nested_env, stdout=subprocess.DEVNULL,
> > > + stderr=subprocess.DEVNULL)
> > > + proc.communicate()
> > > + proc = subprocess.Popen(['git', 'write-tree'], cwd=nested,
> > > + env=nested_env, stdout=subprocess.PIPE,
> > > + stderr=subprocess.DEVNULL)
> > > + stdout, _ = proc.communicate()
> > > + git_sha1 += stdout.decode("utf-8")
> >
> > We should re-use the code from the following block which handles
> > submodules instead of re-implementing the behaviour. Perhaps the
> > common code needs to be refactored out.
> >
> > Best regards,
> >
> > --
> > Paul Barker
Thanks for the review and suggestions.
I will send v2 as shown below.
```
diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
index 902ff2604f..f2e0812eea 100644
--- a/meta/classes/externalsrc.bbclass
+++ b/meta/classes/externalsrc.bbclass
@@ -234,18 +234,47 @@ def srctree_hash_files(d, srcdir=None):
             # Update our custom index
             env = os.environ.copy()
             env['GIT_INDEX_FILE'] = tmp_index.name
-            subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
+
+            # Find nested git repos created by multiple SRC_URI git entries with
+            # different destsuffix values. When GIT_INDEX_FILE is set, git add -A .
+            # exits 128 instead of warning when it encounters these unregistered
+            # nested repos, halting the bitbake parse phase.
+            nested_git_dirs = []
+            for root, dirs, files in os.walk(s_dir):
+                if root == s_dir:
+                    continue
+                if '.git' in dirs or '.git' in files:
+                    nested_git_dirs.append(root)
+                    dirs[:] = []
+
+            cmd = ['git', 'add', '-A', '.']
+            if nested_git_dirs:
+                cmd += [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
+            subprocess.check_output(cmd, cwd=s_dir, env=env)
+
             git_sha1 = subprocess.check_output(['git', 'write-tree'], cwd=s_dir, env=env).decode("utf-8")
-            if os.path.exists(os.path.join(s_dir, ".gitmodules")) and os.path.getsize(os.path.join(s_dir, ".gitmodules")) > 0:
+
+            # Hash nested git repos and submodules together so changes in any of
+            # them still trigger do_compile/do_configure to re-run.
+            subdirs_to_hash = list(nested_git_dirs)
+            if os.path.exists(os.path.join(s_dir, ".gitmodules")) and \
+                    os.path.getsize(os.path.join(s_dir, ".gitmodules")) > 0:
                 submodule_helper = subprocess.check_output(["git", "config", "--file", ".gitmodules", "--get-regexp", "path"], cwd=s_dir, env=env).decode("utf-8")
                 for line in submodule_helper.splitlines():
                     module_dir = os.path.join(s_dir, line.rsplit(maxsplit=1)[1])
                     if os.path.isdir(module_dir):
-                        proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=module_dir, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
-                        proc.communicate()
-                        proc = subprocess.Popen(['git', 'write-tree'], cwd=module_dir, env=env, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
-                        stdout, _ = proc.communicate()
-                        git_sha1 += stdout.decode("utf-8")
+                        subdirs_to_hash.append(module_dir)
+
+            for subdir in subdirs_to_hash:
+                proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=subdir,
+                                        env=env, stdout=subprocess.DEVNULL,
+                                        stderr=subprocess.DEVNULL)
+                proc.communicate()
+                proc = subprocess.Popen(['git', 'write-tree'], cwd=subdir,
+                                        env=env, stdout=subprocess.PIPE,
+                                        stderr=subprocess.DEVNULL)
+                stdout, _ = proc.communicate()
+                git_sha1 += stdout.decode("utf-8")
             sha1 = hashlib.sha1(git_sha1.encode("utf-8")).hexdigest()
         with open(oe_hash_file, 'w') as fobj:
             fobj.write(sha1)
```
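For readers who want to try the discovery step outside of bitbake, the pruned walk and the single `git add` command from the diff above can be sketched as a standalone script. This is only an illustration: the function names `find_nested_git_dirs` and `build_add_command` are hypothetical, and two details are assumptions that are not in the proposed patch (the top-level `.git` directory is skipped so the walk does not descend into it, and the result is sorted so the command is deterministic).

```python
import os


def find_nested_git_dirs(top):
    """Return directories below `top` that contain a .git entry.

    The walk is pruned at every hit, so the contents of a nested
    repository are never traversed.
    """
    nested = []
    for root, dirs, files in os.walk(top):
        if root == top:
            # Assumption (not in the patch): skip the top-level .git
            # directory so its object store is not walked.
            dirs[:] = [d for d in dirs if d != '.git']
            continue
        # '.git' may be a directory (normal clone) or a file (gitlink).
        if '.git' in dirs or '.git' in files:
            nested.append(root)
            dirs[:] = []  # prune: do not descend into this repo
    # Assumption: sort so the resulting command line is stable.
    return sorted(nested)


def build_add_command(top, nested):
    """Build one 'git add' argv that excludes each nested repo."""
    cmd = ['git', 'add', '-A', '.']
    cmd += [':(exclude)' + os.path.relpath(n, top) for n in nested]
    return cmd
```

The returned list would be passed to `subprocess.check_output(cmd, cwd=top, env=env)` in the same way as in the diff; the sketch itself never invokes git.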
Thanks,
Jamin
* Re: [OE-core] [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple SRC_URI entries
2026-05-19 6:28 ` Jamin Lin
@ 2026-05-19 10:04 ` Paul Barker
2026-05-20 3:25 ` Jamin Lin
0 siblings, 1 reply; 11+ messages in thread
From: Paul Barker @ 2026-05-19 10:04 UTC (permalink / raw)
To: jamin_lin, openembedded-core@lists.openembedded.org; +Cc: Troy Lee, Vince Chang
On Tue, 2026-05-19 at 06:28 +0000, Jamin Lin via lists.openembedded.org
wrote:
> Hi Paul,
>
> > Subject: Re: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple
> > SRC_URI entries
> >
> > On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> > > When a recipe uses multiple git SRC_URI entries with different
> > > destsuffix values (e.g. Zephyr-based recipes with separate repos for
> > > the kernel, modules, and application), each source is unpacked into a
> > > subdirectory of EXTERNALSRC that retains its own .git directory.
> > >
> > > srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
> > > which fails with exit code 128 when git encounters these unregistered
> > > nested git repositories, halting the bitbake parse phase.
> >
> > Is this true? The documentation for `git add` [1] talks about issuing a warning
> > when this occurs, not an error, and in some quick local testing I get a successful
> > exit (exit code 0) when I try this.
> >
>
> Please try the "zephyr-helloworld" package.
>
> This recipe uses multiple Git repositories in "SRC_URI", and I can reproduce the issue with this package:
>
> https://git.yoctoproject.org/meta-zephyr/tree/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-kernel-src-4.3.0.inc
>
> Steps to reproduce:
>
> 1. git clone https://github.com/openembedded/bitbake.git
> 2. git clone https://github.com/openembedded/openembedded-core.git
> 3. git clone https://git.yoctoproject.org/meta-zephyr
> 4. git clone https://github.com/openembedded/meta-openembedded.git
> 5. source openembedded-core/oe-init-build-env build
> 6. Edit build/conf/bblayers.conf and add the following layers:
>
> * meta-openembedded/meta-python
> * meta-zephyr/meta-zephyr-core
> * meta-openembedded/meta-oe
> 7. devtool modify zephyr-helloworld
> 8. bitbake -c cleanall zephyr-helloworld
> 9. bitbake zephyr-helloworld
>
> After that, the following parser error occurs, and the package can no longer be built:
>
> ```text
> jamin_lin@aspeed-fw02:~/oe-review/build$ bitbake zephyr-helloworld
> Loading cache: 100% |#########################################################################################################################################################################################################| Time: 0:00:01
> Loaded 4949 entries from dependency cache.
> ERROR: ExpansionError during parsing /home/jamin_lin/oe-review/meta-zephyr/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-helloworld.bb
> bb.data_smart.ExpansionError: Failure expanding variable do_compile[file-checksums], expression was ${@srctree_hash_files(d)} which triggered exception CalledProcessError: Command '['git', 'add', '-A', '.']' returned non-zero exit status 128.
> The variable dependency chain for the failure is: do_compile[file-checksums]
>
> ERROR: Parsing halted due to errors, see error messages above
> Parsing recipes: 0% | | ETA: --:--:--
> Summary: There were 2 ERROR messages, returning a non-zero exit code.
> ```
Following these steps I see the same error. But the cause is not the
nested git repositories. `bitbake -c cleanall ...` has deleted the
sources from the downloads directory, but the devtool workspace is still
referencing the deleted sources. Even a `git status` command in the
zephyr-helloworld sources directory fails:
$ git -C workspace/sources/zephyr-helloworld status
error: unable to normalize alternate object path: /home/pbarker/bitbake-builds/.bitbake-setup-downloads/git2/github.com.zephyrproject-rtos.mcuboot//objects
error: unable to normalize alternate object path: /home/pbarker/bitbake-builds/.bitbake-setup-downloads/git2/github.com.zephyrproject-rtos.mcuboot//objects
error: unable to normalize alternate object path: /home/pbarker/bitbake-builds/.bitbake-setup-downloads/git2/github.com.zephyrproject-rtos.mcuboot//objects
error: unable to normalize alternate object path: /home/pbarker/bitbake-builds/.bitbake-setup-downloads/git2/github.com.zephyrproject-rtos.mcuboot//objects
fatal: bad object HEAD
fatal: 'git status --porcelain=2' failed in submodule bootloader/mcuboot
If I replace step 8 with `bitbake -c cleansstate ...` then there is no
parse error.
What we need to fix is the interaction between `devtool modify` and
`bitbake -c cleanall`. Perhaps devtool needs to detect this broken state
and tell the user how to fix it - simply running `bitbake -c fetch
zephyr-helloworld` isn't possible as the metadata can't be parsed.
Best regards,
--
Paul Barker
* RE: [OE-core] [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple SRC_URI entries
2026-05-19 10:04 ` [OE-core] " Paul Barker
@ 2026-05-20 3:25 ` Jamin Lin
2026-05-20 5:24 ` Jamin Lin
0 siblings, 1 reply; 11+ messages in thread
From: Jamin Lin @ 2026-05-20 3:25 UTC (permalink / raw)
To: Paul Barker, openembedded-core@lists.openembedded.org
Cc: Troy Lee, Vince Chang
Hi Paul,
> Subject: Re: [OE-core] [PATCH v1 1/2] externalsrc: Handle nested git repos from
> multiple SRC_URI entries
>
> On Tue, 2026-05-19 at 06:28 +0000, Jamin Lin via lists.openembedded.org
> wrote:
> > Hi Paul,
> >
> > > Subject: Re: [PATCH v1 1/2] externalsrc: Handle nested git repos
> > > from multiple SRC_URI entries
> > >
> > > On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> > > > When a recipe uses multiple git SRC_URI entries with different
> > > > destsuffix values (e.g. Zephyr-based recipes with separate repos
> > > > for the kernel, modules, and application), each source is unpacked
> > > > into a subdirectory of EXTERNALSRC that retains its own .git directory.
> > > >
> > > > srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
> > > > which fails with exit code 128 when git encounters these
> > > > unregistered nested git repositories, halting the bitbake parse phase.
> > >
> > > Is this true? The documentation for `git add` [1] talks about
> > > issuing a warning when this occurs, not an error, and in some quick
> > > local testing I get a successful exit (exit code 0) when I try this.
> > >
> >
> > Please try the "zephyr-helloworld" package.
> >
> > This recipe uses multiple Git repositories in "SRC_URI", and I can reproduce
> > the issue with this package:
> >
> > https://git.yoctoproject.org/meta-zephyr/tree/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-kernel-src-4.3.0.inc
> >
> > Steps to reproduce:
> >
> > 1. git clone https://github.com/openembedded/bitbake.git
> > 2. git clone https://github.com/openembedded/openembedded-core.git
> > 3. git clone https://git.yoctoproject.org/meta-zephyr
> > 4. git clone https://github.com/openembedded/meta-openembedded.git
> > 5. source openembedded-core/oe-init-build-env build
> > 6. Edit build/conf/bblayers.conf and add the following layers:
> >
> > * meta-openembedded/meta-python
> > * meta-zephyr/meta-zephyr-core
> > * meta-openembedded/meta-oe
> > 7. devtool modify zephyr-helloworld
> > 8. bitbake -c cleanall zephyr-helloworld
> > 9. bitbake zephyr-helloworld
> >
> > After that, the following parser error occurs, and the package can no longer
> > be built:
> >
> > ```text
> > jamin_lin@aspeed-fw02:~/oe-review/build$ bitbake zephyr-helloworld
> > Loading cache: 100% |#########################################################################################################################################################################################################| Time: 0:00:01
> > Loaded 4949 entries from dependency cache.
> > ERROR: ExpansionError during parsing /home/jamin_lin/oe-review/meta-zephyr/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-helloworld.bb
> > bb.data_smart.ExpansionError: Failure expanding variable do_compile[file-checksums], expression was ${@srctree_hash_files(d)} which triggered exception CalledProcessError: Command '['git', 'add', '-A', '.']' returned non-zero exit status 128.
> > The variable dependency chain for the failure is: do_compile[file-checksums]
> >
> > ERROR: Parsing halted due to errors, see error messages above
> > Parsing recipes: 0% | | ETA: --:--:--
> > Summary: There were 2 ERROR messages, returning a non-zero exit code.
> > ```
>
> Following these steps I see the same error. But the cause is not the nested git
> repositories. `bitbake -c cleanall ...` has deleted the sources from the
> downloads directory, but the devtool workspace is still referencing the deleted
> sources. Even a `git status` command in the zephyr-helloworld sources
> directory fails:
>
> $ git -C workspace/sources/zephyr-helloworld status
> error: unable to normalize alternate object path: /home/pbarker/bitbake-builds/.bitbake-setup-downloads/git2/github.com.zephyrproject-rtos.mcuboot//objects
> error: unable to normalize alternate object path: /home/pbarker/bitbake-builds/.bitbake-setup-downloads/git2/github.com.zephyrproject-rtos.mcuboot//objects
> error: unable to normalize alternate object path: /home/pbarker/bitbake-builds/.bitbake-setup-downloads/git2/github.com.zephyrproject-rtos.mcuboot//objects
> error: unable to normalize alternate object path: /home/pbarker/bitbake-builds/.bitbake-setup-downloads/git2/github.com.zephyrproject-rtos.mcuboot//objects
> fatal: bad object HEAD
> fatal: 'git status --porcelain=2' failed in submodule bootloader/mcuboot
>
> If I replace step 8 with `bitbake -c cleansstate ...` then there is no parse error.
>
> What we need to fix is the interaction between `devtool modify` and `bitbake
> -c cleanall`. Perhaps devtool needs to detect this broken state and tell the user
> how to fix it - simply running `bitbake -c fetch zephyr-helloworld` isn't possible
> as the metadata can't be parsed.
>
> Best regards,
>
> --
> Paul Barker
Hi Paul,
Thanks for identifying the root cause.
After studying the devtool design, I now understand what happens: when devtool modify creates the workspace, it does not copy the full git object store locally.
Instead, it clones the source tree using git's alternate object store mechanism - the workspace's .git/objects/info/alternates file points back to the downloaded bare repositories under the downloads/ directory.
This way the workspace shares the git objects with the download cache rather than duplicating them on disk.
When bitbake -c cleanall is run, it deletes the downloaded bare repositories from downloads/. The workspace source tree still exists, but its alternates now point to missing paths.
Any git operation on the workspace (including git status, git add, etc.) then fails because git cannot resolve the object references, producing errors like:
error: unable to normalize alternate object path: .../downloads/git2/.../objects
fatal: bad object HEAD
This is what causes the parser to halt with exit 128.
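As a rough illustration of the mechanism described above, the alternates file can be read directly to see which shared object stores have disappeared. This is a minimal sketch, assuming the usual layout in which `objects/info/alternates` lists one object directory per line and relative entries are resolved against the repository's `objects` directory; the helper name `missing_alternates` is hypothetical and is not part of devtool or bitbake.

```python
import os


def missing_alternates(dot_git):
    """Return the alternate object directories of a repo that no longer exist.

    `dot_git` is the path to the repository's .git directory.
    """
    obj_dir = os.path.join(dot_git, 'objects')
    alt_file = os.path.join(obj_dir, 'info', 'alternates')
    if not os.path.isfile(alt_file):
        return []  # no alternates configured, nothing can be broken

    missing = []
    with open(alt_file) as f:
        for line in f:
            path = line.strip()
            if not path or path.startswith('#'):
                continue  # skip blank lines and comments
            if not os.path.isabs(path):
                # Relative entries are taken relative to the objects directory.
                path = os.path.join(obj_dir, path)
            # Collapse doubled slashes such as ".../mcuboot//objects".
            path = os.path.normpath(path)
            if not os.path.isdir(path):
                missing.append(path)
    return missing
```

Calling `missing_alternates('workspace/sources/zephyr-helloworld/.git')` on an affected workspace would be expected to list the deleted download directories, whereas an empty list means every alternate is still present.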
The fix on my side is straightforward: replace bitbake -c cleanall with bitbake -c cleansstate in my devtool workflow. cleansstate clears the sstate cache without touching the downloads directory, so the workspace alternates remain intact.
That said, I am not sure whether some improvement on the OE side would still be worthwhile. Currently, when this breakage occurs, the user sees only a cryptic CalledProcessError: exit 128 with no indication of what went wrong or how to recover.
The situation is also not obvious: running bitbake -c cleanall on a single-repo recipe with a devtool workspace does not trigger this problem, so users working with multi-repo recipes (such as Zephyr-based recipes that fetch multiple git repositories into separate destsuffix directories) are likely to hit this unexpectedly. A clearer error message pointing to the root cause and the recovery steps (devtool reset -n followed by devtool modify) might save other users from the same confusion.
Please drop this patch series because it is not a normal test case for devtool.
Below is a patch that may help improve the user experience, though I leave it to the community to decide whether it is worth reviewing:
Best regards,
Jamin
diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
index 902ff2604f..3e6c9d937e 100644
--- a/meta/classes/externalsrc.bbclass
+++ b/meta/classes/externalsrc.bbclass
@@ -227,6 +227,57 @@ def srctree_hash_files(d, srcdir=None):
     ret = " "
     if git_dir is not None:
+        # Find nested git repos created by multiple SRC_URI git entries with
+        # different destsuffix values, so their alternates can also be checked
+        # for breakage below.
+        nested_git_dirs = []
+        for root, dirs, files in os.walk(s_dir):
+            if root == s_dir:
+                continue
+            if '.git' in dirs or '.git' in files:
+                nested_git_dirs.append(root)
+                dirs[:] = []
+
+        # Check for broken git alternates in the top-level repo and all nested
+        # repos. This can happen when 'bitbake -c cleanall' deletes the
+        # downloads directory that the devtool workspace references via
+        # alternates, causing all subsequent git operations to exit 128 and
+        # halt the parse phase. Relative alternate paths are resolved against
+        # the repo's objects directory; double-slash paths (a known devtool
+        # artefact) are normalised before the existence check.
+        def has_broken_alternates(dot_git):
+            obj_dir = os.path.join(dot_git, 'objects')
+            alt_file = os.path.join(obj_dir, 'info', 'alternates')
+            if not os.path.exists(alt_file):
+                return False
+            with open(alt_file) as f:
+                for line in f:
+                    path = line.strip()
+                    if not path or path.startswith('#'):
+                        continue
+                    if not os.path.isabs(path):
+                        path = os.path.join(obj_dir, path)
+                    if not os.path.exists(os.path.normpath(path)):
+                        return True
+            return False
+
+        broken = has_broken_alternates(git_dir)
+        if not broken:
+            for nested in nested_git_dirs:
+                nested_dot_git = os.path.join(nested, '.git')
+                if os.path.isdir(nested_dot_git) and has_broken_alternates(nested_dot_git):
+                    broken = True
+                    break
+
+        if broken:
+            bb.warn('%s: devtool workspace has broken git alternates, '
+                    'likely caused by "bitbake -c cleanall" removing the '
+                    'downloads directory. '
+                    'To repair the workspace, run: '
+                    'devtool reset %s && devtool modify %s'
+                    % (d.getVar('PN'), d.getVar('PN'), d.getVar('PN')))
+            return s_dir + '/*:True'
+
* RE: [OE-core] [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple SRC_URI entries
2026-05-20 3:25 ` Jamin Lin
@ 2026-05-20 5:24 ` Jamin Lin
0 siblings, 0 replies; 11+ messages in thread
From: Jamin Lin @ 2026-05-20 5:24 UTC (permalink / raw)
To: Paul Barker, openembedded-core@lists.openembedded.org
Cc: Troy Lee, Vince Chang
Hi Paul,
> Hi Paul,
>
> Thanks for identifying the root cause.
>
> After studying the devtool design, I now understand what happens: when
> devtool modify creates the workspace, it does not copy the full git object store
> locally.
> Instead, it clones the source tree using git's alternate object store mechanism -
> the workspace's .git/objects/info/alternates file points back to the downloaded
> bare repositories under the downloads/ directory.
> This way the workspace shares the git objects with the download cache rather
> than duplicating them on disk.
>
> When bitbake -c cleanall is run, it deletes the downloaded bare repositories
> from downloads/. The workspace source tree still exists, but its alternates now
> point to missing paths.
> Any git operation on the workspace (including git status, git add, etc.) then
> fails because git cannot resolve the object references, producing errors like:
>
> error: unable to normalize alternate object path: .../downloads/git2/.../objects
> fatal: bad object HEAD
>
> This is what causes the parser to halt with exit 128.
>
> The fix on my side is straightforward: replace bitbake -c cleanall with bitbake
> -c cleansstate in my devtool workflow. cleansstate clears the sstate cache
> without touching the downloads directory, so the workspace alternates remain
> intact.
>
> That said, I am not sure whether some improvement on the OE side would still
> be worthwhile. Currently, when this breakage occurs, the user sees only a
> cryptic CalledProcessError: exit 128 with no indication of what went wrong or
> how to recover.
> The situation is also not obvious: running bitbake -c cleanall on a single-repo
> recipe with a devtool workspace does not trigger this problem, so users
> working with multi-repo recipes (such as Zephyr-based recipes that fetch
> multiple git repositories into separate destsuffix directories) are likely to hit
> this unexpectedly. A clearer error message pointing to the root cause and the
> recovery steps (devtool reset -n followed by devtool modify) might save other
> users from the same confusion.
>
> Please drop this patch series because it is not a normal test case for devtool.
>
> Below patch that may help improve the user experience, though I leave it to
> the community to decide whether they are worth reviewing:
>
> Best regards,
> Jamin
>
> diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
> index 902ff2604f..3e6c9d937e 100644
> --- a/meta/classes/externalsrc.bbclass
> +++ b/meta/classes/externalsrc.bbclass
> @@ -227,6 +227,57 @@ def srctree_hash_files(d, srcdir=None):
>
> ret = " "
> if git_dir is not None:
> + # Find nested git repos created by multiple SRC_URI git entries with
> + # different destsuffix values, so their alternates can also be checked
> + # for breakage below.
> + nested_git_dirs = []
> + for root, dirs, files in os.walk(s_dir):
> + if root == s_dir:
> + continue
> + if '.git' in dirs or '.git' in files:
> + nested_git_dirs.append(root)
> + dirs[:] = []
> +
> + # Check for broken git alternates in the top-level repo and all nested
> + # repos. This can happen when 'bitbake -c cleanall' deletes the
> + # downloads directory that the devtool workspace references via
> + # alternates, causing all subsequent git operations to exit 128 and
> + # halt the parse phase. Relative alternate paths are resolved against
> + # the repo's objects directory; double-slash paths (a known devtool
> + # artefact) are normalised before the existence check.
> + def has_broken_alternates(dot_git):
> + obj_dir = os.path.join(dot_git, 'objects')
> + alt_file = os.path.join(obj_dir, 'info', 'alternates')
> + if not os.path.exists(alt_file):
> + return False
> + with open(alt_file) as f:
> + for line in f:
> + path = line.strip()
> + if not path or path.startswith('#'):
> + continue
> + if not os.path.isabs(path):
> + path = os.path.join(obj_dir, path)
> + if not os.path.exists(os.path.normpath(path)):
> + return True
> + return False
> +
> + broken = has_broken_alternates(git_dir)
> + if not broken:
> + for nested in nested_git_dirs:
> + nested_dot_git = os.path.join(nested, '.git')
> + if os.path.isdir(nested_dot_git) and has_broken_alternates(nested_dot_git):
> + broken = True
> + break
> +
> + if broken:
> + bb.warn('%s: devtool workspace has broken git alternates, '
> + 'likely caused by "bitbake -c cleanall" removing the '
> + 'downloads directory. '
> + 'To repair the workspace, run: '
> + 'devtool reset %s && devtool modify %s'
> + % (d.getVar('PN'), d.getVar('PN'), d.getVar('PN')))
> + return s_dir + '/*:True'
> +
>
Please ignore the comments above.
I believe I have identified the root cause. Please see my updated solution here:
https://patchwork.yoctoproject.org/project/oe-core/patch/20260520051800.1951624-1-jamin_lin@aspeedtech.com/
Thanks,
Jamin
end of thread, other threads:[~2026-05-20 5:24 UTC | newest]
Thread overview: 11+ messages
2026-05-15 9:36 [PATCH v1 0/2] oe: Fix build failures with multiple git SRC_URI entries Jamin Lin
2026-05-15 9:36 ` [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple " Jamin Lin
2026-05-18 19:16 ` Paul Barker
2026-05-19 6:28 ` Jamin Lin
2026-05-19 10:04 ` [OE-core] " Paul Barker
2026-05-20 3:25 ` Jamin Lin
2026-05-20 5:24 ` Jamin Lin
2026-05-19 7:07 ` Jamin Lin
2026-05-19 9:09 ` Jamin Lin
2026-05-15 9:36 ` [PATCH v1 2/2] reproducible: Handle nested git repos in find_git_repositories Jamin Lin
2026-05-18 19:22 ` Paul Barker