* [scarthgap][PATCH 0/3] Fix git lfs submodule expansion
@ 2026-03-09 21:21 Michael Siebold
2026-03-09 21:21 ` [PATCH 1/3] bitbake: gitsm: Add clean function Michael Siebold
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Michael Siebold @ 2026-03-09 21:21 UTC (permalink / raw)
To: Yoann Congal; +Cc: bitbake-devel, Michael Siebold
These commits are required to avoid trouble when a submodule includes
large files via git lfs.
Testing: Verified via bitbake with manual inspection of artifacts
Philip Lorenz (2):
bitbake: fetch2: Fix incorrect lfs parametrization for submodules
bitbake: fetch2: Fix LFS object checkout in submodules
Robert Yang (1):
bitbake: gitsm: Add clean function
bitbake/lib/bb/fetch2/gitsm.py | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 6+ messages in thread

* [PATCH 1/3] bitbake: gitsm: Add clean function
2026-03-09 21:21 [scarthgap][PATCH 0/3] Fix git lfs submodule expansion Michael Siebold
@ 2026-03-09 21:21 ` Michael Siebold
2026-03-09 21:21 ` [PATCH 2/3] bitbake: fetch2: Fix incorrect lfs parametrization for submodules Michael Siebold
2026-03-09 21:21 ` [PATCH 3/3] bitbake: fetch2: Fix LFS object checkout in submodules Michael Siebold
2 siblings, 0 replies; 6+ messages in thread
From: Michael Siebold @ 2026-03-09 21:21 UTC (permalink / raw)
To: Yoann Congal; +Cc: bitbake-devel, Robert Yang, Richard Purdie, Michael Siebold

From: Robert Yang <liezhi.yang@windriver.com>

Fixed:
$ bitbake utfcpp -cfetch && bitbake utfcpp -ccleanall

The downloads/git2/github.com.nemtrif.ftest won't be cleaned without this fix.

(Bitbake rev: 79f25fc5c1b8d0e08540f4aa07875309f5325f47)

Upstream-Status: Backport [from commit 5bce38fbae]

Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit 5bce38fbaea0a7d1228740e2cb313957c914cfdf)
Signed-off-by: Michael Siebold <michael.siebold@gmail.com>
---
 bitbake/lib/bb/fetch2/gitsm.py | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/bitbake/lib/bb/fetch2/gitsm.py b/bitbake/lib/bb/fetch2/gitsm.py
index fab4b1164c..ba62517f08 100644
--- a/bitbake/lib/bb/fetch2/gitsm.py
+++ b/bitbake/lib/bb/fetch2/gitsm.py
@@ -249,6 +249,19 @@ class GitSM(Git):
             # should also be skipped as these files were already smudged in the fetch stage if lfs
             # was enabled.
             runfetchcmd("GIT_LFS_SKIP_SMUDGE=1 %s submodule update --recursive --no-fetch" % (ud.basecmd), d, quiet=True, workdir=ud.destdir)
+    def clean(self, ud, d):
+        def clean_submodule(ud, url, module, modpath, workdir, d):
+            url += ";bareclone=1;nobranch=1"
+            try:
+                newfetch = Fetch([url], d, cache=False)
+                newfetch.clean()
+            except Exception as e:
+                logger.warning('gitsm: submodule clean failed: %s %s' % (type(e).__name__, str(e)))
+        self.call_process_submodules(ud, d, True, clean_submodule)
+
+        # Clean top git dir
+        Git.clean(self, ud, d)
 
     def implicit_urldata(self, ud, d):
         import shutil, subprocess, tempfile
-- 
2.34.1

^ permalink raw reply related [flat|nested] 6+ messages in thread
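The control flow of the clean() added in this patch (clean each submodule, log and continue on failure, then clean the top-level repository) can be sketched outside BitBake as follows. This is only an illustration: clean_one and clean_top are hypothetical callables standing in for Fetch([url], d, cache=False).clean() and Git.clean(self, ud, d), and the list of failed URLs is returned purely so the behaviour can be observed.

```python
import logging

logger = logging.getLogger("gitsm-sketch")


def clean_all(submodule_urls, clean_one, clean_top):
    """Clean every submodule, then the top-level repository.

    clean_one and clean_top are hypothetical stand-ins, not BitBake APIs.
    Returns the URLs whose clean step raised an exception.
    """
    failed = []
    for url in submodule_urls:
        # Same URL parameters the patch appends before cleaning a submodule.
        url += ";bareclone=1;nobranch=1"
        try:
            clean_one(url)
        except Exception as e:
            # A failing submodule is logged rather than fatal, so the
            # remaining submodules and the top-level repo are still cleaned.
            logger.warning("submodule clean failed: %s %s", type(e).__name__, e)
            failed.append(url)
    # Clean the top-level git directory last.
    clean_top()
    return failed
```

Cleaning the top-level repository last matches the order in the patch, where the submodule walk happens before Git.clean() is called.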
* [PATCH 2/3] bitbake: fetch2: Fix incorrect lfs parametrization for submodules
2026-03-09 21:21 [scarthgap][PATCH 0/3] Fix git lfs submodule expansion Michael Siebold
2026-03-09 21:21 ` [PATCH 1/3] bitbake: gitsm: Add clean function Michael Siebold
@ 2026-03-09 21:21 ` Michael Siebold
2026-03-09 21:21 ` [PATCH 3/3] bitbake: fetch2: Fix LFS object checkout in submodules Michael Siebold
2 siblings, 0 replies; 6+ messages in thread
From: Michael Siebold @ 2026-03-09 21:21 UTC (permalink / raw)
To: Yoann Congal
Cc: bitbake-devel, Philip Lorenz, Richard Purdie, Michael Siebold

From: Philip Lorenz <philip.lorenz@bmw.de>

The existing code would pass `True` or `False` to the git fetcher. As
the fetcher expects `lfs` to be set to `1` this always led to LFS
fetching being disabled.

(Bitbake rev: 5e487a5a096400271ed1e29b0df72903f2304e49)

Upstream-Status: Backport [from commit eb6d89e9e6]

Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit eb6d89e9e6f27b683da6f2ba2227707a965a0094)
Signed-off-by: Michael Siebold <michael.siebold@gmail.com>
---
 bitbake/lib/bb/fetch2/gitsm.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bitbake/lib/bb/fetch2/gitsm.py b/bitbake/lib/bb/fetch2/gitsm.py
index ba62517f08..5c98991480 100644
--- a/bitbake/lib/bb/fetch2/gitsm.py
+++ b/bitbake/lib/bb/fetch2/gitsm.py
@@ -123,7 +123,7 @@ class GitSM(Git):
             url += ";name=%s" % module
             url += ";subpath=%s" % module
             url += ";nobranch=1"
-            url += ";lfs=%s" % self._need_lfs(ud)
+            url += ";lfs=%s" % ("1" if self._need_lfs(ud) else "0")
             # Note that adding "user=" here to give credentials to the
             # submodule is not supported. Since using SRC_URI to give git://
             # URL a password is not supported, one have to use one of the
-- 
2.34.1

^ permalink raw reply related [flat|nested] 6+ messages in thread
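The failure mode described in this commit message can be reproduced in isolation: formatting a Python bool with %s yields the text "True" or "False", which never equals "1". The sketch below assumes, as the commit message states, that the fetcher only treats lfs=1 as enabled; lfs_enabled and submodule_url are hypothetical helpers written for this illustration and are not BitBake functions.

```python
def lfs_enabled(url):
    """Hypothetical parser: LFS counts as enabled only when lfs is exactly "1"."""
    params = dict(p.split("=", 1) for p in url.split(";")[1:])
    return params.get("lfs") == "1"


def submodule_url(base, need_lfs, fixed):
    """Build a submodule URL the old way (fixed=False) or the patched way."""
    # Old code formatted the bool directly; the patch maps it to "1" or "0".
    value = ("1" if need_lfs else "0") if fixed else need_lfs
    return base + ";lfs=%s" % value


old = submodule_url("gitsm://example.com/repo.git", True, fixed=False)
new = submodule_url("gitsm://example.com/repo.git", True, fixed=True)
print(old, lfs_enabled(old))
print(new, lfs_enabled(new))
```

With need_lfs set to True, the old form produces ";lfs=True", which the parser treats as disabled, while the patched form produces ";lfs=1".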
* [PATCH 3/3] bitbake: fetch2: Fix LFS object checkout in submodules
2026-03-09 21:21 [scarthgap][PATCH 0/3] Fix git lfs submodule expansion Michael Siebold
2026-03-09 21:21 ` [PATCH 1/3] bitbake: gitsm: Add clean function Michael Siebold
2026-03-09 21:21 ` [PATCH 2/3] bitbake: fetch2: Fix incorrect lfs parametrization for submodules Michael Siebold
@ 2026-03-09 21:21 ` Michael Siebold
2026-03-09 21:32 ` Richard Purdie
2 siblings, 1 reply; 6+ messages in thread
From: Michael Siebold @ 2026-03-09 21:21 UTC (permalink / raw)
To: Yoann Congal
Cc: bitbake-devel, Philip Lorenz, Richard Purdie, Michael Siebold

From: Philip Lorenz <philip.lorenz@bmw.de>

Skipping smudging prevents the LFS objects from replacing their
placeholder files when `git submodule update` actually checks out the
target revision in the submodule. Smudging cannot happen earlier as the
clone stored in `.git/modules` is bare.

This should be fine as long as all LFS objects are available in the
download cache (which they are after the other fixes are applied).

(Bitbake rev: d270e33a07c50bb9c08861cf9a6dc51e1fd2d874)

Upstream-Status: Backport [from commit 3eeac69385]

Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit 3eeac69385e8f29a08d022a17b28b5d504deed66)
Signed-off-by: Michael Siebold <michael.siebold@gmail.com>
---
 bitbake/lib/bb/fetch2/gitsm.py | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/bitbake/lib/bb/fetch2/gitsm.py b/bitbake/lib/bb/fetch2/gitsm.py
index 5c98991480..ef19053330 100644
--- a/bitbake/lib/bb/fetch2/gitsm.py
+++ b/bitbake/lib/bb/fetch2/gitsm.py
@@ -243,12 +243,11 @@ class GitSM(Git):
         ret = self.process_submodules(ud, ud.destdir, unpack_submodules, d)
 
         if not ud.bareclone and ret:
-            # All submodules should already be downloaded and configured in the tree. This simply
-            # sets up the configuration and checks out the files. The main project config should
-            # remain unmodified, and no download from the internet should occur. As such, lfs smudge
-            # should also be skipped as these files were already smudged in the fetch stage if lfs
-            # was enabled.
-            runfetchcmd("GIT_LFS_SKIP_SMUDGE=1 %s submodule update --recursive --no-fetch" % (ud.basecmd), d, quiet=True, workdir=ud.destdir)
+            cmdprefix = ""
+            # Avoid LFS smudging (replacing the LFS pointers with the actual content) when LFS shouldn't be used but git-lfs is installed.
+            if not self._need_lfs(ud):
+                cmdprefix = "GIT_LFS_SKIP_SMUDGE=1 "
+            runfetchcmd("%s%s submodule update --recursive --no-fetch" % (cmdprefix, ud.basecmd), d, quiet=True, workdir=ud.destdir)
     def clean(self, ud, d):
         def clean_submodule(ud, url, module, modpath, workdir, d):
             url += ";bareclone=1;nobranch=1"
-- 
2.34.1

^ permalink raw reply related [flat|nested] 6+ messages in thread
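The effect of this hunk on the command line can be shown with a small standalone sketch. submodule_update_cmd is a hypothetical helper, not a BitBake function; it only mirrors the string formatting in the patch, and need_lfs stands in for the result of self._need_lfs(ud).

```python
def submodule_update_cmd(basecmd, need_lfs):
    """Return the shell command string the patched code would run."""
    cmdprefix = ""
    # Skip smudging only when LFS is not wanted, so that an installed
    # git-lfs does not replace pointer files with content in that case.
    if not need_lfs:
        cmdprefix = "GIT_LFS_SKIP_SMUDGE=1 "
    return "%s%s submodule update --recursive --no-fetch" % (cmdprefix, basecmd)


print(submodule_update_cmd("git", need_lfs=True))
print(submodule_update_cmd("git", need_lfs=False))
```

Before the patch the GIT_LFS_SKIP_SMUDGE=1 prefix was unconditional, which is why submodule checkouts kept their LFS placeholder files even when LFS was requested.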
* Re: [PATCH 3/3] bitbake: fetch2: Fix LFS object checkout in submodules
2026-03-09 21:21 ` [PATCH 3/3] bitbake: fetch2: Fix LFS object checkout in submodules Michael Siebold
@ 2026-03-09 21:32 ` Richard Purdie
2026-03-09 23:14 ` Michael Siebold
0 siblings, 1 reply; 6+ messages in thread
From: Richard Purdie @ 2026-03-09 21:32 UTC (permalink / raw)
To: Michael Siebold, Yoann Congal; +Cc: bitbake-devel, Philip Lorenz

On Mon, 2026-03-09 at 14:21 -0700, Michael Siebold wrote:
> From: Philip Lorenz <philip.lorenz@bmw.de>
>
> Skipping smudging prevents the LFS objects from replacing their
> placeholder files when `git submodule update` actually checks out the
> target revision in the submodule. Smudging cannot happen earlier as the
> clone stored in `.git/modules` is bare.
>
> This should be fine as long as all LFS objects are available in the
> download cache (which they are after the other fixes are applied).
>
> (Bitbake rev: d270e33a07c50bb9c08861cf9a6dc51e1fd2d874)
>
> Upstream-Status: Backport [from commit 3eeac69385]
>
> Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> (cherry picked from commit 3eeac69385e8f29a08d022a17b28b5d504deed66)
> Signed-off-by: Michael Siebold <michael.siebold@gmail.com>
> ---
>  bitbake/lib/bb/fetch2/gitsm.py | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/bitbake/lib/bb/fetch2/gitsm.py b/bitbake/lib/bb/fetch2/gitsm.py
> index 5c98991480..ef19053330 100644
> --- a/bitbake/lib/bb/fetch2/gitsm.py
> +++ b/bitbake/lib/bb/fetch2/gitsm.py
> @@ -243,12 +243,11 @@ class GitSM(Git):
>          ret = self.process_submodules(ud, ud.destdir, unpack_submodules, d)
>
>          if not ud.bareclone and ret:
> -            # All submodules should already be downloaded and configured in the tree. This simply
> -            # sets up the configuration and checks out the files. The main project config should
> -            # remain unmodified, and no download from the internet should occur. As such, lfs smudge
> -            # should also be skipped as these files were already smudged in the fetch stage if lfs
> -            # was enabled.
> -            runfetchcmd("GIT_LFS_SKIP_SMUDGE=1 %s submodule update --recursive --no-fetch" % (ud.basecmd), d, quiet=True, workdir=ud.destdir)
> +            cmdprefix = ""
> +            # Avoid LFS smudging (replacing the LFS pointers with the actual content) when LFS shouldn't be used but git-lfs is installed.
> +            if not self._need_lfs(ud):
> +                cmdprefix = "GIT_LFS_SKIP_SMUDGE=1 "
> +            runfetchcmd("%s%s submodule update --recursive --no-fetch" % (cmdprefix, ud.basecmd), d, quiet=True, workdir=ud.destdir)
>      def clean(self, ud, d):
>          def clean_submodule(ud, url, module, modpath, workdir, d):
>              url += ";bareclone=1;nobranch=1"

We've had a lot of churn on this code and it isn't something I use and
fully understand myself so I need to ask some questions to make sure we
get this right this time.

Is "git submodule update --recursive --no-fetch" going to access the
network?

If I understand correctly, you say it shouldn't as things should
already be in DL_DIR. What happens if they're not? Where are the large
files stored in DL_DIR?

From the older comments in the code, it sounds like the smudging was
meant to happen at do_fetch time and this is now being changed to
happen at do_unpack.

Put differently, the fetcher code needs to:

* ensure software manifests are correct and only specifically
referenced things are fetched, no random revisions or accesses outside
of what is listed
* ensure mirroring works correctly and all artefacts needed (including
lfs ones) can be handled by a mirror setting
* be reproducible, the same thing will always be fetched for a given
url/revision

I'd like to be certain this change allows for that and the smudging
doesn't bypass things.

Also, do we have tests covering this from bitbake-selftest?

Cheers,

Richard

^ permalink raw reply [flat|nested] 6+ messages in thread
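On the testing question, one check that a test (or the manual artifact inspection mentioned in the cover letter) could use is to scan an unpacked tree for files that are still Git LFS pointers, meaning they were never smudged. The sketch below is not taken from bitbake-selftest. It relies only on the Git LFS pointer format, whose first line is "version https://git-lfs.github.com/spec/v1"; the helper names are hypothetical.

```python
import os

# First line of every Git LFS pointer file, per the Git LFS pointer spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the LFS pointer signature."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unsmudged(root):
    """List paths (relative to root) that are still LFS pointer files."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git metadata directories; only checked-out files matter here.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and is_lfs_pointer(path):
                hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

An empty result for an unpacked source tree with lfs enabled would indicate that the submodule LFS objects were replaced with real content. It says nothing about whether the network was accessed during unpack, which would need a separate check.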
* Re: [PATCH 3/3] bitbake: fetch2: Fix LFS object checkout in submodules
2026-03-09 21:32 ` Richard Purdie
@ 2026-03-09 23:14 ` Michael Siebold
0 siblings, 0 replies; 6+ messages in thread
From: Michael Siebold @ 2026-03-09 23:14 UTC (permalink / raw)
To: Richard Purdie; +Cc: Yoann Congal, bitbake-devel, Philip Lorenz

[-- Attachment #1: Type: text/plain, Size: 4398 bytes --]

Hi Richard,

I apologize in advance, I'm far from being an expert in this area myself.
These patches are cherry-picking a fix from master into Scarthgap.

And I see that the [Scarthgap] tag made it into the cover letter but not in
the subsequent patches, sorry about that!

I thought about simply requesting this fix be backported into Scarthgap,
but my hope is this makes things easier.

Best,
Michael

On Mon, Mar 9, 2026 at 2:33 PM Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> On Mon, 2026-03-09 at 14:21 -0700, Michael Siebold wrote:
> > From: Philip Lorenz <philip.lorenz@bmw.de>
> >
> > Skipping smudging prevents the LFS objects from replacing their
> > placeholder files when `git submodule update` actually checks out the
> > target revision in the submodule. Smudging cannot happen earlier as the
> > clone stored in `.git/modules` is bare.
> >
> > This should be fine as long as all LFS objects are available in the
> > download cache (which they are after the other fixes are applied).
> >
> > (Bitbake rev: d270e33a07c50bb9c08861cf9a6dc51e1fd2d874)
> >
> > Upstream-Status: Backport [from commit 3eeac69385]
> >
> > Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
> > Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> > (cherry picked from commit 3eeac69385e8f29a08d022a17b28b5d504deed66)
> > Signed-off-by: Michael Siebold <michael.siebold@gmail.com>
> > ---
> >  bitbake/lib/bb/fetch2/gitsm.py | 11 +++++------
> >  1 file changed, 5 insertions(+), 6 deletions(-)
> >
> > diff --git a/bitbake/lib/bb/fetch2/gitsm.py
> b/bitbake/lib/bb/fetch2/gitsm.py
> > index 5c98991480..ef19053330 100644
> > --- a/bitbake/lib/bb/fetch2/gitsm.py
> > +++ b/bitbake/lib/bb/fetch2/gitsm.py
> > @@ -243,12 +243,11 @@ class GitSM(Git):
> >          ret = self.process_submodules(ud, ud.destdir,
> unpack_submodules, d)
> >
> >          if not ud.bareclone and ret:
> > -            # All submodules should already be downloaded and
> configured in the tree. This simply
> > -            # sets up the configuration and checks out the files. The
> main project config should
> > -            # remain unmodified, and no download from the internet
> should occur. As such, lfs smudge
> > -            # should also be skipped as these files were already
> smudged in the fetch stage if lfs
> > -            # was enabled.
> > -            runfetchcmd("GIT_LFS_SKIP_SMUDGE=1 %s submodule update
> --recursive --no-fetch" % (ud.basecmd), d, quiet=True, workdir=ud.destdir)
> > +            cmdprefix = ""
> > +            # Avoid LFS smudging (replacing the LFS pointers with the
> actual content) when LFS shouldn't be used but git-lfs is installed.
> > +            if not self._need_lfs(ud):
> > +                cmdprefix = "GIT_LFS_SKIP_SMUDGE=1 "
> > +            runfetchcmd("%s%s submodule update --recursive --no-fetch"
> % (cmdprefix, ud.basecmd), d, quiet=True, workdir=ud.destdir)
> >      def clean(self, ud, d):
> >          def clean_submodule(ud, url, module, modpath, workdir, d):
> >              url += ";bareclone=1;nobranch=1"
>
>
> We've had a lot of churn on this code and it isn't something I use and
> fully understand myself so I need to ask some questions to make sure we
> get this right this time.
>
> Is "git submodule update --recursive --no-fetch" going to access the
> network?
>
> If I understand correctly, you say it shouldn't as things should
> already be in DL_DIR. What happens if they're not? Where are the large
> files stored in DL_DIR?
>
> From the older comments in the code, it sounds like the smudging was
> meant to happen at do_fetch time and this is now being changed to
> happen at do_unpack.
>
> Put differently, the fetcher code needs to:
>
> * ensure software manifests are correct and only specifically
> referenced things are fetched, no random revisions or accesses outside
> of what is listed
> * ensure mirroring works correctly and all artefacts needed (including
> lfs ones) can be handled by a mirror setting
> * be reproducible, the same thing will always be fetched for a given
> url/revision
>
> I'd like to be certain this change allows for that and the smudging
> doesn't bypass things.
>
> Also, do we have tests covering this from bitbake-selftest?
>
> Cheers,
>
> Richard
>
>
>

[-- Attachment #2: Type: text/html, Size: 5583 bytes --]

^ permalink raw reply [flat|nested] 6+ messages in thread