* [PATCH 0/8] Fix gitsm LFS support
@ 2025-04-23 15:18 Philip Lorenz
2025-04-23 15:18 ` [PATCH 1/8] fetch2: Clean up no longer used name parameter Philip Lorenz
` (7 more replies)
0 siblings, 8 replies; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:18 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
This patch series fixes a number of issues we observed when using the
gitsm fetcher in conjunction with git-lfs.
I'd like to highlight two patches in this series:
* 0004-fetch2-Simplify-git-LFS-detection: I could not come up with a
good reason on why the code so far preferred the content of
`.gitattributes` on the repository's branch instead of always
considering the state of the target revision. Please let me know
if there's something I didn't take into account.
* 0005-fetch2-Use-git-lfs-fetch-to-download-objects.patch: I could not
replicate the `git-lfs fetch` behaviour in versions going back to
2.13.2 released in 2021. This leads me to believe that the issue
leading to this workaround is no longer in place. However, if anyone
remembers the actual issue / versions involved I can also retest to
confirm that there are no regression.
Philip Lorenz (8):
fetch2: Clean up no longer used name parameter
tests/fetch: Move commonly used imports to top
fetch2: Check for git-lfs existence before using it
fetch2: Simplify git LFS detection
fetch2: Use git-lfs fetch to download objects
fetch2: Fix incorrect lfs parametrization for submodules
fetch2: Fix LFS object checkout in submodules
tests/fetch: Test gitsm with LFS
lib/bb/fetch2/git.py | 69 +++++++-------------
lib/bb/fetch2/gitsm.py | 9 +--
lib/bb/tests/fetch.py | 139 ++++++++++++++++++++++++++---------------
3 files changed, 115 insertions(+), 102 deletions(-)
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/8] fetch2: Clean up no longer used name parameter
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
@ 2025-04-23 15:18 ` Philip Lorenz
2025-04-23 15:18 ` [PATCH 2/8] tests/fetch: Move commonly used imports to top Philip Lorenz
` (6 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:18 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
There's no need to pass `name` when it is no longer used.
Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
---
lib/bb/fetch2/git.py | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
index b47a53e3b..39c183927 100644
--- a/lib/bb/fetch2/git.py
+++ b/lib/bb/fetch2/git.py
@@ -327,7 +327,7 @@ class Git(FetchMethod):
if self.clonedir_need_update(ud, d):
return True
- if not self._lfs_objects_downloaded(ud, d, ud.name, ud.clonedir):
+ if not self._lfs_objects_downloaded(ud, d, ud.clonedir):
return True
return False
@@ -802,7 +802,7 @@ class Git(FetchMethod):
raise bb.fetch2.FetchError("The command '%s' gave output with more then 1 line unexpectedly, output: '%s'" % (cmd, output))
return output.split()[0] != "0"
- def _lfs_objects_downloaded(self, ud, d, name, wd):
+ def _lfs_objects_downloaded(self, ud, d, wd):
"""
Verifies whether the LFS objects for requested revisions have already been downloaded
"""
@@ -841,7 +841,7 @@ class Git(FetchMethod):
if ud.nobranch:
# If no branch is specified, use the current git commit
- refname = self._build_revision(ud, d, ud.name)
+ refname = ud.revision
elif wd == ud.clonedir:
# The bare clonedir doesn't use the remote names; it has the branch immediately.
refname = ud.branch
@@ -995,7 +995,7 @@ class Git(FetchMethod):
Return a sortable revision number by counting commits in the history
Based on gitpkgv.bblass in meta-openembedded
"""
- rev = self._build_revision(ud, d, name)
+ rev = ud.revision
localpath = ud.localpath
rev_file = os.path.join(localpath, "oe-gitpkgv_" + rev)
if not os.path.exists(localpath):
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 2/8] tests/fetch: Move commonly used imports to top
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
2025-04-23 15:18 ` [PATCH 1/8] fetch2: Clean up no longer used name parameter Philip Lorenz
@ 2025-04-23 15:18 ` Philip Lorenz
2025-04-23 15:18 ` [PATCH 3/8] fetch2: Check for git-lfs existence before using it Philip Lorenz
` (5 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:18 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
Avoid multiple import statements for anything that is used more than
once. Additionally, drop no longer used imports.
Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
---
lib/bb/tests/fetch.py | 19 ++-----------------
1 file changed, 2 insertions(+), 17 deletions(-)
diff --git a/lib/bb/tests/fetch.py b/lib/bb/tests/fetch.py
index 196d93c41..f0c628524 100644
--- a/lib/bb/tests/fetch.py
+++ b/lib/bb/tests/fetch.py
@@ -7,7 +7,9 @@
#
import contextlib
+import shutil
import unittest
+import urllib.parse
import hashlib
import tempfile
import collections
@@ -1275,7 +1277,6 @@ class FetcherNetworkTest(FetcherTest):
class SVNTest(FetcherTest):
def skipIfNoSvn():
- import shutil
if not shutil.which("svn"):
return unittest.skip("svn not installed, tests being skipped")
@@ -1398,8 +1399,6 @@ class TrustedNetworksTest(FetcherTest):
self.assertFalse(bb.fetch.trusted_network(self.d, url))
class URLHandle(unittest.TestCase):
- import urllib.parse
-
# Quote password as per RFC3986
password = urllib.parse.quote(r"!#$%^&*()-_={}[]\|:?,.<>~`", r"!$&'/()*+,;=")
datatable = {
@@ -1426,7 +1425,6 @@ class URLHandle(unittest.TestCase):
self.assertEqual(result, v)
def test_encodeurl(self):
- import urllib.parse
for k, v in self.datatable.items():
result = bb.fetch.encodeurl(v)
if result.startswith("file:"):
@@ -2271,7 +2269,6 @@ class GitShallowTest(FetcherTest):
class GitLfsTest(FetcherTest):
def skipIfNoGitLFS():
- import shutil
if not shutil.which('git-lfs'):
return unittest.skip('git-lfs not installed')
return lambda f: f
@@ -2391,8 +2388,6 @@ class GitLfsTest(FetcherTest):
@skipIfNoGitLFS()
def test_lfs_enabled(self):
- import shutil
-
uri = 'git://%s;protocol=file;lfs=1;branch=master' % self.srcdir
self.d.setVar('SRC_URI', uri)
@@ -2403,8 +2398,6 @@ class GitLfsTest(FetcherTest):
@skipIfNoGitLFS()
def test_lfs_disabled(self):
- import shutil
-
uri = 'git://%s;protocol=file;lfs=0;branch=master' % self.srcdir
self.d.setVar('SRC_URI', uri)
@@ -2414,8 +2407,6 @@ class GitLfsTest(FetcherTest):
fetcher.unpack(self.d.getVar('WORKDIR'))
def test_lfs_enabled_not_installed(self):
- import shutil
-
uri = 'git://%s;protocol=file;lfs=1;branch=master' % self.srcdir
self.d.setVar('SRC_URI', uri)
@@ -2436,8 +2427,6 @@ class GitLfsTest(FetcherTest):
ud.method._find_git_lfs = old_find_git_lfs
def test_lfs_disabled_not_installed(self):
- import shutil
-
uri = 'git://%s;protocol=file;lfs=0;branch=master' % self.srcdir
self.d.setVar('SRC_URI', uri)
@@ -2611,7 +2600,6 @@ class CrateTest(FetcherTest):
class NPMTest(FetcherTest):
def skipIfNoNpm():
- import shutil
if not shutil.which('npm'):
return unittest.skip('npm not installed')
return lambda f: f
@@ -3294,7 +3282,6 @@ class FetchPremirroronlyNetworkTest(FetcherTest):
self.d.setVar("PREMIRRORS", self.recipe_url + " " + "file://{}".format(self.mirrordir) + " \n")
def make_git_repo(self):
- import shutil
self.mirrorname = "git2_git.yoctoproject.org.fstests.tar.gz"
os.makedirs(self.clonedir)
self.git("clone --bare {}".format(self.recipe_url), self.clonedir)
@@ -3324,7 +3311,6 @@ class FetchPremirroronlyMercurialTest(FetcherTest):
the test covers also basic hg:// clone (see fetch_and_create_tarball
"""
def skipIfNoHg():
- import shutil
if not shutil.which('hg'):
return unittest.skip('Mercurial not installed')
return lambda f: f
@@ -3380,7 +3366,6 @@ class FetchPremirroronlyBrokenTarball(FetcherTest):
targz.write("This is not tar.gz file!")
def test_mirror_broken_download(self):
- import sys
self.d.setVar("SRCREV", "0"*40)
fetcher = bb.fetch.Fetch([self.recipe_url], self.d)
with self.assertRaises(bb.fetch2.FetchError), self.assertLogs() as logs:
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 3/8] fetch2: Check for git-lfs existence before using it
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
2025-04-23 15:18 ` [PATCH 1/8] fetch2: Clean up no longer used name parameter Philip Lorenz
2025-04-23 15:18 ` [PATCH 2/8] tests/fetch: Move commonly used imports to top Philip Lorenz
@ 2025-04-23 15:18 ` Philip Lorenz
2025-04-24 7:10 ` [bitbake-devel] " Mathieu Dubois-Briand
2025-04-23 15:18 ` [PATCH 4/8] fetch2: Simplify git LFS detection Philip Lorenz
` (4 subsequent siblings)
7 siblings, 1 reply; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:18 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
So far, existence of `git-lfs` was only checked during unpacking. As the
binary is also used in earlier steps also check for its existence there.
Additionally, factor out the LFS existence check into a dedicated
function and call it wherever git-lfs is used for the first time.
Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
---
lib/bb/fetch2/git.py | 26 ++++++++++++++++++--------
lib/bb/tests/fetch.py | 26 ++++++++++----------------
2 files changed, 28 insertions(+), 24 deletions(-)
diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
index 39c183927..b32b18797 100644
--- a/lib/bb/fetch2/git.py
+++ b/lib/bb/fetch2/git.py
@@ -324,6 +324,11 @@ class Git(FetchMethod):
return False
def lfs_need_update(self, ud, d):
+ if not self._need_lfs(ud):
+ return False
+
+ self._ensure_git_lfs(d, ud)
+
if self.clonedir_need_update(ud, d):
return True
@@ -507,7 +512,9 @@ class Git(FetchMethod):
def lfs_fetch(self, ud, d, clonedir, revision, fetchall=False, progresshandler=None):
"""Helper method for fetching Git LFS data"""
try:
- if self._need_lfs(ud) and self._contains_lfs(ud, d, clonedir) and self._find_git_lfs(d) and len(revision):
+ if self._need_lfs(ud) and self._contains_lfs(ud, d, clonedir) and len(revision):
+ self._ensure_git_lfs(d, ud)
+
# Using worktree with the revision because .lfsconfig may exists
worktree_add_cmd = "%s worktree add wt %s" % (ud.basecmd, revision)
runfetchcmd(worktree_add_cmd, d, log=progresshandler, workdir=clonedir)
@@ -740,11 +747,11 @@ class Git(FetchMethod):
runfetchcmd("%s remote set-url origin %s" % (ud.basecmd, shlex.quote(repourl)), d, workdir=destdir)
if self._contains_lfs(ud, d, destdir):
- if need_lfs and not self._find_git_lfs(d):
- raise bb.fetch2.FetchError("Repository %s has LFS content, install git-lfs on host to download (or set lfs=0 to ignore it)" % (repourl))
- elif not need_lfs:
+ if not need_lfs:
bb.note("Repository %s has LFS content but it is not being fetched" % (repourl))
else:
+ self._ensure_git_lfs(d, ud)
+
runfetchcmd("%s lfs install --local" % ud.basecmd, d, workdir=destdir)
if not ud.nocheckout:
@@ -807,7 +814,7 @@ class Git(FetchMethod):
Verifies whether the LFS objects for requested revisions have already been downloaded
"""
# Bail out early if this repository doesn't use LFS
- if not self._need_lfs(ud) or not self._contains_lfs(ud, d, wd):
+ if not self._contains_lfs(ud, d, wd):
return True
# The Git LFS specification specifies ([1]) the LFS folder layout so it should be safe to check for file
@@ -859,11 +866,14 @@ class Git(FetchMethod):
pass
return False
- def _find_git_lfs(self, d):
+ def _ensure_git_lfs(self, d, ud):
"""
- Return True if git-lfs can be found, False otherwise.
+ Ensures that git-lfs is available, raising a FetchError if it isn't.
"""
- return shutil.which("git-lfs", path=d.getVar('PATH')) is not None
+ if shutil.which("git-lfs", path=d.getVar('PATH')) is None:
+ raise bb.fetch2.FetchError(
+ "Repository %s has LFS content, install git-lfs on host to download (or set lfs=0 "
+ "to ignore it)" % self._get_repo_url(ud))
def _get_repo_url(self, ud):
"""
diff --git a/lib/bb/tests/fetch.py b/lib/bb/tests/fetch.py
index f0c628524..a936af6fb 100644
--- a/lib/bb/tests/fetch.py
+++ b/lib/bb/tests/fetch.py
@@ -9,6 +9,7 @@
import contextlib
import shutil
import unittest
+import unittest.mock
import urllib.parse
import hashlib
import tempfile
@@ -2413,18 +2414,17 @@ class GitLfsTest(FetcherTest):
# Careful: suppress initial attempt at downloading
fetcher, ud = self.fetch(uri=None, download=False)
- # Artificially assert that git-lfs is not installed, so
- # we can verify a failure to unpack in it's absence.
- old_find_git_lfs = ud.method._find_git_lfs
- try:
- # If git-lfs cannot be found, the unpack should throw an error
+ # If git-lfs cannot be found, the download should throw an error
+ with unittest.mock.patch("shutil.which", return_value=None):
with self.assertRaises(bb.fetch2.FetchError):
fetcher.download()
- ud.method._find_git_lfs = lambda d: False
+
+ fetcher.download()
+ # If git-lfs cannot be found, the unpack should throw an error
+ with self.assertRaises(bb.fetch2.FetchError):
+ with unittest.mock.patch("shutil.which", return_value=None):
shutil.rmtree(self.gitdir, ignore_errors=True)
fetcher.unpack(self.d.getVar('WORKDIR'))
- finally:
- ud.method._find_git_lfs = old_find_git_lfs
def test_lfs_disabled_not_installed(self):
uri = 'git://%s;protocol=file;lfs=0;branch=master' % self.srcdir
@@ -2433,17 +2433,11 @@ class GitLfsTest(FetcherTest):
# Careful: suppress initial attempt at downloading
fetcher, ud = self.fetch(uri=None, download=False)
- # Artificially assert that git-lfs is not installed, so
- # we can verify a failure to unpack in it's absence.
- old_find_git_lfs = ud.method._find_git_lfs
- try:
- # Even if git-lfs cannot be found, the unpack should be successful
+ # Even if git-lfs cannot be found, the download / unpack should be successful
+ with unittest.mock.patch("shutil.which", return_value=None):
fetcher.download()
- ud.method._find_git_lfs = lambda d: False
shutil.rmtree(self.gitdir, ignore_errors=True)
fetcher.unpack(self.d.getVar('WORKDIR'))
- finally:
- ud.method._find_git_lfs = old_find_git_lfs
class GitURLWithSpacesTest(FetcherTest):
test_git_urls = {
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 4/8] fetch2: Simplify git LFS detection
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
` (2 preceding siblings ...)
2025-04-23 15:18 ` [PATCH 3/8] fetch2: Check for git-lfs existence before using it Philip Lorenz
@ 2025-04-23 15:18 ` Philip Lorenz
2025-04-23 15:18 ` [PATCH 5/8] fetch2: Use git-lfs fetch to download objects Philip Lorenz
` (3 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:18 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
Its unclear why this function does not operate on the desired source
revision to begin with (which really should be the decider on whether a
particular source revision uses LFS or not). Simplify the decision logic
by always checking the `.gitattributes` file of the target revision.
Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
---
lib/bb/fetch2/git.py | 12 +-----------
1 file changed, 1 insertion(+), 11 deletions(-)
diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
index b32b18797..a1dd12bf7 100644
--- a/lib/bb/fetch2/git.py
+++ b/lib/bb/fetch2/git.py
@@ -845,18 +845,8 @@ class Git(FetchMethod):
"""
Check if the repository has 'lfs' (large file) content
"""
-
- if ud.nobranch:
- # If no branch is specified, use the current git commit
- refname = ud.revision
- elif wd == ud.clonedir:
- # The bare clonedir doesn't use the remote names; it has the branch immediately.
- refname = ud.branch
- else:
- refname = "origin/%s" % ud.branch
-
cmd = "%s grep lfs %s:.gitattributes | wc -l" % (
- ud.basecmd, refname)
+ ud.basecmd, ud.revision)
try:
output = runfetchcmd(cmd, d, quiet=True, workdir=wd)
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 5/8] fetch2: Use git-lfs fetch to download objects
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
` (3 preceding siblings ...)
2025-04-23 15:18 ` [PATCH 4/8] fetch2: Simplify git LFS detection Philip Lorenz
@ 2025-04-23 15:18 ` Philip Lorenz
2025-04-23 15:18 ` [PATCH 6/8] fetch2: Fix incorrect lfs parametrization for submodules Philip Lorenz
` (2 subsequent siblings)
7 siblings, 0 replies; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:18 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
Its not clear which exact git-lfs versions failed to work with bare
repositories, however git-lfs 2.13.2 which is shipped by Debian 10 (i.e.
the oldest supported distribution by scarthgap) shows no issue when
fetching into a bare repository. Switch to git-lfs fetch which in turn
eliminates issues seen when using the gitsm fetcher with submodules
utilizing lfs.
In these scenarios, fetching of LFS objects did not actually happen as
the gitsm fetcher parametrizes the to be fetched repositories with
`bareclone=1` which in turn means that the target revision was never
checked out (and therefore no LFS objects were fetched).
Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
---
lib/bb/fetch2/git.py | 25 +------------------------
1 file changed, 1 insertion(+), 24 deletions(-)
diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
index a1dd12bf7..341bbf0ed 100644
--- a/lib/bb/fetch2/git.py
+++ b/lib/bb/fetch2/git.py
@@ -484,30 +484,7 @@ class Git(FetchMethod):
raise bb.fetch2.FetchError("Unable to find revision %s even from upstream" % missing_rev)
if self.lfs_need_update(ud, d):
- # Unpack temporary working copy, use it to run 'git checkout' to force pre-fetching
- # of all LFS blobs needed at the srcrev.
- #
- # It would be nice to just do this inline here by running 'git-lfs fetch'
- # on the bare clonedir, but that operation requires a working copy on some
- # releases of Git LFS.
- with tempfile.TemporaryDirectory(dir=d.getVar('DL_DIR')) as tmpdir:
- # Do the checkout. This implicitly involves a Git LFS fetch.
- Git.unpack(self, ud, tmpdir, d)
-
- # Scoop up a copy of any stuff that Git LFS downloaded. Merge them into
- # the bare clonedir.
- #
- # As this procedure is invoked repeatedly on incremental fetches as
- # a recipe's SRCREV is bumped throughout its lifetime, this will
- # result in a gradual accumulation of LFS blobs in <ud.clonedir>/lfs
- # corresponding to all the blobs reachable from the different revs
- # fetched across time.
- #
- # Only do this if the unpack resulted in a .git/lfs directory being
- # created; this only happens if at least one blob needed to be
- # downloaded.
- if os.path.exists(os.path.join(ud.destdir, ".git", "lfs")):
- runfetchcmd("tar -cf - lfs | tar -xf - -C %s" % ud.clonedir, d, workdir="%s/.git" % ud.destdir)
+ self.lfs_fetch(ud, d, ud.clonedir, ud.revision)
def lfs_fetch(self, ud, d, clonedir, revision, fetchall=False, progresshandler=None):
"""Helper method for fetching Git LFS data"""
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 6/8] fetch2: Fix incorrect lfs parametrization for submodules
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
` (4 preceding siblings ...)
2025-04-23 15:18 ` [PATCH 5/8] fetch2: Use git-lfs fetch to download objects Philip Lorenz
@ 2025-04-23 15:18 ` Philip Lorenz
2025-04-23 15:19 ` [PATCH 7/8] fetch2: Fix LFS object checkout in submodules Philip Lorenz
2025-04-23 15:19 ` [PATCH 8/8] tests/fetch: Test gitsm with LFS Philip Lorenz
7 siblings, 0 replies; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:18 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
The existing code would pass `True` or `False` to the git fetcher. As
the fetcher expects `lfs` to be set to `1` this always lead to LFS
fetching being disabled.
Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
---
lib/bb/fetch2/gitsm.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/bb/fetch2/gitsm.py b/lib/bb/fetch2/gitsm.py
index f514aedaf..aeeb3cf63 100644
--- a/lib/bb/fetch2/gitsm.py
+++ b/lib/bb/fetch2/gitsm.py
@@ -122,7 +122,7 @@ class GitSM(Git):
url += ";name=%s" % module
url += ";subpath=%s" % module
url += ";nobranch=1"
- url += ";lfs=%s" % self._need_lfs(ud)
+ url += ";lfs=%s" % "1" if self._need_lfs(ud) else "0"
# Note that adding "user=" here to give credentials to the
# submodule is not supported. Since using SRC_URI to give git://
# URL a password is not supported, one have to use one of the
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 7/8] fetch2: Fix LFS object checkout in submodules
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
` (5 preceding siblings ...)
2025-04-23 15:18 ` [PATCH 6/8] fetch2: Fix incorrect lfs parametrization for submodules Philip Lorenz
@ 2025-04-23 15:19 ` Philip Lorenz
2025-04-23 15:19 ` [PATCH 8/8] tests/fetch: Test gitsm with LFS Philip Lorenz
7 siblings, 0 replies; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:19 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
Skipping smudging prevents the LFS objects from replacing their
placeholder files when `git submodule update` actually clones the
submodules into the repository (from the local download cache). Drop
`GIT_LFS_SKIP_SMUDGE=1` to fix this.
As long as all LFS objects are available in the download cache (which
they are after the other fixes are applied) the network will not be
accessed anyhow.
Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
---
lib/bb/fetch2/gitsm.py | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/lib/bb/fetch2/gitsm.py b/lib/bb/fetch2/gitsm.py
index aeeb3cf63..36e760e63 100644
--- a/lib/bb/fetch2/gitsm.py
+++ b/lib/bb/fetch2/gitsm.py
@@ -245,12 +245,7 @@ class GitSM(Git):
ret = self.process_submodules(ud, ud.destdir, unpack_submodules, d)
if not ud.bareclone and ret:
- # All submodules should already be downloaded and configured in the tree. This simply
- # sets up the configuration and checks out the files. The main project config should
- # remain unmodified, and no download from the internet should occur. As such, lfs smudge
- # should also be skipped as these files were already smudged in the fetch stage if lfs
- # was enabled.
- runfetchcmd("GIT_LFS_SKIP_SMUDGE=1 %s submodule update --recursive --no-fetch" % (ud.basecmd), d, quiet=True, workdir=ud.destdir)
+ runfetchcmd("%s submodule update --recursive --no-fetch" % (ud.basecmd), d, quiet=True, workdir=ud.destdir)
def clean(self, ud, d):
def clean_submodule(ud, url, module, modpath, workdir, d):
url += ";bareclone=1;nobranch=1"
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 8/8] tests/fetch: Test gitsm with LFS
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
` (6 preceding siblings ...)
2025-04-23 15:19 ` [PATCH 7/8] fetch2: Fix LFS object checkout in submodules Philip Lorenz
@ 2025-04-23 15:19 ` Philip Lorenz
7 siblings, 0 replies; 13+ messages in thread
From: Philip Lorenz @ 2025-04-23 15:19 UTC (permalink / raw)
To: bitbake-devel; +Cc: Philip Lorenz
Add a test case to verify that the gitsm fetcher properly handles
repositories storing objects with LFS.
The test case verifies that LFS objects are fetched on the initial clone
but also ensures that consecutive updates extend the original clone with
any newly referenced LFS objects.
Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
---
lib/bb/tests/fetch.py | 94 +++++++++++++++++++++++++++++++++++--------
1 file changed, 78 insertions(+), 16 deletions(-)
diff --git a/lib/bb/tests/fetch.py b/lib/bb/tests/fetch.py
index a936af6fb..70a5eea22 100644
--- a/lib/bb/tests/fetch.py
+++ b/lib/bb/tests/fetch.py
@@ -20,6 +20,7 @@ import tarfile
from bb.fetch2 import URI
from bb.fetch2 import FetchMethod
import bb
+import bb.utils
from bb.tests.support.httpserver import HTTPService
def skipIfNoNetwork():
@@ -27,6 +28,18 @@ def skipIfNoNetwork():
return unittest.skip("network test")
return lambda f: f
+
+@contextlib.contextmanager
+def hide_directory(directory):
+ """Hide the given directory and restore it after the context is left"""
+ temp_name = directory + ".bak"
+ os.rename(directory, temp_name)
+ try:
+ yield
+ finally:
+ os.rename(temp_name, directory)
+
+
class TestTimeout(Exception):
# Indicate to pytest that this is not a test suite
__test__ = False
@@ -2293,12 +2306,18 @@ class GitLfsTest(FetcherTest):
self.git_init(cwd=self.srcdir)
self.commit_file('.gitattributes', '*.mp3 filter=lfs -text')
- def commit_file(self, filename, content):
- with open(os.path.join(self.srcdir, filename), "w") as f:
+ def commit(self, *, cwd=None):
+ cwd = cwd or self.srcdir
+ self.git(["commit", "-m", "Change"], cwd=cwd)
+ return self.git(["rev-parse", "HEAD"], cwd=cwd).strip()
+
+ def commit_file(self, filename, content, *, cwd=None):
+ cwd = cwd or self.srcdir
+
+ with open(os.path.join(cwd, filename), "w") as f:
f.write(content)
- self.git(["add", filename], cwd=self.srcdir)
- self.git(["commit", "-m", "Change"], cwd=self.srcdir)
- return self.git(["rev-parse", "HEAD"], cwd=self.srcdir).strip()
+ self.git(["add", filename], cwd=cwd)
+ return self.commit(cwd=cwd)
def fetch(self, uri=None, download=True):
uris = self.d.getVar('SRC_URI').split()
@@ -2318,26 +2337,69 @@ class GitLfsTest(FetcherTest):
unpacked_lfs_file = os.path.join(self.d.getVar('WORKDIR'), 'git', "Cat_poster_1.jpg")
return unpacked_lfs_file
+ @skipIfNoGitLFS()
+ def test_gitsm_lfs(self):
+ """Test that the gitsm fetcher caches objects stored via LFS"""
+ self.git(["lfs", "install", "--local"], cwd=self.srcdir)
+
+ def fetch_and_verify(revision, filename, content):
+ self.d.setVar('SRCREV', revision)
+ fetcher, ud = self.fetch()
+
+ with hide_directory(submoduledir), hide_directory(self.srcdir):
+ workdir = self.d.getVar('WORKDIR')
+ fetcher.unpack(workdir)
+
+ with open(os.path.join(workdir, "git", filename)) as f:
+ self.assertEqual(f.read(), content)
+
+ # Create the git repository that will later be used as a submodule
+ submoduledir = self.tempdir + "/submodule"
+ bb.utils.mkdirhier(submoduledir)
+ self.git_init(submoduledir)
+ self.commit_file('.gitattributes', '*.mp3 filter=lfs -text', cwd=submoduledir)
+
+ submodule_commit_1 = self.commit_file("a.mp3", "submodule version 1", cwd=submoduledir)
+ _ = self.commit_file("a.mp3", "submodule version 2", cwd=submoduledir)
+
+ # Add the submodule to the repository at its current HEAD revision
+ self.git(["-c", "protocol.file.allow=always", "submodule", "add", submoduledir, "submodule"],
+ cwd=self.srcdir)
+ base_commit_1 = self.commit()
+
+ # Let the submodule point at a different revision
+ self.git(["checkout", submodule_commit_1], self.srcdir + "/submodule")
+ self.git(["add", "submodule"], cwd=self.srcdir)
+ base_commit_2 = self.commit()
+
+ # Add a LFS file to the repository
+ base_commit_3 = self.commit_file("a.mp3", "version 1")
+ # Update the added LFS file
+ base_commit_4 = self.commit_file("a.mp3", "version 2")
+
+ self.d.setVar('SRC_URI', "gitsm://%s;protocol=file;lfs=1;branch=master" % self.srcdir)
+
+ # Verify that LFS objects referenced from submodules are fetched and checked out
+ fetch_and_verify(base_commit_1, "submodule/a.mp3", "submodule version 2")
+ # Verify that the repository inside the download cache of a submodile is extended with any
+ # additional LFS objects needed when checking out a different revision.
+ fetch_and_verify(base_commit_2, "submodule/a.mp3", "submodule version 1")
+ # Verify that LFS objects referenced from the base repository are fetched and checked out
+ fetch_and_verify(base_commit_3, "a.mp3", "version 1")
+ # Verify that the cached repository is extended with any additional LFS objects required
+ # when checking out a different revision.
+ fetch_and_verify(base_commit_4, "a.mp3", "version 2")
+
@skipIfNoGitLFS()
def test_fetch_lfs_on_srcrev_change(self):
"""Test if fetch downloads missing LFS objects when a different revision within an existing repository is requested"""
self.git(["lfs", "install", "--local"], cwd=self.srcdir)
- @contextlib.contextmanager
- def hide_upstream_repository():
- """Hide the upstream repository to make sure that git lfs cannot pull from it"""
- temp_name = self.srcdir + ".bak"
- os.rename(self.srcdir, temp_name)
- try:
- yield
- finally:
- os.rename(temp_name, self.srcdir)
-
def fetch_and_verify(revision, filename, content):
self.d.setVar('SRCREV', revision)
fetcher, ud = self.fetch()
- with hide_upstream_repository():
+ with hide_directory(self.srcdir):
workdir = self.d.getVar('WORKDIR')
fetcher.unpack(workdir)
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [bitbake-devel] [PATCH 3/8] fetch2: Check for git-lfs existence before using it
2025-04-23 15:18 ` [PATCH 3/8] fetch2: Check for git-lfs existence before using it Philip Lorenz
@ 2025-04-24 7:10 ` Mathieu Dubois-Briand
2025-04-24 7:48 ` Philip Lorenz
0 siblings, 1 reply; 13+ messages in thread
From: Mathieu Dubois-Briand @ 2025-04-24 7:10 UTC (permalink / raw)
To: philip.lorenz, bitbake-devel
On Wed Apr 23, 2025 at 5:18 PM CEST, Philip Lorenz via lists.openembedded.org wrote:
> So far, existence of `git-lfs` was only checked during unpacking. As the
> binary is also used in earlier steps also check for its existence there.
>
> Additionally, factor out the LFS existence check into a dedicated
> function and call it wherever git-lfs is used for the first time.
>
> Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
> ---
Hi Philip,
Thanks for your patch.
I was not able to test this series on the autobuilder so far, as I
believe git-lfs is not installed in there. Now maybe this has to be a
new requirement on the host and maybe this is fine. Just two points:
- Is that really a new requirement and is it needed?
- We probably need to update a bit of documentation about that.
--
Mathieu Dubois-Briand, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bitbake-devel] [PATCH 3/8] fetch2: Check for git-lfs existence before using it
2025-04-24 7:10 ` [bitbake-devel] " Mathieu Dubois-Briand
@ 2025-04-24 7:48 ` Philip Lorenz
2025-04-24 11:21 ` Alexander Kanavin
0 siblings, 1 reply; 13+ messages in thread
From: Philip Lorenz @ 2025-04-24 7:48 UTC (permalink / raw)
To: Mathieu Dubois-Briand; +Cc: bitbake-devel
Hi Mathieu,
On 24.04.25 09:10, Mathieu Dubois-Briand wrote:
> Sent from outside the BMW organization - be CAUTIOUS, particularly with links and attachments.
> Absender außerhalb der BMW Organisation - Bitte VORSICHT beim Öffnen von Links und Anhängen.
> -----------------------------------------------------------------------------------------------
>
> On Wed Apr 23, 2025 at 5:18 PM CEST, Philip Lorenz via lists.openembedded.org wrote:
>> So far, existence of `git-lfs` was only checked during unpacking. As the
>> binary is also used in earlier steps also check for its existence there.
>>
>> Additionally, factor out the LFS existence check into a dedicated
>> function and call it wherever git-lfs is used for the first time.
>>
>> Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
>> ---
>
>
> I was not able to test this series on the autobuilder so far, as I
> believe git-lfs is not installed in there. Now maybe this has to be a
> new requirement on the host and maybe this is fine. Just two points:
> - Is that really a new requirement and is it needed?
The requirement isn't new but simply was never checked before (so users
would've seen a git-lfs not found error rather than the helpful error
message that is raised during unpacking). Support for LFS can be
explicitly disabled for per URI by adding lfs=0 to its parameter list.
git-lfs is already needed during downloading as this is the point in
time at which the large file objects need to be retrieved. During unpack
pointers in the repository (this process is called smudging) are
replaced by those downloaded objects.
However I just noticed that I was missing parentheses in patch 6 (it
should have been `url += ";lfs=%s" % ("1" if self._need_lfs(ud) else
"0")`) which would explain any possible failures (for example in
vulkan-samples). If there are no further comments on this series I'll
push a revised series in the next couple of days.
> - We probably need to update a bit of documentation about that.
I would've hoped that this is already documented since the LFS feature
has been around for quite a while (but have to admit that I couldn't
find anything on it inside the bitbake documentation). This patch series
merely ensures that it also works correctly when used via the "gitsm"
fetcher (and while at it improves some of the existing implementation).
I'm happy to write up something for the documentation but I'd suggest to
send this in via a dedicated patch series.
Philip
--
Philip Lorenz
BMW Car IT GmbH, Software-Plattform, -Integration Connected Company, Lise-Meitner-Straße 14, 89081 Ulm
-------------------------------------------------------------------------
BMW Car IT GmbH
Management: Chris Brandt, Michael Böttrich, Christian Salzmann
Domicile and Court of Registry: München HRB 134810
-------------------------------------------------------------------------
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bitbake-devel] [PATCH 3/8] fetch2: Check for git-lfs existence before using it
2025-04-24 7:48 ` Philip Lorenz
@ 2025-04-24 11:21 ` Alexander Kanavin
2025-04-24 14:29 ` Richard Purdie
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Kanavin @ 2025-04-24 11:21 UTC (permalink / raw)
To: philip.lorenz; +Cc: Mathieu Dubois-Briand, bitbake-devel
We probably should install git-lfs on the autobuilder workers to
prevent this from regressing, but keep it as a 'soft' host tool
dependency otherwise?
Alex
On Thu, 24 Apr 2025 at 09:48, Philip Lorenz via lists.openembedded.org
<philip.lorenz=bmw.de@lists.openembedded.org> wrote:
>
> Hi Mathieu,
>
> On 24.04.25 09:10, Mathieu Dubois-Briand wrote:
> > Sent from outside the BMW organization - be CAUTIOUS, particularly with links and attachments.
> > Absender außerhalb der BMW Organisation - Bitte VORSICHT beim Öffnen von Links und Anhängen.
> > -----------------------------------------------------------------------------------------------
> >
> > On Wed Apr 23, 2025 at 5:18 PM CEST, Philip Lorenz via lists.openembedded.org wrote:
> >> So far, existence of `git-lfs` was only checked during unpacking. As the
> >> binary is also used in earlier steps also check for its existence there.
> >>
> >> Additionally, factor out the LFS existence check into a dedicated
> >> function and call it wherever git-lfs is used for the first time.
> >>
> >> Signed-off-by: Philip Lorenz <philip.lorenz@bmw.de>
> >> ---
> >
> >
> > I was not able to test this series on the autobuilder so far, as I
> > believe git-lfs is not installed in there. Now maybe this has to be a
> > new requirement on the host and maybe this is fine. Just two points:
> > - Is that really a new requirement and is it needed?
>
> The requirement isn't new but simply was never checked before (so users
> would've seen a git-lfs not found error rather than the helpful error
> message that is raised during unpacking). Support for LFS can be
> explicitly disabled for per URI by adding lfs=0 to its parameter list.
> git-lfs is already needed during downloading as this is the point in
> time at which the large file objects need to be retrieved. During unpack
> pointers in the repository (this process is called smudging) are
> replaced by those downloaded objects.
>
> However I just noticed that I was missing parentheses in patch 6 (it
> should have been `url += ";lfs=%s" % ("1" if self._need_lfs(ud) else
> "0")`) which would explain any possible failures (for example in
> vulkan-samples). If there are no further comments on this series I'll
> push a revised series in the next couple of days.
>
> > - We probably need to update a bit of documentation about that.
>
> I would've hoped that this is already documented since the LFS feature
> has been around for quite a while (but have to admit that I couldn't
> find anything on it inside the bitbake documentation). This patch series
> merely ensures that it also works correctly when used via the "gitsm"
> fetcher (and while at it improves some of the existing implementation).
> I'm happy to write up something for the documentation but I'd suggest to
> send this in via a dedicated patch series.
>
> Philip
>
>
> --
> Philip Lorenz
> BMW Car IT GmbH, Software-Plattform, -Integration Connected Company, Lise-Meitner-Straße 14, 89081 Ulm
> -------------------------------------------------------------------------
> BMW Car IT GmbH
> Management: Chris Brandt, Michael Böttrich, Christian Salzmann
> Domicile and Court of Registry: München HRB 134810
> -------------------------------------------------------------------------
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#17552): https://lists.openembedded.org/g/bitbake-devel/message/17552
> Mute This Topic: https://lists.openembedded.org/mt/112416098/1686489
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [alex.kanavin@gmail.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [bitbake-devel] [PATCH 3/8] fetch2: Check for git-lfs existence before using it
2025-04-24 11:21 ` Alexander Kanavin
@ 2025-04-24 14:29 ` Richard Purdie
0 siblings, 0 replies; 13+ messages in thread
From: Richard Purdie @ 2025-04-24 14:29 UTC (permalink / raw)
To: alex.kanavin, philip.lorenz; +Cc: Mathieu Dubois-Briand, bitbake-devel
On Thu, 2025-04-24 at 13:21 +0200, Alexander Kanavin via
lists.openembedded.org wrote:
> We probably should install git-lfs on the autobuilder workers to
> prevent this from regressing, but keep it as a 'soft' host tool
> dependency otherwise?
I'm a little reluctant. We try to keep the autobuilders "lean" so that
we test our minimal requirements work. Adding this in makes it all too
easy to then have lfs dependencies creep in.
The autobuilders install should in theory match our minimum
requirements as documented in our docs.
Cheers,
Richard
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-04-24 14:29 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-23 15:18 [PATCH 0/8] Fix gitsm LFS support Philip Lorenz
2025-04-23 15:18 ` [PATCH 1/8] fetch2: Clean up no longer used name parameter Philip Lorenz
2025-04-23 15:18 ` [PATCH 2/8] tests/fetch: Move commonly used imports to top Philip Lorenz
2025-04-23 15:18 ` [PATCH 3/8] fetch2: Check for git-lfs existence before using it Philip Lorenz
2025-04-24 7:10 ` [bitbake-devel] " Mathieu Dubois-Briand
2025-04-24 7:48 ` Philip Lorenz
2025-04-24 11:21 ` Alexander Kanavin
2025-04-24 14:29 ` Richard Purdie
2025-04-23 15:18 ` [PATCH 4/8] fetch2: Simplify git LFS detection Philip Lorenz
2025-04-23 15:18 ` [PATCH 5/8] fetch2: Use git-lfs fetch to download objects Philip Lorenz
2025-04-23 15:18 ` [PATCH 6/8] fetch2: Fix incorrect lfs parametrization for submodules Philip Lorenz
2025-04-23 15:19 ` [PATCH 7/8] fetch2: Fix LFS object checkout in submodules Philip Lorenz
2025-04-23 15:19 ` [PATCH 8/8] tests/fetch: Test gitsm with LFS Philip Lorenz
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.