From: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
To: Robert Yang <liezhi.yang@windriver.com>,
"bitbake-devel@lists.openembedded.org"
<bitbake-devel@lists.openembedded.org>
Subject: RE: [bitbake-devel] [RFC][PATCH V2] bitbake: fetch2/git: Use git fetch to shallow clone revisions
Date: Mon, 29 Aug 2022 10:46:15 +0000
Message-ID: <d9b06a8c3f874ee386fb80d0ada7b24e@axis.com>
In-Reply-To: <7e6cdfb0-3022-69fb-f835-54bf7f042962@windriver.com>
> -----Original Message-----
> From: Robert Yang <liezhi.yang@windriver.com>
> Sent: den 27 augusti 2022 05:37
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>; bitbake-devel@lists.openembedded.org
> Subject: Re: [bitbake-devel] [RFC][PATCH V2] bitbake: fetch2/git: Use git fetch to shallow clone revisions
>
> Hi Peter,
>
> On 8/26/22 22:21, Peter Kjellerstedt wrote:
> >> -----Original Message-----
> >> From: bitbake-devel@lists.openembedded.org <bitbake-devel@lists.openembedded.org> On Behalf Of Robert Yang
> >> Sent: den 26 augusti 2022 15:11
> >> To: bitbake-devel@lists.openembedded.org
> >> Subject: [bitbake-devel] [RFC][PATCH V2] bitbake: fetch2/git: Use git fetch to shallow clone revisions
> >>
> >> * V2
> >> Fixed typos in commit message
> >
> > Patch history should go after the --- below.
> >
> >> The "git clone --depth" only works for refs, doesn't support revisions, but
> >> "git fetch --depth" supports revisions, so use it to do the shallow clone, the
> >> idea is from "git clone --recurse-submodules --shallow-submodules".
> >>
> >> The workflow is (Only enabled when BB_GIT_SHALLOW = "1"):
> >> $ git init --bare <clonedir>
> >> $ git remote add origin <url>
> >> $ git fetch origin --depth <depth> revision
> >> $ git branch <branchname> FETCH_HEAD
> >> $ git tag v<branchname> FETCH_HEAD
> >>
> >> Here is the testing data based on poky, the testing server has a very good
> >> network bandwidth:
> >>
> >> Add 'BB_GIT_SHALLOW = "1"' conf/local.conf
> >> $ rm -fr tmp downloads # Fresh download for each build
> >> $ time bitbake world --runall=fetch
> >> $ du -sh downloads/git2/
> >>
> >>          Full      Shallow   Saved
> >> ----------------------------------------
> >> Time:    15m59s    2m31s     84% (13m28s)
> >> Size:    12G       1.2G      90% (10.8G)
> >>
> >> * The Size is for downloads/git2/, the tarballs are not counted.
> >>
> >> We can see that it saves a lot of download time and disk space, for
> >> example:
> >>
> >> linux-yocto: 2.8G -> 228M
> >> llvm: 2.5G -> 171M
> >> cryptography: 1.5G -> 35M
> >>
> >> And "$ bitbake world" works well.
> >>
> >> This is an RFC patch, please feel free to give your comments.
> >>
> >> Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
> >> ---
> >> bitbake/lib/bb/fetch2/git.py | 83 ++++++++++++++++++++++++++++--------
> >> 1 file changed, 66 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> >> index 4534bd75800..57bb61d5ee1 100644
> >> --- a/bitbake/lib/bb/fetch2/git.py
> >> +++ b/bitbake/lib/bb/fetch2/git.py
> >> @@ -244,6 +244,7 @@ class Git(FetchMethod):
> >> ud.unresolvedrev[name] = 'HEAD'
> >>
> >> ud.basecmd = d.getVar("FETCHCMD_git") or "git -c core.fsyncobjectfiles=0 -c gc.autoDetach=false -c core.pager=cat"
> >> + ud.basecmd = "LANG=C %s" % ud.basecmd
> >>
> >> write_tarballs = d.getVar("BB_GENERATE_MIRROR_TARBALLS") or "0"
> >> ud.write_tarballs = write_tarballs != "0" or ud.rebaseable
> >> @@ -344,6 +345,49 @@ class Git(FetchMethod):
> >> return False
> >> return True
> >>
> >> +    def shallow_clone_by_fetch(self, ud, repourl, d):
> >> +        """
> >> +        Use "git fetch --depth <depth> revision" to implement shallow clone
> >> +        since git can't clone a revision, a better solution should be:
> >> +        "git fetch --depth <depth> revision:<branchname>" but it doesn't work
> >> +        when revision is a tag, e.g.:
> >> +        error: cannot update ref 'refs/heads/master': trying to write
> >> +        non-commit object <revision> to branch 'refs/heads/master'
> >> +        """
> >> +
> >> +        import datetime
> >> +
> >> +        depth = ud.shallow_depths[ud.names[0]]
> >> +        revision = ud.revisions[ud.names[0]]
> >> +        branchname = ud.branches[ud.names[0]]
> >> +        if not branchname:
> >> +            branchname = "master"
> >> +
> >> +        # Rename branchname if it exists which can:
> >> +        # - Avoid conflicts during update
> >> +        # - Keep the revision on a branch so that "git submodule update --recursive"
> >> +        #   can work since it requires the revision on a branch.
> >> +        branch_path = os.path.join(ud.clonedir, 'refs/heads/%s' % branchname)
> >> +        if os.path.exists(branch_path):
> >> +            os.rename(branch_path, '%s.%s' % (branch_path, datetime.datetime.now().strftime("%Y%m%d%H%M%S")))
> >
> > Any reason this is done using os.rename() rather than `git branch -m`?
>
> It is because this is simpler and keeps things aligned with branch_path;
> otherwise, we would need:
> - git branch --list to get the branch list and split them by '\n', remove the star.
> - Check branch in the list
> - git branch -m to rename the branch

If you accept that the command can fail, then you do not need to list
the branches. Just do the rename: if the branch exists, the rename
succeeds; otherwise it fails, which is expected and can be ignored.

What I do not like about the use of os.rename() here is that it relies
on internal knowledge of how Git stores its data.
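
A minimal sketch of the "just do the rename" approach follows. It is an
illustration, not code from the patch: the helper name
rename_branch_if_present is hypothetical, and it calls git through
subprocess rather than through the fetcher's own command runner. It also
avoids a case the file-based check can miss, since a branch whose ref has
been packed into packed-refs has no loose file under refs/heads/.

```python
import datetime
import subprocess


def rename_branch_if_present(clonedir, branchname, git="git"):
    """Try to rename branchname to branchname.<timestamp> in clonedir.

    Hypothetical helper for illustration. Returns True if git renamed the
    branch and False if the command failed, for example because the branch
    does not exist. The failure is expected and deliberately not raised.
    """
    stamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
    newname = "%s.%s" % (branchname, stamp)
    # Let git decide whether the branch exists instead of inspecting
    # refs/heads/ on disk; output is discarded because failure is fine.
    result = subprocess.run(
        [git, "branch", "-m", branchname, newname],
        cwd=clonedir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A caller would invoke rename_branch_if_present(ud.clonedir, branchname)
before the fetch and carry on regardless of the return value.
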
//Peter
Thread overview: 5+ messages
2022-08-26 13:10 [RFC][PATCH V2] bitbake: fetch2/git: Use git fetch to shallow clone revisions Robert Yang
2022-08-26 14:21 ` [bitbake-devel] " Peter Kjellerstedt
2022-08-27 3:36 ` Robert Yang
2022-08-29 10:46 ` Peter Kjellerstedt [this message]
2022-08-31 3:10 ` Robert Yang