Buildroot Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Ricardo Martincoski <ricardo.martincoski@gmail.com>
To: buildroot@busybox.net
Subject: [Buildroot] [PATCH 3/3] support/download/git: do not use git clone
Date: Fri, 18 Nov 2016 05:33:25 -0200	[thread overview]
Message-ID: <582eaec5b41f3_6dedc5f1985238a@ultri3.mail> (raw)
In-Reply-To: 95e1f68c-b162-68b1-8a10-7baa287f088c@mind.be

Arnout,

On Sat, Nov 05, 2016 at 11:19 PM, Arnout Vandecappelle wrote:

[snip]
>  Excellent commit message!

Thanks.

[snip]
>  I have a bunch of remarks below, but nothing that should stop the acceptance of
> this patch. So:
> 
> Acked-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
> Tested-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>

I won't carry these for v2 because there will be a bunch of improvements based
on your review ;)

> 
>  This is for next, obviously.
> 
[snip]
>> +# Save the temporary file inside the .git directory that will be deleted after the checkout is done.
> 
>  Don't forget to wrap at 80 columns.

Sorry. This line is way too long.

But just to be clear:
I think you mean to wrap to 80 as a rule of thumb (e.g. comments are easy to do)
and not as a hard requirement.
Right?

I ask this because I think breaking some lines with commands would make it
harder to read, e.g.
if _git fetch ${verbose} "${@}" --depth 1 "'${repo}'" "'${ref}:refs/buildroot/${cset}'" 2>&1; then

[snip]
>> +if grep "\<\(\|\(\|refs/\)\(heads\|tags\)/\)${cset}$" "${tmpf}" >/dev/null 2>&1; then
> 
>  This is where grep -E becomes useful :-)

Thanks. Much nicer:
if grep -E "\<(|(|refs/)(heads|tags)/)${cset}$" "${tmpf}" >/dev/null 2>&1; then

[snip]
>> +elif grep "^${cset}" "${tmpf}" >/dev/null 2>&1; then
>> +    printf "The desired version is a sha1\n"
>> +    if [ ${#cset} -lt 40 -a "1" != "$(awk "/^${cset}/{print \$1|\"sort -u | wc -l\"}" "${tmpf}")" ]; then
> 
>  I think the check for -lt 40 is redundant.

You are right.
But this line will be removed in v2 because it is not needed and not wanted.
It is not needed because unambiguous partial sha1 (of any commit) can be
checked out after the full fetch (but without optimized download, of course).
It is not wanted because, as I realized after your comment, the output of
ls-remote is a subset of all sha1s and can't be used to test a sha1 is not
ambiguous. We could end up checking out a wrong sha1 when we really need to let
git checkout fail.
BTW, an example of partial sha1 is in the current master, see commit 956fcc2.

> 
>  I think all the logic here could be made a bit simpler by saving to a temporary
> file again:
> 
> elif grep "^${cset}" "${tmpf}" > .git/matching_refs 2>/dev/null; then
>     printf ...
>     if "$(cut -f 1 .git/matching_refs | sort -u | wc -l) != 2; then
>         ...

Yes. Much simpler.
Actually this 'if' is removed in v2 (see above) but I will use temporary file
and cut instead of awk in v2 (see below).

> 
> 
>> +        printf "Ambiguous partial sha1\n"
>> +        awk "/^${cset}/{print \$1|\"sort -u | wc -l\"}" "${tmpf}"
> 
>  Here we should print the full lines, so without sort -u and wc. Sort is still
> useful though.

Indeed. But will be removed in v2.
If the partial sha1 is ambiguous, let git checkout fail. Git checkout fails even
when the sha1 of the commit is ambiguous to a sha1 of a tree.
BTW, the best real example I have so far between 2 refs is b3766 from
https://github.com/vim/vim.git .

> 
>> +        exit 1
>> +    fi
>> +    # Accept full or unambiguous partial sha1. A sha1 can be referenced by many names.
>> +    # Prefer sha1 of commit pointed by a tag or sha1 of the tag itself,
>> +    # then sha1 pointed by any ref/*, and only then sha1 pointed by *HEAD.
> 
>  I don't understand why this bit is needed. Any of the refs will do, right? It's
> true that there is a risk that the ref gets updated between the ls-remote and
> the fetch, and that for tags the chance that it gets updated is smaller than for
> e.g. HEAD. But still, this is very much a corner case, and if it _does_ happen,
> we still fall back to full clone. So the only important thing is to avoid HEAD.
> Since ls-remotes already seems to sort the refs, which typically puts tags near
> the end, just take the last one and we're done:
> 
>     ref="$(cut -f 2 .git/matching_refs | tail -1)"

Yes. It is simpler.

I just noticed I missed to explain the awk -F'[\t^]' in the comments (or in the
commit log) :/ .
Its is needed because annotated tags have their own sha1 and the commit pointed
by a tag has ^{} at the end of the ref name in the ls-remote output.
9395452b4aab7bc2475ef8935b4a4fb99d778d70        refs/tags/v4.8-rc6^{}

But after 'cut' I can use bash's 'Pattern substitution'
ref=${ref/%"^{}"}
and, of course, add a nice comment for it.

[snip]
>> +if [ ${do_a_shallow_fetch} -eq 1 ]; then
> 
>  Actually, if [ "$ref" ] would be sufficient.

OK.

[snip]
>> -if ! _git fetch origin "'${cset}:${cset}'" >/dev/null 2>&1; then
>> -    printf "Could not fetch special ref '%s'; assuming it is not special.\n" "${cset}"
>> +git_checkout_done=0
>> +if [ ${git_fetch_done} -eq 1 ]; then
>> +    if _git checkout -q "'refs/buildroot/${cset}'" 2>&1; then
> 
>  I'm not sure if I like this split. git_fetch_done gets set in only one place,
> so we could just do the checkout directly there. That would also make it more
> explicit that this is a checkout specifically for shallow fetches, while the
> checkout below is specifically for full fetches.

OK.

[snip]
>> +    _git remote add origin "'${repo}'"
>> +    _git fetch ${verbose} "${@}" origin
> 
>  Why do we need to create a remote here?

Actually, we don't.
I missed your comment in http://patchwork.ozlabs.org/patch/681841/
"just use $repo everywhere"
I will do this in v2.

I think I can do this too:
"Do it in two patches: first convert the current situation to fetches, then add
the shallow download of tag SHAs."

> 
>> +    _git checkout -q "'${cset}'"
>> +fi
> 
>  At a later stage, we could update the fetch with --submodules=yes/no depending
> on -r option.

In theory, OK. But because of a bug in git maybe it's better to keep the
current 'git submodule update --init --recursive'.

git fetch --help # git 2.10.2
BUGS
       Using --recurse-submodules can only fetch new commits in already
       checked out submodules right now. When e.g. upstream added a new
       submodule in the just fetched commits of the superproject the submodule
       itself can not be fetched, making it impossible to check out that
       submodule later without having to do a fetch again. This is expected to
       be fixed in a future Git version.

> 
>  Another interesting feature would be to first try a shallow fetch of all tags
> and branches with a depth of, say, 100. That is still way less than a full clone
> for large repositories, and has a good chance to still capture a sha1 that is
> not at a branch or tag head. So it would provide a nice middle ground for the
> cases for which we fail to do a shallow clone now. And if it does fail, nothing
> is wasted because the full clone can start from the part that was already
> downloaded.

Yes. It should be useful especially for hardware-specific kernel repos when
using sha1 as custom repo version.
The buildroot policy is to use preferably a tag, and if a tag cannot be used,
use the sha1 in a branch, right?
So I think we will have the best gain by fetching only from branches in this
case.

A quick test shows good results: make warp7_defconfig ; make linux-source
without this feature:       1.69 GiB
                            5148191
with this feature (100):    945.59 MiB + 73.89 KiB
                            1997387 + 138 = 1997525
with this feature (10+100): 402.04 MiB + 470.10 MiB
                            347225 + 32 + 1650245 = 1997502
(I didn't dig to understand why the sum is not exactly the same)

Notice there is some extra processing in the server side ('Compressing objects')
that takes some more seconds but I guess the bandwidth saved worth this.

I can add this patch (depth 100) at the end of the series as a RFC.
This way people can comment/test/argue a different depth to use.

1) without this feature
Doing full clone
Cloning into 'linux-efaf36531fe7b1fc15a48033e5972825c91f9fc6'...
remote: Counting objects: 5148191, done.
remote: Compressing objects: 100% (395/395), done.
remote: Total 5148191 (delta 377), reused 183 (delta 183), pack-reused 5147613
Receiving objects: 100% (5148191/5148191), 1.69 GiB | 1.73 MiB/s, done.

2) with this feature, only for branches, depth 100
Doing shallow fetch of all branches
remote: Counting objects: 1997387, done.
remote: Compressing objects: 100% (453860/453860), done.
remote: Total 1997387 (delta 1605759), reused 1902796 (delta 1529294), pack-reused 0
Receiving objects: 100% (1997387/1997387), 945.59 MiB | 1.74 MiB/s, done.
...
remote: Counting objects: 138, done.
remote: Compressing objects: 100% (138/138), done.
remote: Total 138 (delta 0), reused 138 (delta 0), pack-reused 0
Receiving objects: 100% (138/138), 73.89 KiB | 0 bytes/s, done.

3) with this feature, only for branches, 1st step depth 10, 2nd step depth 100
Doing shallow fetch of all branches (10)
remote: Counting objects: 347225, done.
remote: Compressing objects: 100% (177196/177196), done.
remote: Total 347225 (delta 239120), reused 251422 (delta 167173), pack-reused 0
Receiving objects: 100% (347225/347225), 402.04 MiB | 1.56 MiB/s, done.
...
remote: Counting objects: 32, done.
remote: Compressing objects: 100% (32/32), done.
remote: Total 32 (delta 0), reused 32 (delta 0), pack-reused 0
Unpacking objects: 100% (32/32), done.
...
fatal: reference is not a tree: efaf36531fe7b1fc15a48033e5972825c91f9fc6
Doing shallow fetch of all branches (100)
remote: Counting objects: 1650245, done.
remote: Compressing objects: 100% (459302/459302), done.
remote: Total 1650245 (delta 1364443), reused 1452442 (delta 1179566), pack-reused 0
Receiving objects: 100% (1650245/1650245), 470.10 MiB | 1.68 MiB/s, done.
Resolving deltas: 100% (1364443/1364443), completed with 38986 local objects.
Checked out 'efaf36531fe7b1fc15a48033e5972825c91f9fc6'.

Regards,
Ricardo

  parent reply	other threads:[~2016-11-18  7:33 UTC|newest]

Thread overview: 108+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-01 19:33 [Buildroot] [PATCH 1/3] support/download/git: log checked out sha1 Ricardo Martincoski
2016-11-01 19:33 ` [Buildroot] [PATCH 2/3] test/support/download/git: new test Ricardo Martincoski
2016-11-01 21:27   ` Ricardo Martincoski
2016-11-06  0:19   ` Arnout Vandecappelle
2016-11-06  0:51     ` Arnout Vandecappelle
2016-12-02  2:34     ` Ricardo Martincoski
2016-12-05 17:26       ` Arnout Vandecappelle
2016-12-06  1:35         ` Ricardo Martincoski
2016-11-06  1:24   ` Arnout Vandecappelle
2016-11-06 12:13     ` Henrique Marks
2016-11-06 15:34       ` Arnout Vandecappelle
2017-08-26 22:20   ` [Buildroot] [next v2 1/7] testing/infra/builder: split configure() from build() Ricardo Martincoski
2017-08-26 22:20     ` [Buildroot] [next v2 2/7] testing/infra/builder: build with target and environment Ricardo Martincoski
2017-10-06 20:42       ` Arnout Vandecappelle
2017-10-06 21:02         ` Arnout Vandecappelle
2017-10-23  2:34         ` Ricardo Martincoski
2017-08-26 22:20     ` [Buildroot] [next v2 3/7] testing/infra/builder: allow to override logfile Ricardo Martincoski
2017-10-06 20:50       ` Arnout Vandecappelle
2017-10-23  2:32         ` Ricardo Martincoski
2017-08-26 22:20     ` [Buildroot] [next v2 4/7] testing/tests/download: add infra for git tests Ricardo Martincoski
2017-10-06 21:30       ` Arnout Vandecappelle
2017-10-23  2:35         ` Ricardo Martincoski
2017-10-23  8:18           ` Arnout Vandecappelle
2017-10-29  4:00             ` Ricardo Martincoski
2017-08-26 22:20     ` [Buildroot] [next v2 5/7] testing/tests/download: add git hash tests Ricardo Martincoski
2017-08-26 22:38       ` Ricardo Martincoski
2017-08-27  4:00         ` Baruch Siach
2017-10-06 21:36       ` Arnout Vandecappelle
2017-10-06 21:44       ` Arnout Vandecappelle
2017-10-23  2:36         ` Ricardo Martincoski
2017-08-26 22:20     ` [Buildroot] [next v2 6/7] testing/tests/download: add test for sha1 as git ref Ricardo Martincoski
2017-10-06 21:57       ` Arnout Vandecappelle
2017-10-23  2:38         ` Ricardo Martincoski
2017-08-26 22:20     ` [Buildroot] [next v2 7/7] testing/tests/download: add test for git submodule Ricardo Martincoski
2017-10-06 20:31     ` [Buildroot] [next v2 1/7] testing/infra/builder: split configure() from build() Arnout Vandecappelle
2017-10-23  2:31       ` Ricardo Martincoski
2017-10-29 14:05     ` [Buildroot] [PATCH v3 0/9] tests for git download infra (series 1/3) Ricardo Martincoski
2017-10-29 14:06       ` [Buildroot] [PATCH v3 1/9] testing/infra/builder: call make with empty env Ricardo Martincoski
2018-04-01 17:58         ` Thomas Petazzoni
2017-10-29 14:06       ` [Buildroot] [PATCH v3 2/9] testing/infra/builder: split configure() from build() Ricardo Martincoski
2018-04-01 17:59         ` Thomas Petazzoni
2018-04-01 21:32           ` Ricardo Martincoski
2018-04-01 21:37             ` Thomas Petazzoni
2017-10-29 14:06       ` [Buildroot] [PATCH v3 3/9] testing/infra/builder: build with target and environment Ricardo Martincoski
2017-10-29 14:06       ` [Buildroot] [PATCH v3 4/9] testing/infra: split runtime test from BRTest Ricardo Martincoski
2017-10-29 14:06       ` [Buildroot] [PATCH v3 5/9] testing/infra/basetest: support br2-external Ricardo Martincoski
2017-10-29 14:06       ` [Buildroot] [PATCH v3 6/9] testing/tests/download: add git hash test Ricardo Martincoski
2017-10-29 14:06       ` [Buildroot] [PATCH v3 7/9] testing/tests/download: test case for git refs Ricardo Martincoski
2017-10-29 14:06       ` [Buildroot] [PATCH v3 8/9] testing/tests/download: test git branch Ricardo Martincoski
2017-10-29 14:06       ` [Buildroot] [PATCH v3 9/9] testing/tests/download: test git submodules Ricardo Martincoski
2018-04-25 20:58       ` [Buildroot] [PATCH v3 0/9] tests for git download infra (series 1/3) Ricardo Martincoski
2018-04-29 14:33       ` [Buildroot] [PATCH v4] tests for git download infra v4 Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/infra/builder: build with target and environment Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/infra: split runtime test from BRTest Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/infra/basetest: support br2-external Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/tests/download: add git hash test Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/tests/download: test case for git refs Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/tests/download: test git branch Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/tests/download: test git submodules Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/tests/download: test git tag Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/tests/download: test git tag/branch precedence Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/tests/download: test git special ref Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [PATCH v4] testing/tests/download: test git branch with slash Ricardo Martincoski
2018-04-29 14:33         ` [Buildroot] [RFC PATCH v4] support/testing: test extra download with site method git Ricardo Martincoski
2018-04-30  1:38         ` [Buildroot] [PATCH v4] tests for git download infra v4 Ricardo Martincoski
2018-05-11  3:09         ` Ricardo Martincoski
2018-05-12  2:58         ` [Buildroot] [PATCH v5 00/10] tests for git download infra v5 Ricardo Martincoski
2018-05-12  2:58           ` [Buildroot] [PATCH v5 01/10] testing/infra/builder: build with target and environment Ricardo Martincoski
2018-05-12  2:58           ` [Buildroot] [PATCH v5 02/10] testing/infra: split runtime test from BRTest Ricardo Martincoski
2019-02-04 15:55             ` Arnout Vandecappelle
2019-02-04 18:19               ` Matthew Weber
2019-02-04 19:42                 ` Matthew Weber
2019-02-05  1:19                   ` Ricardo Martincoski
2019-02-05  8:29                     ` Arnout Vandecappelle
2019-02-05  1:00               ` Ricardo Martincoski
2019-02-05  1:12                 ` Matthew Weber
2019-02-05  1:47                   ` Ricardo Martincoski
2019-02-05  9:28                   ` Matthew Weber
2018-05-12  2:58           ` [Buildroot] [PATCH v5 03/10] testing/infra/basetest: support br2-external Ricardo Martincoski
2018-05-12  2:58           ` [Buildroot] [PATCH v5 04/10] testing/tests/download: add git hash test Ricardo Martincoski
2019-02-04 15:52             ` Arnout Vandecappelle
2019-02-05  0:52               ` Ricardo Martincoski
2019-02-05  8:09                 ` Arnout Vandecappelle
2019-02-05  8:55                   ` Peter Korsgaard
2018-05-12  2:58           ` [Buildroot] [PATCH v5 05/10] testing/tests/download: test case for git refs Ricardo Martincoski
2019-02-04 19:48             ` Arnout Vandecappelle
2019-02-05  0:53               ` Ricardo Martincoski
2018-05-12  2:58           ` [Buildroot] [PATCH v5 06/10] testing/tests/download: test git branch Ricardo Martincoski
2019-02-05  9:42             ` Arnout Vandecappelle
2018-05-12  2:58           ` [Buildroot] [PATCH v5 07/10] testing/tests/download: test git submodules Ricardo Martincoski
2019-02-05 10:03             ` Arnout Vandecappelle
2019-02-06  9:08               ` Arnout Vandecappelle
2019-02-06  9:14               ` Yann E. MORIN
2018-05-12  2:58           ` [Buildroot] [PATCH v5 08/10] testing/tests/download: test git tag Ricardo Martincoski
2019-02-06 10:03             ` Arnout Vandecappelle
2018-05-12  2:58           ` [Buildroot] [PATCH v5 09/10] testing/tests/download: test git special ref Ricardo Martincoski
2019-02-06 10:05             ` Arnout Vandecappelle
2019-02-18  2:46               ` Ricardo Martincoski
2019-02-19  9:01                 ` Arnout Vandecappelle
2019-02-26  3:01                   ` Ricardo Martincoski
2018-05-12  2:58           ` [Buildroot] [PATCH v5 10/10] support/testing: test extra download with site method git Ricardo Martincoski
2019-02-06 10:34             ` Arnout Vandecappelle
2016-11-01 19:33 ` [Buildroot] [PATCH 3/3] support/download/git: do not use git clone Ricardo Martincoski
2016-11-06  1:19   ` Arnout Vandecappelle
2016-11-10  0:07     ` Ricardo Martincoski
2016-11-18  7:33     ` Ricardo Martincoski [this message]
2016-11-05 21:50 ` [Buildroot] [PATCH 1/3] support/download/git: log checked out sha1 Arnout Vandecappelle
2016-11-06 23:17   ` Ricardo Martincoski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=582eaec5b41f3_6dedc5f1985238a@ultri3.mail \
    --to=ricardo.martincoski@gmail.com \
    --cc=buildroot@busybox.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox