From: Jeff King <peff@peff.net>
To: Konstantin Tokarev <annulen@yandex.ru>
Cc: Derrick Stolee <stolee@gmail.com>,
Jonathan Tan <jonathantanmy@google.com>,
git@vger.kernel.org
Subject: [PATCH] clone: use "quick" lookup while following tags
Date: Wed, 1 Apr 2020 08:15:37 -0400
Message-ID: <20200401121537.GA1916590@coredump.intra.peff.net>
In-Reply-To: <20200328144023.GB1198080@coredump.intra.peff.net>
On Sat, Mar 28, 2020 at 10:40:23AM -0400, Jeff King wrote:
> So I guess the problem is not with shallow clones specifically, but they
> lead us to not having fetched the commits pointed to by tags, which
> leads to us trying to fault in those commits (and their trees) rather
> than realizing that we weren't meant to have them. And the size of the
> local repo balloons because you're fetching all those commits one by
> one, and not getting the benefit of the deltas you would when you do a
> single --filter=blob:none fetch.
>
> I guess we need something like this:
The issue is actually with --single-branch, which is implied by --depth.
But the fix is the same either way.
Here it is with a commit message and test.
-- >8 --
Subject: [PATCH] clone: use "quick" lookup while following tags
When cloning with --single-branch, we implement git-fetch's usual
tag-following behavior, grabbing any tag objects that point to objects
we have locally.

When we're a partial clone, though, our has_object_file() check will
actually lazy-fetch each tag. That not only defeats the purpose of
--single-branch, but it does it incredibly slowly, potentially kicking
off a new fetch for each tag. This is even worse for a shallow clone,
which implies --single-branch, because even tags which are supersets of
each other will be fetched individually.

We can fix this by passing OBJECT_INFO_SKIP_FETCH_OBJECT to the call,
which is what git-fetch does in this case.

Likewise, let's include OBJECT_INFO_QUICK, as that's what git-fetch
does. The rationale is discussed in 5827a03545 (fetch: use "quick"
has_sha1_file for tag following, 2016-10-13), but here the tradeoff
would apply even more so because clone is very unlikely to be racing
with another process repacking our newly-created repository.

This may provide a very small speedup even in the non-partial case,
as we'd avoid calling reprepare_packed_git() for each tag (though in
practice, we'd only have a single packfile, so that reprepare should be
quite cheap).

Signed-off-by: Jeff King <peff@peff.net>
---
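Not part of the commit message: to make the effect of the two flags
concrete, here is a small standalone toy model in C. It is only a sketch
under stated assumptions. The sketch_* names, the flag values, and the
integer object ids are all made up for illustration and are not Git's
actual API or data structures. A miss without SKETCH_QUICK stands in for
the pack re-scan, and a miss in a partial clone without
SKETCH_SKIP_FETCH_OBJECT stands in for a lazy fetch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch; not Git's real definitions. */
enum {
	SKETCH_QUICK = 1 << 0,             /* do not re-scan packs on a miss */
	SKETCH_SKIP_FETCH_OBJECT = 1 << 1  /* do not lazy-fetch on a miss */
};

/* Toy repository: objects are plain integer ids (an assumption). */
struct sketch_repo {
	const int *local;  /* ids of objects present locally */
	size_t nr_local;
	bool is_partial;   /* true if a promisor remote could supply objects */
	int nr_rescans;    /* how often we re-scanned for new packs */
	int nr_fetches;    /* how many lazy fetches we kicked off */
};

/* Linear search of the local object list. */
bool sketch_in_local(const struct sketch_repo *r, int id)
{
	for (size_t i = 0; i < r->nr_local; i++)
		if (r->local[i] == id)
			return true;
	return false;
}

/* Existence check whose behavior on a miss depends on the flags. */
bool sketch_has_object(struct sketch_repo *r, int id, unsigned flags)
{
	if (sketch_in_local(r, id))
		return true;
	if (!(flags & SKETCH_QUICK)) {
		/* Stand-in for re-scanning the pack directory and retrying. */
		r->nr_rescans++;
		if (sketch_in_local(r, id))
			return true;
	}
	if (r->is_partial && !(flags & SKETCH_SKIP_FETCH_OBJECT)) {
		/*
		 * Stand-in for a lazy fetch. We assume it succeeds, so the
		 * object is reported as present (the toy does not store it).
		 */
		r->nr_fetches++;
		return true;
	}
	return false;
}

/* Return how many tag targets would be followed for the given flags. */
int sketch_follow_tags(struct sketch_repo *r, const int *tags, size_t nr,
		       unsigned flags)
{
	int followed = 0;
	for (size_t i = 0; i < nr; i++)
		if (sketch_has_object(r, tags[i], flags))
			followed++;
	return followed;
}
```

In this model, with two of four tag targets present locally in a partial
clone, calling sketch_follow_tags() with flags 0 follows all four tags at
the cost of two re-scans and two lazy fetches. With both flags set it
follows only the two local ones and triggers neither.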
builtin/clone.c | 4 +++-
t/t5616-partial-clone.sh | 8 ++++++++
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index d8b1f413aa..9da6459f1d 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -643,7 +643,9 @@ static void write_followtags(const struct ref *refs, const char *msg)
 			continue;
 		if (ends_with(ref->name, "^{}"))
 			continue;
-		if (!has_object_file(&ref->old_oid))
+		if (!has_object_file_with_flags(&ref->old_oid,
+						OBJECT_INFO_QUICK |
+						OBJECT_INFO_SKIP_FETCH_OBJECT))
 			continue;
 		update_ref(msg, ref->name, &ref->old_oid, NULL, 0,
 			   UPDATE_REFS_DIE_ON_ERR);
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 77bb91e976..8f0d81a27e 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -415,6 +415,14 @@ test_expect_success 'verify fetch downloads only one pack when updating refs' '
 	test_line_count = 3 pack-list
 '
 
+test_expect_success 'single-branch tag following respects partial clone' '
+	git clone --single-branch -b B --filter=blob:none \
+		"file://$(pwd)/srv.bare" single &&
+	git -C single rev-parse --verify refs/tags/B &&
+	git -C single rev-parse --verify refs/tags/A &&
+	test_must_fail git -C single rev-parse --verify refs/tags/C
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
--
2.26.0.408.gebd8a4413c