git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: peff@peff.net, me@ttaylorr.com, jonathantanmy@google.com,
	jrnieder@gmail.com, Derrick Stolee <dstolee@microsoft.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch
Date: Wed, 19 Feb 2020 16:21:51 +0000	[thread overview]
Message-ID: <dbc1bdcae16f8b9941add514264b0fe04cda48c0.1582129312.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.556.git.1582129312.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

While testing partial clone, I noticed some odd behavior. I was testing
a way of running 'git init', followed by manually configuring the remote
for partial clone, and then running 'git fetch'. Astonishingly, I saw
the 'git fetch' process start asking the server for multiple rounds of
pack-file downloads! When tweaking the situation a little more, I
discovered that I could cause the remote to hang up with an error.

Add two tests that demonstrate these two issues.

In the first test, we find that when fetching with blob filters from
a repository that previously did not have any tags, the 'git fetch
--tags origin' command fails because the server sends "multiple
filter-specs cannot be combined".

In the second test, we see that a 'git fetch origin' request with
several ref updates results in multiple pack-file downloads. This must
be due to Git trying to fault-in the objects pointed by the refs. What
makes this matter particularly nasty is that this goes through the
do_oid_object_info_extended() method, so there are no "haves" in the
negotiation. This leads the remote to send every reachable commit and
tree from each new ref, providing a quadratic amount of data transfer!
This test is fixed if we revert 6462d5eb9a (fetch: remove
fetch_if_missing=0, 2019-11-05), but that revert causes other test
failures. The real fix will need more care.

The tests are ordered in this way because if I swap the test order the
tag test will succeed instead of fail. I believe this is because somehow
we need the srv.bare repo to not have any tags when we clone, but then
have tags in our next fetch.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t5616-partial-clone.sh | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index fea56cda6d3..ed2ef45c37a 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace
 '
 
+test_expect_failure 'verify fetch succeeds when asking for new tags' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
+	for i in I J K
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
+	git -C tag-test fetch --tags origin
+'
+
+test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 2 pack-list &&
+	for i in A B C
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
+	git -C pack-test fetch origin &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 3 pack-list
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
gitgitgadget


  reply	other threads:[~2020-02-19 16:21 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-19 16:21 [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee via GitGitGadget
2020-02-19 16:21 ` Derrick Stolee via GitGitGadget [this message]
2020-02-19 18:38   ` [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch Eric Sunshine
2020-02-19 20:42     ` Derrick Stolee
2020-02-19 20:52   ` Junio C Hamano
2020-02-19 20:59     ` Eric Sunshine
2020-02-19 21:17       ` Junio C Hamano
2020-02-19 21:20         ` Derrick Stolee
2020-02-19 16:21 ` [PATCH 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
2020-02-19 18:10   ` Jonathan Tan
2020-02-19 21:10 ` [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee
2020-02-21 21:47 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2020-02-21 21:47   ` [PATCH v2 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
2020-02-21 21:47   ` [PATCH v2 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
2020-02-22 17:25   ` [PATCH v2 0/2] Document two partial clone bugs, fix one Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dbc1bdcae16f8b9941add514264b0fe04cda48c0.1582129312.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).