From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: peff@peff.net, me@ttaylorr.com, jonathantanmy@google.com,
	jrnieder@gmail.com, sunshine@sunshineco.com,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH v2 0/2] Document two partial clone bugs, fix one
Date: Fri, 21 Feb 2020 21:47:26 +0000
Message-ID: <pull.556.v2.git.1582321648.gitgitgadget@gmail.com>
In-Reply-To: <pull.556.git.1582129312.gitgitgadget@gmail.com>

While playing with partial clone, I discovered two bugs and documented them
with tests in patch 1. One seems to be a server-side bug that occurs in a
somewhat rare, but not terribly unlikely, situation. The other is a
client-side bug that leads to quadratic amounts of data transfer; I fix this
bug in patch 2.
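
To make "quadratic" concrete, here is a toy model, not a measurement. It
assumes a linear history of n new commits where each commit is requested in
its own fetch with no "haves", so the server re-sends everything reachable
from it. The function name and the one-unit-per-commit cost are
hypothetical and exist only for this illustration.

```shell
# Toy model (assumption): commit i has i commits in its history, and a
# fetch that supplies no "haves" re-sends all i of them.
total_objects_sent () {
	n=$1
	total=0
	i=1
	while test "$i" -le "$n"
	do
		# Fetch number i transfers i commits' worth of data.
		total=$((total + i))
		i=$((i + 1))
	done
	echo "$total"
}
```

Under this model, "total_objects_sent 100" reports 5050 units, where a
single negotiated fetch of the same 100 commits would transfer about 100.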

UPDATES in V2:

 * Added "|| return 1" inside the for loops.

 * Added an in-test comment about the test ordering.

 * Required protocol.version=2 in the tags test due to the bisect Junio
   performed.

 * Updated the commit message per Jonathan Tan's suggestion.
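
For readers wondering why the "|| return 1" matters: the exit status of a
for loop is that of the last command it ran, so a failure in an earlier
iteration is lost to the surrounding &&-chain. The sketch below shows this
with plain shell functions standing in for a test body. The function names
are made up for the example, and it relies on the test body running in a
context where "return" is valid.

```shell
# Without the guard: the first iteration fails, the second succeeds,
# so the loop as a whole reports success and the chain continues.
loop_without_guard () {
	for i in fail ok
	do
		test "$i" = ok
	done &&
	echo "reached the command after the loop"
}

# With the guard: the first failing iteration leaves the function
# with a non-zero status, and nothing after the loop runs.
loop_with_guard () {
	for i in fail ok
	do
		test "$i" = ok || return 1
	done &&
	echo "reached the command after the loop"
}
```

Calling loop_without_guard prints the message and exits 0 despite the
failed first iteration; calling loop_with_guard prints nothing and exits
non-zero.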
   
   

You can ignore the stack traces I sent earlier, as those seem to be from
states I cannot get into without being destructive to my .git directory.

Thanks, -Stolee

Derrick Stolee (2):
  partial-clone: demonstrate bugs in partial fetch
  partial-clone: avoid fetching when looking for objects

 builtin/fetch.c          | 10 +++++-----
 t/t5616-partial-clone.sh | 31 +++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 5 deletions(-)


base-commit: d0654dc308b0ba76dd8ed7bbb33c8d8f7aacd783
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-556%2Fderrickstolee%2Fpartial-clone-fix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-556/derrickstolee/partial-clone-fix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/556

Range-diff vs v1:

 1:  dbc1bdcae16 ! 1:  04965a8c7a4 partial-clone: demonstrate bugs in partial fetch
     @@ -14,7 +14,8 @@
          In the first test, we find that when fetching with blob filters from
          a repository that previously did not have any tags, the 'git fetch
          --tags origin' command fails because the server sends "multiple
     -    filter-specs cannot be combined".
     +    filter-specs cannot be combined". This only happens when using
     +    protocol v2.
      
          In the second test, we see that a 'git fetch origin' request with
          several ref updates results in multiple pack-file downloads. This must
     @@ -41,15 +42,20 @@
       	grep "want $(cat hash)" trace
       '
       
     ++# The following two tests must be in this order, or else
     ++# the first will not fail. It is important that the srv.bare
     ++# repository did not have tags during clone, but has tags
     ++# in the fetch.
     ++
      +test_expect_failure 'verify fetch succeeds when asking for new tags' '
      +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
      +	for i in I J K
      +	do
      +		test_commit -C src $i &&
     -+		git -C src branch $i
     ++		git -C src branch $i || return 1
      +	done &&
      +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
     -+	git -C tag-test fetch --tags origin
     ++	git -C tag-test -c protocol.version=2 fetch --tags origin
      +'
      +
      +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
     @@ -59,7 +65,7 @@
      +	for i in A B C
      +	do
      +		test_commit -C src $i &&
     -+		git -C src branch $i
     ++		git -C src branch $i || return 1
      +	done &&
      +	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
      +	git -C pack-test fetch origin &&
 2:  937a882261d ! 2:  7c4c9f0f8e1 partial-clone: avoid fetching when looking for objects
     @@ -2,10 +2,13 @@
      
          partial-clone: avoid fetching when looking for objects
      
     -    When using partial-clone, do_oid_object_info_extended() can trigger a
     -    fetch for missing objects. This can be extremely expensive when asking
     -    for a tag or commit, as we are completely removed from the context of
     -    the missing object and thus supply no "haves" in the request.
     +    When using partial clone, find_non_local_tags() in builtin/fetch.c
     +    checks each remote tag to see if its object also exists locally. There
     +    is no expectation that the object exist locally, but this function
     +    nevertheless triggers a lazy fetch if the object does not exist. This
     +    can be extremely expensive when asking for a commit, as we are
     +    completely removed from the context of the non-existent object and
     +    thus supply no "haves" in the request.
      
          6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
          global variable that prevented these fetches in favor of a bitflag.
     @@ -68,7 +71,7 @@
       --- a/t/t5616-partial-clone.sh
       +++ b/t/t5616-partial-clone.sh
      @@
     - 	git -C tag-test fetch --tags origin
     + 	git -C tag-test -c protocol.version=2 fetch --tags origin
       '
       
      -test_expect_failure 'verify fetch downloads only one pack when updating refs' '

-- 
gitgitgadget

Thread overview: 15+ messages
2020-02-19 16:21 [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee via GitGitGadget
2020-02-19 16:21 ` [PATCH 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
2020-02-19 18:38   ` Eric Sunshine
2020-02-19 20:42     ` Derrick Stolee
2020-02-19 20:52   ` Junio C Hamano
2020-02-19 20:59     ` Eric Sunshine
2020-02-19 21:17       ` Junio C Hamano
2020-02-19 21:20         ` Derrick Stolee
2020-02-19 16:21 ` [PATCH 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
2020-02-19 18:10   ` Jonathan Tan
2020-02-19 21:10 ` [PATCH 0/2] Document two partial clone bugs, fix one Derrick Stolee
2020-02-21 21:47 ` Derrick Stolee via GitGitGadget [this message]
2020-02-21 21:47   ` [PATCH v2 1/2] partial-clone: demonstrate bugs in partial fetch Derrick Stolee via GitGitGadget
2020-02-21 21:47   ` [PATCH v2 2/2] partial-clone: avoid fetching when looking for objects Derrick Stolee via GitGitGadget
2020-02-22 17:25   ` [PATCH v2 0/2] Document two partial clone bugs, fix one Junio C Hamano
