git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Konstantin Tokarev <annulen@yandex.ru>
Cc: Derrick Stolee <stolee@gmail.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Inefficiency of partial shallow clone vs shallow clone + "old-style" sparse checkout
Date: Wed, 1 Apr 2020 07:44:46 -0400
Message-ID: <20200401114446.GA1589184@coredump.intra.peff.net>
In-Reply-To: <8268671585700012@iva3-58091f505f14.qloud-c.yandex.net>

On Wed, Apr 01, 2020 at 04:49:20AM +0300, Konstantin Tokarev wrote:

> > Less efficient use of network bandwidth is one thing, but shallow clones are
> > also more CPU-intensive with the "counting objects" phase on the server. Your
> > link shares the following end-to-end timings:
> >
> > * Shallow-clone: 234s
> > * Partial clone: 286s
> > * Both(???): 1023s
> >
> > The data implies that by asking for both you actually got a full clone (4.1 GB).
> 
> No, this is still a partial clone; a full clone takes more than 6 GB.

I think that 4GB number is just because of the bug, though. With the fix
I showed earlier, doing clones of linux.git from a local repo yields:

  type       objects (in passes)      bytes  time
  ----       -----------------------  -----  ----
  shallow      71447 (  71447+  n/a)  188MB   23s
  blob:none  5260567 (5193557+67010)  870MB   99s
  both         71447 (   4437+67010)  188MB   37s

The object counts and sizes make sense. blob:none is still going to get
the whole history of commits and trees, which are substantial. The sizes
for "shallow" and "both" are the same, because the checkout is going to
grab all of the blobs from the tip commit, which were included in the
original "shallow" anyway. It does take longer, because they come in a
second, follow-up fetch (though I'm surprised it's so _much_ slower).
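
As a quick sanity check on the table, the per-pass counts can be added
up in plain shell arithmetic. This is only a sketch: the numbers are
copied from the table, and the variable names are made up for
illustration. Treating the second pass as the follow-up fetch triggered
by checkout is an assumption based on the explanation above.

```shell
# Object counts per fetch pass, copied from the table above.
# Variable names are illustrative only.
shallow_total=71447      # single pass for a plain shallow clone
blobnone_first=5193557   # first pass of the blob:none clone
both_first=4437          # first pass of the shallow + blob:none clone
followup_pass=67010      # second pass, the same in both filtered rows

blobnone_total=$((blobnone_first + followup_pass))
both_total=$((both_first + followup_pass))

echo "blob:none total: $blobnone_total"
echo "both total:      $both_total"

# "both" ends up with the same object count as "shallow", just split
# across two fetches instead of one.
if [ "$both_total" -eq "$shallow_total" ]; then
  echo "both matches shallow"
fi
```

Both sums agree with the totals in the "objects" column.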

So to me that implies that shallow is strictly better than partial if
you're just going to check out the full tip commit. But doing both
together opens up the possibility of narrowing the sparse checkout.
Doing:

  $ git clone --no-local --no-checkout --filter=blob:none --depth=1 \
      /path/to/linux sparse
  $ cd sparse
  $ git sparse-checkout set arch

fetches 20795 objects (4437+16357+1), consuming only 27MB.
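
To put that in proportion to the full-tip case, here is a rough sketch
in shell integer arithmetic. The pass counts and sizes are the ones
quoted in this message. The variable names are hypothetical, and the
percentages are truncated toward zero by integer division.

```shell
# Passes reported for the sparse "arch" checkout (from the text above).
pass1=4437
pass2=16357
pass3=1
sparse_total=$((pass1 + pass2 + pass3))

# Full-tip numbers from the shallow/both rows of the earlier table.
full_tip_objects=71447
full_tip_mb=188
sparse_mb=27

# Integer percentages (truncated) of the full-tip cost.
pct_objects=$((sparse_total * 100 / full_tip_objects))
pct_bytes=$((sparse_mb * 100 / full_tip_mb))

echo "sparse objects: $sparse_total (${pct_objects}% of full tip)"
echo "sparse size:    ${sparse_mb}MB (${pct_bytes}% of full tip)"
```

So the narrowed checkout transfers roughly 29% of the objects and 14% of
the bytes of the full tip commit.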

-Peff
