git.vger.kernel.org archive mirror
From: "René Scheibe" <rene.scheibe@gmail.com>
To: git@vger.kernel.org
Subject: How to speedup git clone for big binary files (disable delta compression)
Date: Thu, 19 Jul 2018 00:05:00 +0200
Message-ID: <43b401ec-31fc-59dc-17c0-8dd7359726da@gmail.com>

Hi,

I was wondering why "git clone" does not seem to respect "-delta" in .gitattributes.


*Reproduction*

I prepared a test repository with:

- git v2.17.1
- .gitattributes containing "*.bin binary -delta"
- 10 commits with a 10 MB random binary file
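
Before timing anything, it may be worth confirming that git resolves the
attribute for the file at all. A minimal sketch, assuming a throwaway
repository under a temporary directory (the directory name "attr-check" is
only an illustration and not part of the setup above):

```shell
#!/bin/bash
set -euo pipefail

# Scratch repository so the check does not touch any real data.
workdir=$(mktemp -d)
git init --quiet "$workdir/attr-check"
cd "$workdir/attr-check"

echo '*.bin binary -delta' > .gitattributes

# check-attr resolves attributes by path; the file itself need not exist.
delta_attr=$(git check-attr delta -- data.bin)
echo "$delta_attr"    # prints: data.bin: delta: unset
```

"unset" is how check-attr reports an attribute that was negated with a
leading "-", so this only shows that the pattern matches, not which
commands honor it.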

Code:
---------------------------------------------------------------------
#!/bin/bash

# setup repository
git init --quiet repo
cd repo

echo '*.bin binary -delta' > .gitattributes
git add .gitattributes
git commit --quiet -m 'attributes'

for i in $(seq 10); do
    dd if=/dev/urandom of=data.bin bs=1MB count=10 status=none
    git add data.bin
    git commit --quiet -m "data $i"
done
cd ..

# create clone repository
time git clone --no-local repo clone

# repack original repository
cd repo
time git repack -a -d
---------------------------------------------------------------------

Output:
---------------------------------------------------------------------
Cloning into 'clone'...
remote: Counting objects: 33, done.
remote: Compressing objects: 100% (31/31), done.
remote: Total 33 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (33/33), 95.40 MiB | 19.94 MiB/s, done.

real    0m25,085s
user    0m22,749s
sys     0m0,948s

Counting objects: 33, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (21/21), done.
Writing objects: 100% (33/33), done.
Total 33 (delta 0), reused 0 (delta 0)

real    0m5,652s
user    0m4,173s
sys     0m0,178s
---------------------------------------------------------------------


*Observations*

_time_

- Cloning: "clone" always takes 25s
- Optimizing: "repack" takes 25s with delta compression and 5s without it

_compressed objects_

- Cloning: "clone" always compresses 31 objects
- Optimizing: "repack" compresses 31 objects with delta compression and 21 objects without it
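
The repack side of these observations can also be checked from the resulting
pack instead of from timings. The sketch below is an illustration under
stated assumptions, not the test repository above: it uses three small,
deliberately delta-friendly files (so that deltas would be attractive if they
were attempted), a hypothetical scratch directory "delta-check", and a
placeholder commit identity. It then counts blobs that "git verify-pack -v"
lists as deltas, which are the lines carrying the two extra columns for depth
and base object.

```shell
#!/bin/bash
set -euo pipefail

workdir=$(mktemp -d)
git init --quiet "$workdir/delta-check"
cd "$workdir/delta-check"
# Placeholder identity so commits work in a clean environment.
git config user.name 'Example'
git config user.email 'example@example.invalid'

echo '*.bin binary -delta' > .gitattributes
git add .gitattributes
git commit --quiet -m 'attributes'

for i in 1 2 3; do
    # Assumption: similar content on purpose, unlike the random data above.
    { seq 1 20000; echo "revision $i"; } > data.bin
    git add data.bin
    git commit --quiet -m "data $i"
done

# -f recomputes deltas instead of reusing whatever is already packed.
git repack -a -d -f -q

idx=$(ls .git/objects/pack/pack-*.idx)
# Delta entries have 7 columns (..., depth, base), plain entries have 5.
blob_deltas=$(git verify-pack -v "$idx" |
    awk '$2 == "blob" && NF == 7 { n++ } END { print n + 0 }')
echo "blobs stored as deltas: $blob_deltas"
```

Removing the "-delta" attribute and repacking again with -f should make the
difference visible for content like this.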


*Expectations*

Both operations ("repack" and "clone") use "pack-objects".

Therefore my expectation is that "clone" should respect "-delta" and be about as fast as "repack".
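
One way to test this expectation would be to time "pack-objects" on its own,
outside of both "clone" and "repack". The following is a rough sketch under
assumptions: a scratch repository named "pack-check", a placeholder commit
identity, and much smaller files than in the reproduction so that it runs
quickly. It does not reproduce the exact options that "clone" passes to
"pack-objects" on the serving side.

```shell
#!/bin/bash
set -euo pipefail

workdir=$(mktemp -d)
git init --quiet "$workdir/pack-check"
cd "$workdir/pack-check"
# Placeholder identity so commits work in a clean environment.
git config user.name 'Example'
git config user.email 'example@example.invalid'

echo '*.bin binary -delta' > .gitattributes
git add .gitattributes
git commit --quiet -m 'attributes'

for i in 1 2 3; do
    # 64 KiB of random data per commit (assumption: scaled down from 10 MB).
    head -c 65536 /dev/urandom > data.bin
    git add data.bin
    git commit --quiet -m "data $i"
done

# Pack everything reachable from all refs and discard the result;
# --all implies --revs, so no revisions are needed on stdin.
time git pack-objects --all --stdout -q </dev/null >/dev/null

# Same invocation again, this time only measuring the pack size.
pack_bytes=$(git pack-objects --all --stdout -q </dev/null | wc -c | tr -d ' ')
echo "pack size: $pack_bytes bytes"
```

Comparing this timing with and without the attribute, on the real 10 MB
files, would show whether the pack generation step itself is where the time
goes.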


Cheers,
  René
