From: Nicolas Pitre <nico@cam.org>
To: "Robin H. Johnson" <robbat2@gentoo.org>
Cc: git@vger.kernel.org
Subject: Re: Weird growth in packfile during initial push
Date: Wed, 15 Apr 2009 15:51:40 -0400 (EDT)
Message-ID: <alpine.LFD.2.00.0904151443030.6741@xanadu.home>
In-Reply-To: <20090415182754.GF23644@curie-int>
On Wed, 15 Apr 2009, Robin H. Johnson wrote:
> I was doing a more recent conversion of the Gentoo repo, and ran into
> some odd behavior in the packfile size.
>
> For anybody else following the repo, you can now get it on the new hardware at:
> http://git-exp.overlays.gentoo.org/gitweb/?p=exp/gentoo-x86.git;a=summary
>
> I did the conversion with cvs2svn, packed, added the remote and pushed, only to
> find that the pack on the remote side suddenly seemed to be ~60MiB larger.
Hmmm.
> $ ls -la /tmp/convert/gentoo-x86-cvs2git/.git/objects/pack
> total 903804
> drwxr-xr-x 2 robbat2 users 119 Apr 14 08:05 .
> drwxr-xr-x 4 robbat2 users 28 Apr 14 08:05 ..
> -r--r--r-- 1 robbat2 users 139155472 Apr 14 08:05 pack-f805bb448f864becfeac9c7f8a8ac2ef90c26787.idx
> -r--r--r-- 1 robbat2 users 786336481 Apr 14 08:05 pack-f805bb448f864becfeac9c7f8a8ac2ef90c26787.pack
>
> $ git remote add origin git+ssh://git@git-exp.overlays.gentoo.org/exp/gentoo-x86.git
> $ git push origin master:master
> Initialized empty Git repository in /var/gitroot/exp/gentoo-x86.git/
> Counting objects: 4969800, done.
> Delta compression using up to 8 threads.
> Compressing objects: 100% (1217809/1217809), done.
> Writing objects: 100% (4969800/4969800), 810.56 MiB | 21608 KiB/s, done.
> Total 4969800 (delta 3735812), reused 4969800 (delta 3735812)
Here we know for sure that all objects were directly reused, so no
attempt at recompressing them was done. The only thing that
pack-objects might do in this case in addition to directly streaming the
existing pack is to convert delta object headers from OFS_DELTA to
REF_DELTA.
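
For readers following along, here is a minimal sketch in C of why the two header styles differ in size. It follows the OFS_DELTA base encoding as described in Git's pack-format documentation; the function name `encode_ofs_delta_base` and the two macros are made up for illustration and are not Git's own identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REF_DELTA_BASE_LEN 20 /* REF_DELTA: raw 20-byte SHA-1 of the base */
#define OFS_DELTA_MAX_LEN  10 /* enough for a 64-bit distance, 7 bits/byte */

/*
 * OFS_DELTA: encode the distance back to the base object.
 * 7 bits per byte, most significant group first, high bit set on every
 * byte except the last, and 1 subtracted from each higher group so that
 * no distance has two encodings.
 * Writes the bytes to out[] and returns how many were used.
 */
static size_t encode_ofs_delta_base(uint64_t distance, unsigned char *out)
{
    unsigned char buf[OFS_DELTA_MAX_LEN];
    size_t pos = sizeof(buf) - 1;
    size_t len, i;

    assert(out != NULL);

    /* Fill the buffer from the end: last byte has no continuation bit. */
    buf[pos] = (unsigned char)(distance & 127);
    while ((distance >>= 7) != 0)
        buf[--pos] = (unsigned char)(128 | (--distance & 127));

    len = sizeof(buf) - pos;
    for (i = 0; i < len; i++)
        out[i] = buf[pos + i];
    return len;
}
```

Under this encoding a distance up to 127 takes 1 byte, up to 16511 takes 2 bytes, and up to 2113663 takes 3 bytes, against a constant `REF_DELTA_BASE_LEN` of 20 for the SHA1 form.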
> $ ls -la /var/gitroot/exp/gentoo-x86.git/objects/pack
> total 966876
> drwxr-xr-x 2 git git 4096 Apr 14 08:43 .
> drwxr-xr-x 4 git git 4096 Apr 14 08:35 ..
> -r--r--r-- 1 git git 139155472 Apr 14 08:43 pack-f805bb448f864becfeac9c7f8a8ac2ef90c26787.idx
> -r--r--r-- 1 git git 849936308 Apr 14 08:43 pack-f805bb448f864becfeac9c7f8a8ac2ef90c26787.pack
Let's see if my theory holds:

	849936308 - 786336481 = 63599827
	63599827 / 3735812 = 17.02

Hence an average growth of about 17 bytes per delta. A REF_DELTA
object carries a 20-byte SHA1 base reference, which the OFS_DELTA case
replaces with a variable-length encoding of a pack offset, so that
offset encoding averages 20 - 17.02 = 2.98 bytes, which feels about
right.
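
The arithmetic above can be replayed with a small C sketch. The three constants are copied from the quoted `ls` listings and push output; the function names are hypothetical helpers written for this illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Figures taken from the quoted listings and the push summary. */
static const uint64_t LOCAL_PACK_BYTES  = 786336481; /* pack before the push */
static const uint64_t REMOTE_PACK_BYTES = 849936308; /* pack on the remote   */
static const uint64_t DELTA_COUNT       = 3735812;   /* "delta 3735812"      */

#define SHA1_RAW_LEN 20 /* size of a REF_DELTA base reference */

/* Average number of extra bytes per delta object after the push. */
static double avg_growth_per_delta(uint64_t before, uint64_t after,
                                   uint64_t deltas)
{
    assert(after >= before && deltas > 0);
    return (double)(after - before) / (double)deltas;
}

/* If each delta swapped an offset for a 20-byte SHA1, the offset
 * encoding must have averaged 20 minus the observed growth. */
static double implied_avg_offset_len(double growth_per_delta)
{
    return SHA1_RAW_LEN - growth_per_delta;
}
```

With the numbers from this thread, `avg_growth_per_delta` comes out at roughly 17.02 and `implied_avg_offset_len` at just under 3 bytes, which is in the range one would expect for offsets inside a pack of this size.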
[...]
And the code matches this theory as well. Can you try this patch if you
have a chance?
diff --git a/builtin-send-pack.c b/builtin-send-pack.c
index 91c3651..e41adbf 100644
--- a/builtin-send-pack.c
+++ b/builtin-send-pack.c
@@ -44,12 +44,16 @@ static int pack_objects(int fd, struct ref *refs, struct extra_have_objects *ext
 		"--stdout",
 		NULL,
 		NULL,
+		NULL,
 	};
 	struct child_process po;
 	int i;
 
+	i = 4;
 	if (args->use_thin_pack)
-		argv[4] = "--thin";
+		argv[i++] = "--thin";
+	if (args->use_ofs_delta)
+		argv[i++] = "--delta-base-offset";
 	memset(&po, 0, sizeof(po));
 	po.argv = argv;
 	po.in = -1;
@@ -316,6 +320,8 @@ int send_pack(struct send_pack_args *args,
 		ask_for_status_report = 1;
 	if (server_supports("delete-refs"))
 		allow_deleting_refs = 1;
+	if (server_supports("ofs-delta"))
+		args->use_ofs_delta = 1;
 
 	if (!remote_refs) {
 		fprintf(stderr, "No refs in common and none specified; doing nothing.\n"
diff --git a/send-pack.h b/send-pack.h
index 83d76c7..1d7b1b3 100644
--- a/send-pack.h
+++ b/send-pack.h
@@ -6,6 +6,7 @@ struct send_pack_args {
 		send_mirror:1,
 		force_update:1,
 		use_thin_pack:1,
+		use_ofs_delta:1,
 		dry_run:1;
 };
 
Nicolas
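
As an aside on the patch, the switch from a fixed `argv[4]` to a running index can be shown in isolation. The sketch below is a standalone illustration, not the code from builtin-send-pack.c: the function `build_pack_objects_argv` and the macro are hypothetical, and the four leading arguments are assumed (only `"--stdout"` is visible in the hunk).

```c
#include <assert.h>
#include <stddef.h>

/* 4 fixed arguments + 2 optional flags + NULL terminator */
#define PACK_OBJECTS_ARGV_MAX 7

/*
 * Fill argv[] for a pack-objects child process and return the number of
 * arguments written, not counting the NULL terminator.
 * Optional flags are appended with a running index so that skipping one
 * never leaves a NULL hole in front of the next.
 */
static size_t build_pack_objects_argv(const char *argv[PACK_OBJECTS_ARGV_MAX],
                                      int use_thin_pack, int use_ofs_delta)
{
    size_t i = 0;

    assert(argv != NULL);

    /* Leading arguments: assumed for this sketch. */
    argv[i++] = "pack-objects";
    argv[i++] = "--all-progress";
    argv[i++] = "--revs";
    argv[i++] = "--stdout";

    if (use_thin_pack)
        argv[i++] = "--thin";
    if (use_ofs_delta)
        argv[i++] = "--delta-base-offset";

    argv[i] = NULL; /* argv must stay NULL-terminated */
    return i;
}
```

Had the second flag been written to a fixed slot such as `argv[5]`, a push without `--thin` would leave `argv[4]` as NULL and the child would never see `--delta-base-offset`; the running index avoids that.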