From: Nicolas Pitre <nico@cam.org>
To: Junio C Hamano <junkio@cox.net>
Cc: Linus Torvalds <torvalds@osdl.org>, git@vger.kernel.org
Subject: Re: [PATCH] pack-objects: reuse data from existing pack.
Date: Wed, 15 Feb 2006 22:41:24 -0500 (EST)
Message-ID: <Pine.LNX.4.64.0602152226130.5606@localhost.localdomain>
In-Reply-To: <7vbqx8m62q.fsf@assigned-by-dhcp.cox.net>
On Wed, 15 Feb 2006, Junio C Hamano wrote:
> When generating a new pack, notice if we have already the wanted
> object in existing packs. If the object has a deltified
> representation, and its base object is also what we are going to
> pack, then reuse the existing deltified representation
> unconditionally, bypassing all the expensive find_deltas() and
> try_deltas() routines.
>
> Also, when writing out such deltified representation and
> undeltified representation, if a matching data already exists in
> an existing pack, just write it out without uncompressing &
> recompressing.
Great!
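A minimal sketch of the reuse rule described in the quoted text, not the actual C code from the patch. The `PackedEntry` record and the `can_reuse_delta` helper are hypothetical names introduced only for illustration; the rule they encode is the one stated above: an existing delta is reused only when its base object is also going into the new pack.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class PackedEntry:
    """Hypothetical record of how one object is stored in an existing pack."""
    object_id: str
    delta_base: Optional[str]  # None means the object is stored undeltified


def can_reuse_delta(object_id: str,
                    existing: Dict[str, PackedEntry],
                    to_pack: Set[str]) -> bool:
    """Return True when the stored delta can be written out as-is.

    Assumption: reuse requires that the object is already deltified in an
    existing pack and that its base is among the objects being packed.
    """
    entry = existing.get(object_id)
    if entry is None or entry.delta_base is None:
        return False  # not packed yet, or stored whole: nothing to reuse
    return entry.delta_base in to_pack


# Toy data: "b" is a delta against "a"; "c" is a delta against "z",
# which is not part of the new pack; "d" is not in any existing pack.
existing = {
    "a": PackedEntry("a", None),
    "b": PackedEntry("b", "a"),
    "c": PackedEntry("c", "z"),
}
to_pack = {"a", "b", "c", "d"}
decisions = {oid: can_reuse_delta(oid, existing, to_pack)
             for oid in sorted(to_pack)}
```

In this toy data only "b" qualifies; the other objects would still go through the normal delta search.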
> Without this patch:
>
> $ git-rev-list --objects v1.0.0 >RL
> $ time git-pack-objects p <RL
>
> Generating pack...
> Done counting 12233 objects.
> Packing 12233 objects....................
> 60a88b3979df41e22d1edc3967095e897f720192
>
> real 0m32.751s
> user 0m27.090s
> sys 0m2.750s
>
> With this patch:
>
> $ git-rev-list --objects v1.0.0 >RL
> $ time ../git.junio/git-pack-objects q <RL
>
> Generating pack...
> Done counting 12233 objects.
> Packing 12233 objects.....................
> 60a88b3979df41e22d1edc3967095e897f720192
> Total 12233, written 12233, reused 12177
>
> real 0m4.007s
> user 0m3.360s
> sys 0m0.090s
>
> Signed-off-by: Junio C Hamano <junkio@cox.net>
>
> ---
>
> * This may depend on one cleanup patch I have not sent out, but
> I am so excited that I could not help sending this out first.
>
> Admittedly this is hot off the press, I have not had enough
> time to beat this too hard, but the resulting pack from the
> above passed unpack-objects, index-pack and verify-pack.
In fact, the resulting pack should be identical with or without this
patch, shouldn't it?
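For reference, the figures quoted above work out to roughly an 8x reduction in wall-clock time, with about 99.5% of the objects reused. The short calculation below uses only the numbers from the quoted runs; the variable names are illustrative.

```python
# Wall-clock ("real") times from the two quoted runs, in seconds.
real_without_patch = 32.751
real_with_patch = 4.007

# Object counts from the "Total 12233, written 12233, reused 12177" line.
total_objects = 12233
reused_objects = 12177

speedup = real_without_patch / real_with_patch
reused_fraction = reused_objects / total_objects

print(f"speedup: {speedup:.2f}x")
print(f"reused:  {reused_fraction:.2%}")
```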
FYI: I have a list of patches to produce even smaller (yet still
compatible) packs, or less dense ones with much reduced CPU usage,
all depending on a new --speed argument to git-pack-objects. I've been
able to produce 15-20% smaller packs with the same depth and window
size, at the cost of twice as much CPU time. Combined with your
patch, one could repack the object store with the maximum compression
even if it is expensive CPU-wise, and any pull would benefit from it
afterwards at no additional cost.
I only need to find some time to finally clean and re-test those
patches...
Nicolas