From: Nicolas Pitre <nico@fluxnic.net>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	"Shawn O. Pearce" <spearce@spearce.org>
Subject: Re: Does pack v4 do anything to commits?
Date: Sat, 18 Feb 2012 10:34:43 -0500 (EST)
Message-ID: <alpine.LFD.2.02.1202181013350.24536@xanadu.home>
In-Reply-To: <CACsJy8CZPG3b3LBF-EFO_kOv6i157jy5414+bHcqiwOKyC+8VA@mail.gmail.com>

On Sat, 18 Feb 2012, Nguyen Thai Ngoc Duy wrote:

> Hi Nico,
> 
> I had an experiment on speeding up "rev-list --all". If I cache sha-1
> of tree and parent, and committer date of single-parent commits, in
> binary form, rev-list can be sped up significantly. On linux-2.6.git,
> it goes from 14s to 4s (2s to 0.8s for git.git). Profiling shows that
> commit parsing (get_sha1_hex, parse_commit_date) dominates rev-list
> time.
> 
> From what I remember, pack v4 is mainly about changing tree
> representation so that we can traverse object DAG as fast as possible.
> Do you do anything to commit representation too? Maybe it's worth
> storing the above info along with the compressed commit objects in
> pack to shave some more seconds.

Both the tree and commit object representations are completely changed 
to eliminate SHA1 parsing and searching entirely.  The SHA1 references 
are uncompressed, and replaced by indices into the pack for direct 
lookup without any binary search.  And the commit dates are stored in 
binary form. All path names as well as author/committer names are 
factored out into a table as well. This should make history traversal 
operations almost as fast as walking a linked list in memory, while 
making the actual pack size smaller at the same time.
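
To illustrate the idea (this is only a minimal sketch under assumptions, 
not the actual Pack v4 on-disk format): suppose each commit is reduced to 
a fixed-size record holding indices instead of hex SHA1 strings, plus a 
binary date.  The struct layout, the NO_PARENT marker, and the 
walk_first_parent() helper below are all hypothetical names made up for 
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for "this commit has no parent" (a root commit). */
#define NO_PARENT UINT32_MAX

/*
 * Hypothetical in-pack commit record: references are plain indices into
 * an object table, and the date is already binary, so nothing needs to
 * be parsed from text.
 */
struct commit_rec {
	uint32_t tree_idx;     /* index of the tree object */
	uint32_t parent_idx;   /* index of the first parent, or NO_PARENT */
	uint64_t commit_time;  /* committer date, seconds since the epoch */
};

/*
 * Follow the first-parent chain starting at 'start'.  Each step is a
 * direct array lookup: no hex decoding, no binary search.  Returns the
 * number of commits visited and, if 'oldest' is non-NULL, stores the
 * date of the last commit reached.
 */
static size_t walk_first_parent(const struct commit_rec *recs, size_t nr,
				uint32_t start, uint64_t *oldest)
{
	size_t seen = 0;
	uint32_t i = start;

	assert(recs != NULL);

	/* 'seen < nr' guards against a corrupt table that contains a cycle. */
	while (i != NO_PARENT && i < nr && seen < nr) {
		if (oldest)
			*oldest = recs[i].commit_time;
		seen++;
		i = recs[i].parent_idx;
	}
	return seen;
}
```

With records like these, walking history costs one array access per 
commit, which is where the "linked list in memory" comparison comes 
from.  Merge commits and the name/path tables are left out of the sketch.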

> By the way, is latest packv4 code available somewhere to fetch?

Well, not yet.  Incidentally, I'm going to the Caribbean for a week, 
starting a week from now, with no kids and only my wife, who is going to 
be busy with scuba diving.  Like I did last year, I'm going to take some 
time to 
pursue my work on Pack v4 during that time.  And I intend to publish it 
when I come back, whatever state it is in, so someone else can complete 
the work eventually (I have too much to do to spend significant time on 
Git these days).


Nicolas
