git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Martin Fick" <mfick@codeaurora.org>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 0/3] Commit cache
Date: Tue,  3 Apr 2012 13:55:06 +0700	[thread overview]
Message-ID: <1333436109-16526-1-git-send-email-pclouds@gmail.com> (raw)
In-Reply-To: <53707c0a-3782-47a4-8a35-da7136ff4822@email.android.com>

On Tue, Apr 3, 2012 at 12:55 PM, Martin Fick <mfick@codeaurora.org> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
>>> we could even store the data in a separate file to
>>> retain indexv2 compatibility).
>>>
>>> So it's sort-of a cache, in that it's redundant with the actual data.
>>> But staleness and writing issues are a lot simpler, since it only
>>gets
>>> updated when we index the pack (and the pack index in general is a
>>> similar concept; we are "caching" the location of the object in the
>>> packfile, rather than doing a linear search to look it up each time).
>>
>>I think I have something like that, (generate a machine-friendly
>>commit cache per pack, staying in $GIT_DIR/objects/pack/ too). It's
>>separate cache staying in $GIT_DIR/objects/pack, just like pack-.idx
>>files. It does improve rev-list time, but I'd rather wait for packv4,
>>or at least be sure that packv4 will not come anytime soon, before
>>pushing the cache route.
>
> I would love to try those patches out if you have them?

There you go. Note that these patches are not of high quality. I did not even
run "make test". To create commit cache, simply run index-pack, e.g.

$ git repack -ad
$ git index-pack --stdin < .git/objects/pack/pack-XXX.pack

It will create two more files, pack-XXX.sha1 and pack-XXX.sidx. On
linux-2.6.git, "git rev-list --all --quiet HEAD" takes 1.9s with the
patches and 6.6s without. Disk usage:

total 531M
 56M pack-ab843186bdfb00956c1b1c0cdb4ed5e4aa3e549e.idx
460M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.pack
9.7M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.sha1
5.3M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.sidx

Nguyễn Thái Ngọc Duy (3):
  parse_commit_buffer: rename a confusing variable name
  Add commit cache to help speed up commit traversal
  Add parse_commit_for_rev() to take advantage of sha1-cache

 Makefile             |    3 +
 builtin/index-pack.c |  113 ++++++++++++++++++++++++++++++++++-
 builtin/reflog.c     |    2 +-
 cache.h              |    9 +++
 commit.c             |   46 +++++++++++---
 commit.h             |    1 +
 log-tree.c           |    2 +-
 pack-write.c         |   11 +++-
 pack.h               |    1 +
 revision.c           |   10 ++--
 sha1_cache.c         |  161 ++++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_cache.h         |    6 ++
 sha1_file.c          |   12 ++++-
 test-sha1-cache.c    |   19 ++++++
 upload-pack.c        |    2 +-
 walker.c             |    2 +-
 16 files changed, 377 insertions(+), 23 deletions(-)
 create mode 100644 sha1_cache.c
 create mode 100644 sha1_cache.h
 create mode 100644 test-sha1-cache.c

-- 
1.7.3.1.256.g2539c.dirty

  reply	other threads:[~2012-04-03  6:56 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-30  0:18 Git push performance problems with ~100K refs Martin Fick
2012-03-30  2:12 ` Junio C Hamano
2012-03-30  2:43   ` Martin Fick
2012-03-30  9:32     ` Jeff King
2012-03-30  9:40       ` Jeff King
2012-03-30 14:22         ` Martin Fick
2012-03-31 22:10         ` [PATCH 1/3] add mergesort() for linked lists René Scharfe
2012-04-05 19:17           ` Junio C Hamano
2012-04-08 20:32             ` René Scharfe
2012-04-09 18:26               ` Junio C Hamano
2012-04-11  6:19           ` Stephen Boyd
2012-04-11 16:44             ` Junio C Hamano
2012-03-31 22:10         ` [PATCH 2/3] commit: use mergesort() in commit_list_sort_by_date() René Scharfe
2012-03-31 22:11         ` [PATCH 3/3] revision: insert unsorted, then sort in prepare_revision_walk() René Scharfe
2012-03-31 22:36           ` Martin Fick
2012-03-31 23:45           ` Junio C Hamano
2012-04-02 16:24           ` Martin Fick
2012-04-02 16:39             ` Shawn Pearce
2012-04-02 16:49               ` Martin Fick
2012-04-02 16:51                 ` Shawn Pearce
2012-04-02 20:37                   ` Jeff King
2012-04-02 20:51                     ` Jeff King
2012-04-02 23:16                     ` Martin Fick
2012-04-03  3:49                     ` Nguyen Thai Ngoc Duy
2012-04-03  5:55                       ` Martin Fick
2012-04-03  6:55                         ` Nguyễn Thái Ngọc Duy [this message]
2012-04-03  6:55                         ` [PATCH 1/3] parse_commit_buffer: rename a confusing variable name Nguyễn Thái Ngọc Duy
2012-04-03  6:55                         ` [PATCH 2/3] Add commit cache to help speed up commit traversal Nguyễn Thái Ngọc Duy
2012-04-03  6:55                         ` [PATCH 3/3] Add parse_commit_for_rev() to take advantage of sha1-cache Nguyễn Thái Ngọc Duy
2012-04-05 13:02                       ` [PATCH 3/3] revision: insert unsorted, then sort in prepare_revision_walk() Nguyen Thai Ngoc Duy
2012-04-06 19:21                         ` Shawn Pearce
2012-04-07  4:20                           ` Nguyen Thai Ngoc Duy
2012-04-03  3:44                   ` Nguyen Thai Ngoc Duy
2012-04-02 20:14           ` Jeff King
2012-04-02 22:54             ` René Scharfe
2012-04-03  8:40               ` Jeff King
2012-04-03  9:19                 ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1333436109-16526-1-git-send-email-pclouds@gmail.com \
    --to=pclouds@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=mfick@codeaurora.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).