From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Martin Fick" <mfick@codeaurora.org>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 0/3] Commit cache
Date: Tue,  3 Apr 2012 13:55:06 +0700
Message-ID: <1333436109-16526-1-git-send-email-pclouds@gmail.com>
In-Reply-To: <53707c0a-3782-47a4-8a35-da7136ff4822@email.android.com>

On Tue, Apr 3, 2012 at 12:55 PM, Martin Fick <mfick@codeaurora.org> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
>>> we could even store the data in a separate file to
>>> retain indexv2 compatibility).
>>>
>>> So it's sort-of a cache, in that it's redundant with the actual data.
>>> But staleness and writing issues are a lot simpler, since it only gets
>>> updated when we index the pack (and the pack index in general is a
>>> similar concept; we are "caching" the location of the object in the
>>> packfile, rather than doing a linear search to look it up each time).
>>
>>I think I have something like that, (generate a machine-friendly
>>commit cache per pack, staying in $GIT_DIR/objects/pack/ too). It's
>>separate cache staying in $GIT_DIR/objects/pack, just like pack-.idx
>>files. It does improve rev-list time, but I'd rather wait for packv4,
>>or at least be sure that packv4 will not come anytime soon, before
>>pushing the cache route.
>
> I would love to try those patches out if you have them?

There you go. Note that these patches are not of high quality. I did not even
run "make test". To create the commit cache, simply run index-pack, e.g.

$ git repack -ad
$ git index-pack --stdin < .git/objects/pack/pack-XXX.pack

It will create two more files, pack-XXX.sha1 and pack-XXX.sidx. On
linux-2.6.git, "git rev-list --all --quiet HEAD" takes 1.9s with the
patches and 6.6s without. Disk usage:

total 531M
 56M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.idx
460M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.pack
9.7M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.sha1
5.3M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.sidx
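
To put the numbers above in proportion, here is a minimal sketch of the arithmetic. It only restates the figures quoted in this message. The function and variable names are made up for illustration and are not part of the patches. It works out to about 15M of cache files, roughly 3% on top of the pack and its .idx, for a rev-list run that is about 3.5 times faster.

```python
# Figures copied from the message above; sizes in M as listed, times in seconds.
SIZES_M = {"idx": 56.0, "pack": 460.0, "sha1": 9.7, "sidx": 5.3}
TIME_WITH_PATCHES = 1.9
TIME_WITHOUT_PATCHES = 6.6


def cache_overhead(sizes):
    """Return (cache size, cache size as a fraction of pack + idx)."""
    cache = sizes["sha1"] + sizes["sidx"]
    baseline = sizes["pack"] + sizes["idx"]
    return cache, cache / baseline


def speedup(before, after):
    """Return how many times faster the 'after' timing is."""
    return before / after


if __name__ == "__main__":
    cache, fraction = cache_overhead(SIZES_M)
    print(f"cache files: {cache:.1f}M ({fraction:.1%} on top of pack + idx)")
    ratio = speedup(TIME_WITHOUT_PATCHES, TIME_WITH_PATCHES)
    print(f"rev-list speedup: {ratio:.2f}x")
```

This is a single measurement on one repository, so treat the ratio as indicative only.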

Nguyễn Thái Ngọc Duy (3):
  parse_commit_buffer: rename a confusing variable name
  Add commit cache to help speed up commit traversal
  Add parse_commit_for_rev() to take advantage of sha1-cache

 Makefile             |    3 +
 builtin/index-pack.c |  113 ++++++++++++++++++++++++++++++++++-
 builtin/reflog.c     |    2 +-
 cache.h              |    9 +++
 commit.c             |   46 +++++++++++---
 commit.h             |    1 +
 log-tree.c           |    2 +-
 pack-write.c         |   11 +++-
 pack.h               |    1 +
 revision.c           |   10 ++--
 sha1_cache.c         |  161 ++++++++++++++++++++++++++++++++++++++++++++++++++
 sha1_cache.h         |    6 ++
 sha1_file.c          |   12 ++++-
 test-sha1-cache.c    |   19 ++++++
 upload-pack.c        |    2 +-
 walker.c             |    2 +-
 16 files changed, 377 insertions(+), 23 deletions(-)
 create mode 100644 sha1_cache.c
 create mode 100644 sha1_cache.h
 create mode 100644 test-sha1-cache.c

-- 
1.7.3.1.256.g2539c.dirty
