From: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
To: git@vger.kernel.org
Cc: "Martin Fick" <mfick@codeaurora.org>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [PATCH 0/3] Commit cache
Date: Tue, 3 Apr 2012 13:55:06 +0700
Message-ID: <1333436109-16526-1-git-send-email-pclouds@gmail.com>
In-Reply-To: <53707c0a-3782-47a4-8a35-da7136ff4822@email.android.com>
On Tue, Apr 3, 2012 at 12:55 PM, Martin Fick <mfick@codeaurora.org> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
>>> we could even store the data in a separate file to
>>> retain indexv2 compatibility).
>>>
>>> So it's sort-of a cache, in that it's redundant with the actual data.
>>> But staleness and writing issues are a lot simpler, since it only gets
>>> updated when we index the pack (and the pack index in general is a
>>> similar concept; we are "caching" the location of the object in the
>>> packfile, rather than doing a linear search to look it up each time).
>>
>>I think I have something like that, (generate a machine-friendly
>>commit cache per pack, staying in $GIT_DIR/objects/pack/ too). It's
>>separate cache staying in $GIT_DIR/objects/pack, just like pack-.idx
>>files. It does improve rev-list time, but I'd rather wait for packv4,
>>or at least be sure that packv4 will not come anytime soon, before
>>pushing the cache route.
>
> I would love to try those patches out if you have them?
There you go. Note that these patches are not of high quality; I did not even
run "make test". To create the commit cache, simply run index-pack, e.g.

    $ git repack -ad
    $ git index-pack --stdin < .git/objects/pack/pack-XXX.pack

It will create two more files, pack-XXX.sha1 and pack-XXX.sidx. On
linux-2.6.git, "git rev-list --all --quiet HEAD" takes 1.9s with the
patches and 6.6s without. Disk usage:
total 531M
56M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.idx
460M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.pack
9.7M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.sha1
5.3M pack-ab843185bdfb00956c1b1c0cdb4ed5e4aa3e549e.sidx
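
For a rough sense of the trade-off, the figures quoted above can be put side by side. The snippet below is only an illustrative calculation, not part of the patches. It copies the sizes and timings from this message as printed (the listing's "M" units are taken at face value), and the variable and function names are made up for the example.

```python
# Figures copied from the message above; sizes are in the "M" units
# printed by the directory listing, timings in seconds.
sizes_m = {"idx": 56.0, "pack": 460.0, "sha1": 9.7, "sidx": 5.3}
time_with_cache_s = 1.9
time_without_cache_s = 6.6


def cache_overhead(sizes):
    """Return (extra size of the two cache files, extra as a fraction of pack + idx)."""
    extra = sizes["sha1"] + sizes["sidx"]
    base = sizes["pack"] + sizes["idx"]
    return extra, extra / base


extra_m, overhead = cache_overhead(sizes_m)
speedup = time_without_cache_s / time_with_cache_s

print(f"cache files: {extra_m:.1f}M ({overhead:.1%} on top of pack + idx)")
print(f"rev-list speedup: {speedup:.1f}x")
```

On these numbers the two cache files add roughly 3% to the size of the pack plus its index, and the rev-list run is about 3.5 times faster.
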
Nguyễn Thái Ngọc Duy (3):
parse_commit_buffer: rename a confusing variable name
Add commit cache to help speed up commit traversal
Add parse_commit_for_rev() to take advantage of sha1-cache
Makefile | 3 +
builtin/index-pack.c | 113 ++++++++++++++++++++++++++++++++++-
builtin/reflog.c | 2 +-
cache.h | 9 +++
commit.c | 46 +++++++++++---
commit.h | 1 +
log-tree.c | 2 +-
pack-write.c | 11 +++-
pack.h | 1 +
revision.c | 10 ++--
sha1_cache.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++++++
sha1_cache.h | 6 ++
sha1_file.c | 12 ++++-
test-sha1-cache.c | 19 ++++++
upload-pack.c | 2 +-
walker.c | 2 +-
16 files changed, 377 insertions(+), 23 deletions(-)
create mode 100644 sha1_cache.c
create mode 100644 sha1_cache.h
create mode 100644 test-sha1-cache.c
--
1.7.3.1.256.g2539c.dirty