From: Thomas Gummerer <t.gummerer@gmail.com>
To: git@vger.kernel.org
Cc: trast@student.ethz.ch, gitster@pobox.com, pclouds@gmail.com,
mhagger@alum.mit.edu
Subject: [GSoC] Designing a faster index format - Progress report
Date: Wed, 23 May 2012 14:21:35 +0200
Message-ID: <20120523122135.GA58204@tgummerer.unibz.it>
As Thomas Rast suggested yesterday on IRC, I'll give you a quick
overview of the work that has already been done in my GSoC project.
== Work done in the past 5 weeks ==
- Definition of a tentative index file v5 format [1]. It differs
from the proposal in that the directory entries and file entries
can be bisected, which allows a binary search. The exact bit
layout of each section is also defined. To shrink the index
further, in addition to prefix compression, the stat data is
hashed: it is only ever compared, and the plain values are never
used. Thanks to Michael Haggerty, Nguyen Thai Ngoc Duy, Thomas Rast
and Robin Rosenberg for feedback.
- Prototype of a converter from the index format v2/v3 to the index
format v5 [2]. The converter reads the index from a git repository
and can output parts of it (the header, the index entries as in
git ls-files --debug, the cache tree as in test-dump-cache-tree, or
the resolve-undo (REUC) data). It then writes the v5 index file to
.git/index-v5. Thanks to Michael Haggerty for the code review.
- Prototype of a reader for the new index file format [3]. Its main
purpose is to demonstrate the algorithm for reading the index
entries in lexicographic order of their full path names, which is
what the current in-memory format requires. Big thanks go to
Michael Haggerty for reviewing this code and giving me advice on
refactoring.
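To illustrate the two space-saving ideas from the first item, here is a
minimal Python sketch. It is not taken from the prototypes in [2] or [3].
The choice of CRC-32, the selection of stat fields, and the function names
stat_digest, prefix_compress and prefix_decompress are all assumptions made
for this example; the actual layout is the one described in [1].

```python
import os
import struct
import zlib


def stat_digest(st):
    """Collapse stat fields into one 32-bit value that is only ever compared."""
    # Assumption: which fields are covered and the use of CRC-32 are
    # illustrative choices, not the definition from the v5 format page.
    fields = (int(st.st_ctime), int(st.st_mtime), st.st_ino, st.st_size)
    packed = struct.pack(">4Q", *(f & 0xFFFFFFFFFFFFFFFF for f in fields))
    return zlib.crc32(packed) & 0xFFFFFFFF


def prefix_compress(sorted_names):
    """Encode each name as (length shared with the previous name, suffix)."""
    pairs = []
    prev = ""
    for name in sorted_names:
        common = 0
        limit = min(len(prev), len(name))
        while common < limit and prev[common] == name[common]:
            common += 1
        pairs.append((common, name[common:]))
        prev = name
    return pairs


def prefix_decompress(pairs):
    """Invert prefix_compress by rebuilding each name from the previous one."""
    names = []
    prev = ""
    for common, suffix in pairs:
        name = prev[:common] + suffix
        names.append(name)
        prev = name
    return names


if __name__ == "__main__":
    names = ["Makefile", "builtin/add.c", "builtin/apply.c"]
    print(prefix_compress(names))
    print(hex(stat_digest(os.stat("."))))
```

A file counts as unchanged when the digest of a fresh stat call equals the
stored digest, so the individual stat values never need to be stored.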
== Outlook for the next week ==
- Start working on actual git code
- Read the header of the new format
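For reference on the header step, this is a small sketch of parsing the
header of the existing v2/v3 format, which the converter already has to
read: a 4-byte "DIRC" signature, a 4-byte version number and a 32-bit
entry count, all in network byte order. The v5 header is defined in [1]
and is not reproduced here. The function name read_index_header is
hypothetical.

```python
import struct

# Existing v2/v3 header: signature, version, number of entries (big-endian).
HEADER = struct.Struct(">4sII")


def read_index_header(data):
    """Parse the 12-byte header at the start of a v2/v3 index file."""
    if len(data) < HEADER.size:
        raise ValueError("index file too short for a header")
    signature, version, nr_entries = HEADER.unpack_from(data)
    if signature != b"DIRC":
        raise ValueError("not a git index file")
    return version, nr_entries
```

It could be called on the first 12 bytes read from .git/index opened in
binary mode.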
[1] https://github.com/tgummerer/git/wiki/Index-file-format-v5
[2] https://github.com/tgummerer/git/blob/pythonprototype/git-convert-index.py
[3] https://github.com/tgummerer/git/blob/pythonprototype/git-read-index-v5.py