git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gummerer <t.gummerer@gmail.com>
To: git@vger.kernel.org
Cc: trast@student.ethz.ch, gitster@pobox.com, mhagger@alum.mit.edu,
	pclouds@gmail.com
Subject: [GSoC] Designing a faster index format - Progress report week 6
Date: Mon, 28 May 2012 23:44:01 +0200	[thread overview]
Message-ID: <20120528214401.GB6449@tgummerer> (raw)

Here is another quick overview of the current progress of my
Google Summer of Code project. I'm currently planning to do
those weekly, to keep you up to date with the progress.

== Work done in the previous 5 weeks ==

- Definition of a tentative index file v5 format [1]. This differs
  from the proposal in making it possible to bisect the directory
  entries and file entries, to do a binary search. The exact bits
  for each section were also defined. To further compress the index,
  along with prefix compression, the stat data is hashed, since
  it's only used for comparison, but the plain data is never used.
  Thanks to Michael Haggerty, Nguyen Thai Ngoc Duy, Thomas Rast
  and Robin Rosenberg for feedback.
- Prototype of a converter from the index format v2/v3 to the index
  format v5. [2] The converter reads the index from a git repository,
  can output parts of the index (header, index entries as in
  git ls-files --debug, cache tree as in test-dump-cache-tree, or
  the reuc data). Then it writes the v5 index file format to
  .git/index-v5. Thanks to Michael Haggerty for the code review.
- Prototype of a reader for the new index file format. [3] The
  reader has mainly the purpose to show the algorithm used to read
  the index lexicographically sorted after the full name which is
  required by the current internal memory format. Big thanks for
  reviewing this code and giving me advice on refactoring goes
  to Michael Haggerty.

== Work done in the last week ==

- Started working on the actual git code. Git is now able to read
  the index format v5, although the mapping of the new ondisk
  format to the internal format is not done yet. The latest code
  is not pushed to github yet, since it still needs some polishing,
  but if anyone is interested in the general direction it's going,
  the initial steps are on github. [4] Thanks for reviewing the
  first steps to Thomas Rast.

== Outlook for the next week ==

- Refactoring of the read_index_v5 code, and possibly mapping of
  the read index to the current internal format (If there's enough
  time)
- Make git ls-files read the new index format directly, to show
  some of its advantages. (Thanks to Nguyen Thai Ngoc Duy, and
  Junio C Hamano for the suggestion)

[1] https://github.com/tgummerer/git/wiki/Index-file-format-v5
[2] https://github.com/tgummerer/git/blob/pythonprototype/git-convert-index.py
[3] https://github.com/tgummerer/git/blob/pythonprototype/git-read-index-v5.py
[4] https://github.com/tgummerer/git/tree/index-v5

             reply	other threads:[~2012-05-28 21:44 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-28 21:44 Thomas Gummerer [this message]
2012-05-31 15:50 ` Review of current github code [Re: [GSoC] Designing a faster index format - Progress report week 6] Thomas Rast
2012-05-31 18:11   ` Junio C Hamano
2012-06-01 12:11     ` Thomas Rast
2012-06-01 14:49   ` Thomas Gummerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120528214401.GB6449@tgummerer \
    --to=t.gummerer@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=pclouds@gmail.com \
    --cc=trast@student.ethz.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).