git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Rast <trast@student.ethz.ch>
To: Junio C Hamano <gitster@pobox.com>
Cc: Thomas Gummerer <t.gummerer@gmail.com>, <git@vger.kernel.org>,
	<trast@student.ethz.ch>, <mhagger@alum.mit.edu>, <peff@peff.net>,
	<spearce@spearce.org>, <davidbarr@google.com>
Subject: Re: Index format v5
Date: Thu, 3 May 2012 21:11:38 +0200	[thread overview]
Message-ID: <8762cdm651.fsf@thomas.inf.ethz.ch> (raw)
In-Reply-To: <7vd36lf634.fsf@alter.siamese.dyndns.org> (Junio C. Hamano's message of "Thu, 03 May 2012 11:54:39 -0700")

Junio C Hamano <gitster@pobox.com> writes:

>> [6] The length of the file name was dropped, since each file name is
>>     nul terminated anyway.
>
> This is micronit, but I think we do this to save one strlen() for each
> read of the entry, except for unusually long paths where we fall back to
> strlen(). A change like this needs to be justified better than simply
> saying "because we _could_ compute in a different way by spending extra
> cycles".

(Partially also answering the whole "what are these offsets" confusion)

The bisectability has evolved to the point where we envision the
structure of a "flat list" (the directories, and the files in each
directory) to have the format

  offset to entry 1 [1]
  ....
  offset to entry n

  entry 1, consisting of:
    name, nul-terminated
    rest of data:
      for dirs: cache-tree sha1, offset to files, etc.
      for files: stat data, content sha1, flags, etc.

That makes bisection very easy: the offsets point at the start of each
string, so you just strcmp() and get on with it.

On the other hand, by the time you can look at the flags, it's too late
for the strlen() optimization anyway, so meh.  If you think it's
important, we can perhaps lay it out so the rest of the data goes
immediately after the pointer.  But so far it wasn't clear whether it's
fixed-size, or uses some smart compression scheme.  Tonight's edition
has a fixed length, so perhaps it would be preferable to keep the
length.


Footnotes: 
[1]  It probably doesn't matter whether this is relative to the position
of the offset, or absolute (in terms of file pointer).

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

  reply	other threads:[~2012-05-03 19:11 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-03 17:25 Index format v5 Thomas Gummerer
2012-05-03 18:16 ` Thomas Rast
2012-05-03 19:03   ` Junio C Hamano
2012-05-04  7:12   ` Michael Haggerty
2012-05-07 22:18     ` Robin Rosenberg
2012-05-03 18:21 ` Ronan Keryell
2012-05-03 20:36   ` Thomas Gummerer
2012-05-03 18:54 ` Junio C Hamano
2012-05-03 19:11   ` Thomas Rast [this message]
2012-05-03 19:31   ` Thomas Rast
2012-05-03 19:32     ` Thomas Rast
2012-05-03 20:32       ` Junio C Hamano
2012-05-03 21:38   ` Thomas Gummerer
2012-05-07 18:57     ` Robin Rosenberg
2012-05-03 19:38 ` solo-git
2012-05-04 13:20 ` Nguyen Thai Ngoc Duy
2012-05-04 15:44   ` Thomas Gummerer
2012-05-04 13:25 ` Philip Oakley
2012-05-04 15:46   ` Junio C Hamano
2012-05-06 10:23 ` Nguyen Thai Ngoc Duy
2012-05-07 13:44   ` Thomas Gummerer
2012-05-06 16:49 ` Phil Hord
2012-05-07 13:08   ` Thomas Gummerer
2012-05-07 15:15 ` Michael Haggerty
2012-05-08 14:11   ` Thomas Gummerer
2012-05-08 14:25     ` Nguyen Thai Ngoc Duy
2012-05-08 14:34       ` Nguyen Thai Ngoc Duy
2012-05-10  6:53         ` Thomas Gummerer
2012-05-10 11:06           ` Nguyen Thai Ngoc Duy
2012-05-09  8:37     ` Michael Haggerty
2012-05-10 12:19       ` Thomas Gummerer
2012-05-10 18:17         ` Michael Haggerty
2012-05-11 17:12           ` Thomas Gummerer
2012-05-13 19:50             ` Michael Haggerty
2012-05-14 15:01               ` Thomas Gummerer
2012-05-14 21:08                 ` Michael Haggerty
2012-05-14 22:10                   ` Thomas Rast
2012-05-15  6:43                     ` Michael Haggerty
2012-05-15 13:49                   ` Thomas Gummerer
2012-05-15 15:02                     ` Michael Haggerty
2012-05-18 15:38                       ` Thomas Gummerer
2012-05-19 13:00                         ` Michael Haggerty
2012-05-21  7:45                           ` Thomas Gummerer
2012-05-16  5:01                     ` Michael Haggerty
2012-05-16 21:54                       ` Thomas Gummerer
2012-05-19  5:40                         ` Michael Haggerty
2012-05-21 20:30                           ` Thomas Gummerer
2012-05-13 21:01 ` Philip Oakley
2012-05-14 14:54   ` Thomas Gummerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8762cdm651.fsf@thomas.inf.ethz.ch \
    --to=trast@student.ethz.ch \
    --cc=davidbarr@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=peff@peff.net \
    --cc=spearce@spearce.org \
    --cc=t.gummerer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).