git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Olivier Galibert <galibert@pobox.com>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Junio C Hamano <gitster@pobox.com>,
	Alberto Bertogli <albertito@gmail.com>,
	git@vger.kernel.org, Johan Herland <johan@herland.net>
Subject: Re: [REVISED PATCH 2/6] Introduce commit notes
Date: Thu, 19 Jul 2007 10:50:37 -0700 (PDT)	[thread overview]
Message-ID: <alpine.LFD.0.999.0707191043430.27353@woody.linux-foundation.org> (raw)
In-Reply-To: <20070719103436.GA9143@dspnet.fr.eu.org>



On Thu, 19 Jul 2007, Olivier Galibert wrote:

> On Wed, Jul 18, 2007 at 08:28:27PM -0700, Linus Torvalds wrote:
> > And yes, the "search for zero bytes" is not *guaranteed* to find any 
> > beginning at all, if you have lots of short names, *and* lots of zero 
> > bytes in the SHA1's. But while short names may be common, zero bytes in 
> > SHA1's are not so much (since you should expect to see a very even 
> > distribution of bytes, and as such most SHA1's by far should have no zero 
> > bytes at all!)
> 
> The probability of a sha1 to have a zero is approximatively 0.075.
> That's 1 in 13, more or less.

Sure. And since we handle it fine even when it does happen, we don't care.

In fact, since we only need 20 non-zero bytes in between zeroes to know 
that it's ok, and since the ASCII part is already 7 bytes of "mode + 
space" plus <n> bytes of actual name (let's say that names average to be 
about 8 characters - which is low: in the kernel it seems to be about 10.5 
characters), we can say that the ASCII part of a tree tends to be about 15 
characters.

So in order to be unlucky, it's not enough for the previous SHA1 to have a 
NUL character in it, it actually has to be in the last five bytes of the 
SHA1 - so now we're talking something like a 1:50 chance.

And with longer names, it matters even less (to the point where it 
matters not at all if all filenames are >= 14 characters in length).

So we can be unlucky, but it's fairly rare, and when it happens, at worst 
we'll just need to scan to the next entry (and if we're *really* unlucky 
and it keeps happening until we scan until the end, we'll have to do the 
linear search).

The point being that you always get the right answer, and the likelihood 
that you have to do something slow to get that rigth answer is really 
really low.

		Linus

  reply	other threads:[~2007-07-19 17:51 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-15 23:19 [PATCH 0/6] Introduce commit notes Johannes Schindelin
2007-07-15 23:22 ` [PATCH 1/6] Rename git_one_line() to git_line_length() and export it Johannes Schindelin
2007-07-15 23:23 ` [PATCH 2/6] Introduce commit notes Johannes Schindelin
2007-07-15 23:36   ` Junio C Hamano
2007-07-15 23:52     ` Johannes Schindelin
2007-07-16  0:05     ` Junio C Hamano
2007-07-16  5:11   ` Junio C Hamano
2007-07-19  2:30     ` [REVISED PATCH " Johannes Schindelin
2007-07-19  3:28       ` Linus Torvalds
2007-07-19  5:13         ` Junio C Hamano
2007-07-19  9:34           ` Junio C Hamano
2007-07-19  9:57             ` Adam Hayek
2007-07-19 10:58             ` Andy Parkins
2007-07-19 11:10               ` Johannes Schindelin
2007-07-19 14:33                 ` Andy Parkins
2007-07-19 17:42             ` Linus Torvalds
2007-07-20  0:20               ` Junio C Hamano
2007-07-20  4:59             ` Shawn O. Pearce
2007-07-19 17:20           ` Linus Torvalds
2007-07-19  9:50         ` Johannes Schindelin
2007-07-19 10:34         ` Olivier Galibert
2007-07-19 17:50           ` Linus Torvalds [this message]
2007-07-19  9:05       ` Wincent Colaiuta
2007-07-19  9:24         ` Johannes Schindelin
2007-07-19  9:54       ` Sven Verdoolaege
2007-07-15 23:23 ` [PATCH 3/6] Add git-notes Johannes Schindelin
2007-07-16  5:11   ` Junio C Hamano
2007-07-19  2:31     ` [REVISED PATCH " Johannes Schindelin
2007-07-19  2:54       ` Johannes Schindelin
2007-07-15 23:24 ` [PATCH 4/6] Add a test script for "git notes" Johannes Schindelin
2007-07-16  5:11   ` Junio C Hamano
2007-07-19  2:32     ` [REVISED PATCH " Johannes Schindelin
2007-07-15 23:24 ` [PATCH 5/6] Document git-notes Johannes Schindelin
2007-07-15 23:26 ` [WIP PATCH 6/6] notes: add notes-index for a substantial speedup Johannes Schindelin
2007-07-15 23:33   ` Johannes Schindelin
2007-07-16  6:01   ` Shawn O. Pearce
2007-07-16 16:29     ` Johannes Schindelin
2007-07-16  7:57 ` [PATCH 0/6] Introduce commit notes Andy Parkins
2007-07-16  8:11   ` Junio C Hamano
2007-07-16 16:26     ` Johannes Schindelin
2007-07-16 17:56       ` Junio C Hamano
2007-07-19  1:34         ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.0.999.0707191043430.27353@woody.linux-foundation.org \
    --to=torvalds@linux-foundation.org \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=albertito@gmail.com \
    --cc=galibert@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).