git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Hash collision count
@ 2005-04-23 20:27 Jeff Garzik
  2005-04-23 20:33 ` Jeff Garzik
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jeff Garzik @ 2005-04-23 20:27 UTC (permalink / raw)
  To: Git Mailing List, Linus Torvalds


Ideally a hash + collision-count pair would make the best key, rather 
than just hash alone.

A collision -will- occur eventually, and it is trivial to avoid this 
problem:

	$n = 0
	attempt to store as $hash-$n
	if $hash-$n exists (unlikely)
		$n++
		goto restart
	key = $hash-$n

Tangent-as-the-reason-I-bring-this-up:

One of my long-term projects is an archive service, somewhat like 
Plan9's venti:  a multi-server key-value database, with sha1 hash as the 
key.

However, as the database grows into the terabyte (possibly petabyte) 
range, the likelihood of a collision transitions rapidly from unlikely 
-> possible -> likely.

Since it is -so- simple to guarantee that you avoid collisions, I'm 
hoping git will do so before the key structure is too ingrained.

	Jeff




^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2005-04-25 23:55 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-04-23 20:27 Hash collision count Jeff Garzik
2005-04-23 20:33 ` Jeff Garzik
2005-04-23 23:00 ` Ray Heasman
2005-04-23 23:20   ` Jeff Garzik
2005-04-23 23:46     ` Petr Baudis
2005-04-24  0:35       ` Jeff Garzik
2005-04-24  0:40         ` Petr Baudis
2005-04-24  0:43           ` Jeff Garzik
2005-04-24 21:24             ` Imre Simon
2005-04-24 22:25               ` Whales falling on houses - was: " Jon Seymour
2005-04-25 23:50       ` Tom Lord
2005-04-26  0:00         ` Petr Baudis
2005-04-24  1:01     ` Ray Heasman
2005-04-24  7:56 ` David Lang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).