* A couple of questions
@ 2005-04-18 11:51 Imre Simon
2005-04-18 15:31 ` Linus Torvalds
0 siblings, 1 reply; 3+ messages in thread
From: Imre Simon @ 2005-04-18 11:51 UTC
To: git
How will git handle a corrupted (git) file system?
For instance, what can be done if objects/xy/z{38} does not pass the
simple consistency test, i.e. if the file's sha1 hash is not xyz{38}?
This might be a serious problem because, in general, one cannot
reconstruct the contents of file objects/xy/z{38} from its name
xyz{38}.
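The simple consistency test described above can be sketched as follows. This is only an illustration that takes the question's wording literally: it assumes the object name is the SHA-1 of the file's bytes exactly as stored on disk. The helper names `object_path` and `passes_simple_test` are hypothetical and are not part of git.

```python
import hashlib
from pathlib import Path


def object_path(objects_dir: Path, hex_name: str) -> Path:
    """Map a 40-character hex name xyz{38} to objects/xy/z{38}."""
    return objects_dir / hex_name[:2] / hex_name[2:]


def passes_simple_test(objects_dir: Path, hex_name: str) -> bool:
    """Return True if the SHA-1 of the stored bytes equals the object's name.

    Assumption: the name is the hash of the file as stored on disk.
    """
    data = object_path(objects_dir, hex_name).read_bytes()
    return hashlib.sha1(data).hexdigest() == hex_name
```

A failed test only says that the file is damaged; since SHA-1 is one-way, the name alone gives no way to rebuild the contents, which is the concern raised here.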
Another problem might come up if the file does pass the simple
consistency test but the file's contents are not a valid git file,
i.e. something that
(*) successfully inflates to a stream of bytes that forms a sequence of
<ascii tag without space> + <space> + <ascii decimal size> +
<byte\0> + <binary object data>.
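Condition (*) can be checked mechanically. The following is a minimal sketch, not git's own code: the function name `parse_object` is hypothetical, and it additionally assumes that the decimal size must equal the length of the binary object data.

```python
import zlib
from typing import Tuple


def parse_object(raw: bytes) -> Tuple[str, bytes]:
    """Inflate raw and split it into (tag, data).

    Raises ValueError if the bytes do not satisfy condition (*).
    """
    try:
        stream = zlib.decompress(raw)
    except zlib.error as exc:
        raise ValueError(f"does not inflate: {exc}") from exc

    # Header and data are separated by the first NUL byte.
    header, nul, data = stream.partition(b"\0")
    if not nul:
        raise ValueError("no NUL byte after the header")

    # Splitting on the first space guarantees the tag contains no space.
    tag, space, size = header.partition(b" ")
    if not space or not tag or not tag.isascii():
        raise ValueError("bad or missing ASCII tag")
    if not size.isdigit():
        raise ValueError("size is not ASCII decimal")
    # Assumption: the size field records the length of the object data.
    if int(size) != len(data):
        raise ValueError("size does not match the data length")

    return tag.decode("ascii"), data
```

For example, `parse_object(zlib.compress(b"blob 5\0hello"))` yields the tag `"blob"` and the data `b"hello"`, while a stream without the NUL separator is rejected.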
Are there enough internal redundancies in git to allow fixing at least
some corrupted file systems? Shouldn't there be some?
Another related observation is that git is not really based on a 160-bit
hashing scheme. Indeed, only files that satisfy the above condition
(*) are allowed and this most certainly reduces the valid range of the
hashing function. I do not think that this will be a problem, but it
doesn't hurt to point this out once.
Cheers,
Imre Simon
* Re: A couple of questions
From: Linus Torvalds @ 2005-04-18 15:31 UTC
To: Imre Simon; +Cc: git
On Mon, 18 Apr 2005, Imre Simon wrote:
>
> How will git handle a corrupted (git) file system?
>
> For instance, what can be done if objects/xy/z{38} does not pass the
> simple consistency test, i.e. if the file's sha1 hash is not xyz{38}?
> This might be a serious problem because, in general, one cannot
> reconstruct the contents of file objects/xy/z{38} from its name
> xyz{38}.
Nothing beats backups and distribution. The distributed nature of git
means that you can replicate your objects arbitrarily.
> Another problem might come up if the file does pass the simple
> consistency test but the file's contents are not a valid git file,
Run "fsck-cache". It not only tests SHA1 and general object sanity, but it
does full tracking of the resulting reachability and everything else. It
prints out any corruption it finds (missing or bad objects), and if you
use the "--unreachable" flag it will also print out objects that exist but
that aren't reachable from any of the HEAD nodes (which you need to
specify).
So for example

	fsck-cache --unreachable $(cat .git/HEAD)

will do quite a _lot_ of verification on the tree. There are a few extra
validity tests I'm going to add (make sure that tree objects are sorted
properly etc), but on the whole if "fsck-cache" is happy, you do have a
valid tree.
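The reachability part of that check can be pictured with a toy model. This is a sketch under stated assumptions, not how fsck-cache is implemented: the object store is modelled as a plain dictionary from object name to the names it references, and `check_reachability` is a hypothetical helper.

```python
from typing import Dict, Iterable, List, Set, Tuple


def check_reachability(
    objects: Dict[str, List[str]], heads: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Walk the object graph from the given heads.

    Returns (missing, unreachable): names that are referenced but absent
    from the store, and names that are present but never visited.
    """
    seen: Set[str] = set()
    missing: Set[str] = set()
    stack = list(heads)
    while stack:
        name = stack.pop()
        if name in seen or name in missing:
            continue
        if name not in objects:
            missing.add(name)  # referenced, but not in the store
            continue
        seen.add(name)
        stack.extend(objects[name])
    unreachable = set(objects) - seen
    return missing, unreachable
```

In this model, the "missing" set corresponds to the corruption that is always reported, and the "unreachable" set to what the "--unreachable" flag adds.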
Any corrupt objects you will have to find in backups or other archives (ie
you can just remove them and do an "rsync" with some other site in the
hopes that somebody else has the object you have corrupted).
Of course, "valid tree" doesn't mean that it wasn't generated by some evil
person, and the end result might be crap. Git is a revision tracking
system, not a quality assurance system ;)
Linus
* Re: A couple of questions
From: Paul Jackson @ 2005-04-18 16:23 UTC
To: Linus Torvalds; +Cc: is, git
Linus wrote:
> Nothing beats backups and distribution.
Famous quote from the past:
"Only wimps use tape backup: real men just upload their important stuff on ftp,
and let the rest of the world mirror it ;)" Linus Torvalds
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401