* Verifying the whole repository
@ 2008-10-23 13:59 Alex Bennee
2008-10-23 14:05 ` David Symonds
2008-10-23 14:28 ` Shawn O. Pearce
0 siblings, 2 replies; 4+ messages in thread
From: Alex Bennee @ 2008-10-23 13:59 UTC
To: git
Hi,
While debugging a crash in parsecvs during the conversion of our CVS
repository, I discovered that one of the CVS files had become
corrupted (truncated). I've had this problem before with RCS-based
files, which are prone to silent corruption that you won't notice
until you try to check out an old revision of the file.
As git is fundamentally hash based it's a lot easier to determine the
health of the repository but I wonder if it's possible for silent
corruption to creep in which won't be noticed until you try and
checkout a historical commit of the tree. I notice there is a
git-verify-pack command that checks the pack files are OK. Do any of
the other commands implicitly ensure all objects in the repo are
correct and valid? git-gc?
Are there any other parts of the .git metadata that are crucial or is
it enough to say if all objects and packs match their hashes you have
all the information you may need to recover an arbitrary revision of
the repo?
--
Alex, homepage: http://www.bennee.com/~alex/
* Re: Verifying the whole repository
From: David Symonds @ 2008-10-23 14:05 UTC
To: Alex Bennee; +Cc: git
On Thu, Oct 23, 2008 at 6:59 AM, Alex Bennee <kernel-hacker@bennee.com> wrote:
> As git is fundamentally hash based it's a lot easier to determine the
> health of the repository but I wonder if it's possible for silent
> corruption to creep in which won't be noticed until you try and
> checkout a historical commit of the tree. I notice there is a
> git-verify-pack command that checks the pack files are OK. Do any of
> the other commands implicitly ensure all objects in the repo are
> correct and valid? git-gc?
Try: git fsck --full --strict
Dave.
* Re: Verifying the whole repository
From: Shawn O. Pearce @ 2008-10-23 14:28 UTC
To: Alex Bennee; +Cc: git
Alex Bennee <kernel-hacker@bennee.com> wrote:
> As git is fundamentally hash based it's a lot easier to determine the
> health of the repository but I wonder if it's possible for silent
> corruption to creep in which won't be noticed until you try and
> checkout a historical commit of the tree. I notice there is a
> git-verify-pack command that checks the pack files are OK. Do any of
> the other commands implicitly ensure all objects in the repo are
> correct and valid? git-gc?
As David pointed out, git fsck can be used to verify all of the
hashes, but git-gc also does a quick sanity check using a CRC code
when it copies data from one pack to another pack.
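
To make "verify all of the hashes" concrete: an object's name is the
SHA-1 of a short header plus its content, so a check amounts to
recomputing that digest and comparing. A minimal Python sketch, assuming
a SHA-1 repository; git_object_id and is_intact are illustrative names,
not git APIs, and this is not a description of how git fsck is
implemented internally:

```python
import hashlib


def git_object_id(obj_type: str, data: bytes) -> str:
    """Return the SHA-1 object ID for `data` stored as `obj_type`.

    The digest covers the header "<type> <size>\\0" followed by the
    raw content, which is why any change to the bytes changes the name.
    """
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()


def is_intact(expected_id: str, obj_type: str, data: bytes) -> bool:
    """True if `data` still hashes to the ID it was stored under."""
    return git_object_id(obj_type, data) == expected_id
```

For example, git_object_id("blob", b"hello\n") gives
ce013625030ba8dba906f756967f9e9ca394464a, the same ID that
git hash-object reports for a file containing that one line. A
truncated copy of the content would no longer match.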
Unlike CVS, Git has a write-once, read-many mentality, so with
the exception of git gc (err, actually the git repack it calls)
git never modifies an existing file. That really helps to reduce
the risk of corruption.
If you never do a gc or fsck operation (but still use, say, commit
or push into the repository) then yes, silent corruption can still
sneak up on you in the form of disk block corruption.
> Are there any other parts of the .git metadata that are crucial or is
> it enough to say if all objects and packs match their hashes you have
> all the information you may need to recover an arbitrary revision of
> the repo?
Don't forget about the loose objects under .git/objects/?? but
otherwise yes, you just need the object data. The refs under
.git/refs are also useful, but if the refs space is lost the tips
can be recovered with "git fsck --unreachable".
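
As a rough illustration of checking the loose objects under
.git/objects/??, the Python sketch below decompresses each file and
compares its SHA-1 with the name it is stored under. It assumes a
SHA-1 repository and the usual zlib-compressed loose object layout;
find_corrupt_loose_objects is a hypothetical helper, it ignores packs
entirely, and it is no substitute for git fsck:

```python
import hashlib
import os
import re
import zlib


def find_corrupt_loose_objects(git_dir):
    """Return paths of loose objects that fail to decompress or whose
    content does not hash to the name they are stored under."""
    corrupt = []
    objects_dir = os.path.join(git_dir, "objects")
    for fan_out in sorted(os.listdir(objects_dir)):
        # Loose objects live in two-hex-digit directories; skip pack/, info/.
        if not re.fullmatch(r"[0-9a-f]{2}", fan_out):
            continue
        sub_dir = os.path.join(objects_dir, fan_out)
        if not os.path.isdir(sub_dir):
            continue
        for name in sorted(os.listdir(sub_dir)):
            # Ignore temporary files and anything else not named like an object.
            if not re.fullmatch(r"[0-9a-f]{38}", name):
                continue
            path = os.path.join(sub_dir, name)
            try:
                with open(path, "rb") as fh:
                    payload = zlib.decompress(fh.read())
            except (OSError, zlib.error):
                corrupt.append(path)  # unreadable or truncated stream
                continue
            # The decompressed bytes already include the "<type> <size>\0" header.
            if hashlib.sha1(payload).hexdigest() != fan_out + name:
                corrupt.append(path)
    return corrupt
```

Calling find_corrupt_loose_objects(".git") from the top of a working
tree returns an empty list when every loose object checks out.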
--
Shawn.
Thread overview: 4+ messages
2008-10-23 13:59 Verifying the whole repository Alex Bennee
2008-10-23 14:05 ` David Symonds
2008-10-23 14:14 ` Alex Bennee
2008-10-23 14:28 ` Shawn O. Pearce