* Re: corrupt object on git-gc
@ 2007-11-09 13:38 Yossi Leybovich
2007-11-09 13:46 ` Andreas Ericsson
2007-11-09 16:28 ` Linus Torvalds
0 siblings, 2 replies; 20+ messages in thread
From: Yossi Leybovich @ 2007-11-09 13:38 UTC (permalink / raw)
To: git, ae, Yossi Leybovich
Yossi Leybovich wrote:
>> Hi
>>
>> I know its loose but still I think there are references in the
>> repository to this object.
>> How I can remove it from the repository ?
>>
>That was not a very good idea. You just moved ALL objects whose hash
>begin with 4b out of the object database.
>Try only moving the offending file out of the 4b directory.
That did not help; the repository still looks for this object.
Does anyone know how I can track this object down and find out which file it belongs to?
ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../
ib]$ git-fsck --full
dangling commit 0d43a63623237385e432572bf61171713dcd8e98
dangling commit 4fc6b1127e4a7f4ff5b65a2dd8a90779b5aff3e0
dangling commit 7da607374fe2b1ae09228d2035dd608c73dad7c8
dangling commit 004ef09ae022c60a30f9cd61f90d18df5db3628e
dangling commit 85112c6fabb6b8913ab244a8645d67380616eba6
broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
to blob 4b9458b3786228369c63936db65827de3cc06200
missing blob 4b9458b3786228369c63936db65827de3cc06200
dangling commit bd98481afa93356fa6daa4b6f88c4e631ae2fd72
dangling commit e81e3d2c9c25e5bf5b31327b10b23f9bd0a6d056
dangling commit 92ff9b8cbc771345c9cde0c7fef2c23bb79242b9
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: corrupt object on git-gc
2007-11-09 13:38 corrupt object on git-gc Yossi Leybovich
@ 2007-11-09 13:46 ` Andreas Ericsson
2007-11-09 15:01 ` Yossi Leybovich
2007-11-09 16:28 ` Linus Torvalds
1 sibling, 1 reply; 20+ messages in thread
From: Andreas Ericsson @ 2007-11-09 13:46 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: git, Yossi Leybovich
Yossi Leybovich wrote:
> Yossi Leybovich wrote:
>>> Hi
>>>
>>> I know its loose but still I think there are references in the
>>> repository to this object.
>>> How I can remove it from the repository ?
>>>
>
>> That was not a very good idea. You just moved ALL objects whose hash
>> begin with 4b out of the object database.
>
>> Try only moving the offending file out of the 4b directory.
>
> Did not help still the repository look for this object?
> Any one know how can I track this object and understand which file is it
>
Is this a super-secret project, or can you make a tarball of the .git
directory and send it to me? Trying to track down the cause through
email is decidedly slow.
>
>
> ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../
>
> ib]$ git-fsck --full
> dangling commit 0d43a63623237385e432572bf61171713dcd8e98
> dangling commit 4fc6b1127e4a7f4ff5b65a2dd8a90779b5aff3e0
> dangling commit 7da607374fe2b1ae09228d2035dd608c73dad7c8
> dangling commit 004ef09ae022c60a30f9cd61f90d18df5db3628e
> dangling commit 85112c6fabb6b8913ab244a8645d67380616eba6
> broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
> to blob 4b9458b3786228369c63936db65827de3cc06200
One tree uses the object. I'm not sure if any commit-objects
use the tree. Try
for b in $(git branch --no-color -a | cut -b3-); do
for rev in $(git rev-list HEAD); do
git ls-tree -r $rev | grep -q 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
test $? -eq 0 && echo $rev && break
done
done
If it turns up empty, you *should* be able to safely delete
2d9263c6d23595e7cb2a21e5ebbb53655278dff8 and
4b9458b3786228369c63936db65827de3cc06200
Make sure to take a backup first though.
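A note on the loop above: as posted, it never uses `$b` (every pass lists `HEAD`), and `git ls-tree -r` does not print tree entries unless `-t` is added, so a tree ID cannot match. A minimal sketch of a variant that walks every ref and also lists subtrees could look like the following. The function name `find_commits_containing` is made up for illustration, and a commit's top-level tree is still not covered, because `ls-tree` lists only a tree's entries.

```shell
# Print every commit reachable from any ref whose tree (recursively)
# contains the given object ID. Expects a full 40-hex ID as argument.
find_commits_containing() {
    obj=$1
    git rev-list --all | while read -r rev; do
        # -r recurses into subtrees, -t also prints the subtrees themselves
        if git ls-tree -r -t "$rev" | grep -q "$obj"; then
            echo "$rev"
        fi
    done
}
```

Usage would be `find_commits_containing 2d9263c6d23595e7cb2a21e5ebbb53655278dff8`, run from inside the repository. Commits that are reachable only from reflogs are not visited by `git rev-list --all`.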
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
* Re: corrupt object on git-gc
2007-11-09 13:46 ` Andreas Ericsson
@ 2007-11-09 15:01 ` Yossi Leybovich
2007-11-09 15:34 ` Johannes Sixt
0 siblings, 1 reply; 20+ messages in thread
From: Yossi Leybovich @ 2007-11-09 15:01 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: git, Yossi Leybovich
On Nov 9, 2007 8:46 AM, Andreas Ericsson <ae@op5.se> wrote:
>
> Is this a super-secret project or you can make a tarball of the .git
> directory and send it to me? Trying to track down the cause through
> email is decidedly slow.
>
Actually, yes. I am not sure I can send the repository; I will
check that further.
>
> One tree uses the object. I'm not sure if any commit-objects
> use the tree. Try
>
> for b in $(git branch --no-color -a | cut -b3-); do
> for rev in $(git rev-list HEAD); do
> git ls-tree -r $rev | grep -q 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
> test $? -eq 0 && echo $rev && break
> done
> done
I tried this and it returned nothing:
[mellanox@mellanox-compile ib]$
[mellanox@mellanox-compile ib]$ for b in $(git branch --no-color -a | cut -b3-); do
> for rev in $(git rev-list HEAD); do
> git ls-tree -r $rev | grep -q 2d9263c6d23595e7cb2a21e5ebbb53655278dff8;
> test $? -eq 0 && echo $rev && break;
> done; done
[mellanox@mellanox-compile ib]$
[mellanox@mellanox-compile ib]$
[BTW, I noticed that the b variable is never used, so I also tried
git rev-list $b, but the output was still empty.]
I also tried to remove the blob and the tree, and apparently other trees and
commits reference these objects:
mv ../9458b3786228369c63936db65827de3cc06200 ../4b/
mv: cannot stat `../9458b3786228369c63936db65827de3cc06200': No such file or directory
[mellanox@mellanox-compile ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../4b/
[mellanox@mellanox-compile ib]$ mv .git/objects/2d/9263c6d23595e7cb2a21e5ebbb53655278dff8 ../2d/
[mellanox@mellanox-compile ib]$ git-fsck --full
broken link from tree e5a0044c4ccae7635f07414c1f155bac72d25fd9
to tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
dangling commit 0d43a63623237385e432572bf61171713dcd8e98
dangling commit 4fc6b1127e4a7f4ff5b65a2dd8a90779b5aff3e0
dangling commit 7da607374fe2b1ae09228d2035dd608c73dad7c8
dangling commit 004ef09ae022c60a30f9cd61f90d18df5db3628e
broken link from tree 8bd00402b2a20024f4556107b8a729b0205657db
to tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
dangling commit 85112c6fabb6b8913ab244a8645d67380616eba6
missing tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
dangling commit bd98481afa93356fa6daa4b6f88c4e631ae2fd72
dangling commit e81e3d2c9c25e5bf5b31327b10b23f9bd0a6d056
dangling commit 92ff9b8cbc771345c9cde0c7fef2c23bb79242b9
>
> If it turns up empty, you *should* be able to safely delete
> 2d9263c6d23595e7cb2a21e5ebbb53655278dff8 and
> 4b9458b3786228369c63936db65827de3cc06200
>
> Make sure to take a backup first though.
A lot of commits and trees point to these objects.
>
> --
> Andreas Ericsson andreas.ericsson@op5.se
> OP5 AB www.op5.se
> Tel: +46 8-230225 Fax: +46 8-230231
>
* Re: corrupt object on git-gc
2007-11-09 15:01 ` Yossi Leybovich
@ 2007-11-09 15:34 ` Johannes Sixt
2007-11-09 15:53 ` Yossi Leybovich
0 siblings, 1 reply; 20+ messages in thread
From: Johannes Sixt @ 2007-11-09 15:34 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: Andreas Ericsson, git, Yossi Leybovich
Yossi Leybovich schrieb:
> [about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
You can try to create a clone (after you have fixed up the artificial
breakages that you made). If that goes well, then the bad object is
referenced only from reflogs.
-- Hannes
* Re: corrupt object on git-gc
2007-11-09 15:34 ` Johannes Sixt
@ 2007-11-09 15:53 ` Yossi Leybovich
2007-11-09 16:03 ` Johannes Sixt
2007-11-09 16:03 ` Nicolas Pitre
0 siblings, 2 replies; 20+ messages in thread
From: Yossi Leybovich @ 2007-11-09 15:53 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Andreas Ericsson, git, Yossi Leybovich
On Nov 9, 2007 10:34 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Yossi Leybovich schrieb:
> > [about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
>
> You can try to create a clone (after you have fixed up the artificial
> breakages that you made). If that goes well, then the bad object is
> referenced only from reflogs.
>
git clone ib ib-clone
Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
0 blocks
[mellanox@mellanox-compile ib-clone]$ cd ib-clone/
[mellanox@mellanox-compile ib-clone]$ git branch -a
* mlx4
origin/HEAD
origin/master
origin/mlx4
origin/mlx4-work
origin/mthca
origin/second_port
[mellanox@mellanox-compile ib-clone]$ git-gc
Generating pack...
Done counting 3288 objects.
Deltifying 3288 objects...
error: corrupt loose object '4b9458b3786228369c63936db65827de3cc06200'
fatal: object 4b9458b3786228369c63936db65827de3cc06200 cannot be read
error: failed to run repack
So I still can't pack my repository.
> -- Hannes
>
>
* Re: corrupt object on git-gc
2007-11-09 15:53 ` Yossi Leybovich
@ 2007-11-09 16:03 ` Johannes Sixt
2007-11-09 16:03 ` Nicolas Pitre
1 sibling, 0 replies; 20+ messages in thread
From: Johannes Sixt @ 2007-11-09 16:03 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: Andreas Ericsson, git, Yossi Leybovich
Yossi Leybovich schrieb:
> On Nov 9, 2007 10:34 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
>> Yossi Leybovich schrieb:
>>> [about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
>> You can try to create a clone (after you have fixed up the artificial
>> breakages that you made). If that goes well, then the bad object is
>> referenced only from reflogs.
>>
>
>
> git clone ib ib-clone
> Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
> 0 blocks
Make this:
git clone file:///home/mellanox/work/symm/ib ib-clone
otherwise you get a hard-linked identical copy, but you want to use the git
protocol to create the clone.
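A minimal sketch of that, assuming a POSIX shell; the helper name `clone_without_hardlinks` is illustrative only. The example above uses an absolute path in the `file://` URL, so the helper resolves one first.

```shell
# Clone a local repository through the normal pack transport instead of
# hard-linking its object files. Arguments: source path, destination path.
clone_without_hardlinks() {
    # Turn the (possibly relative) source into an absolute path
    src=$(cd "$1" && pwd) || return 1
    git clone "file://$src" "$2"
}
```

Because every object has to be read and packed on the way, a corrupt object that is part of the cloned history should make the clone fail rather than be copied silently.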
-- Hannes
* Re: corrupt object on git-gc
2007-11-09 15:53 ` Yossi Leybovich
2007-11-09 16:03 ` Johannes Sixt
@ 2007-11-09 16:03 ` Nicolas Pitre
2007-11-09 16:31 ` Yossi Leybovich
1 sibling, 1 reply; 20+ messages in thread
From: Nicolas Pitre @ 2007-11-09 16:03 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: Johannes Sixt, Andreas Ericsson, git, Yossi Leybovich
On Fri, 9 Nov 2007, Yossi Leybovich wrote:
> On Nov 9, 2007 10:34 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> > Yossi Leybovich schrieb:
> > > [about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
> >
> > You can try to create a clone (after you have fixed up the artificial
> > breakages that you made). If that goes well, then the bad object is
> > referenced only from reflogs.
> >
>
>
> git clone ib ib-clone
> Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
> 0 blocks
Please try "file://ib" instead. Otherwise the clone will only hardlink
files to the original repository.
Nicolas
* Re: corrupt object on git-gc
2007-11-09 13:38 corrupt object on git-gc Yossi Leybovich
2007-11-09 13:46 ` Andreas Ericsson
@ 2007-11-09 16:28 ` Linus Torvalds
2007-11-09 17:28 ` [PATCH] add a howto document about corrupted blob recovery Nicolas Pitre
2007-11-09 17:53 ` corrupt object on git-gc Yossi Leybovich
1 sibling, 2 replies; 20+ messages in thread
From: Linus Torvalds @ 2007-11-09 16:28 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: git, ae, Yossi Leybovich
On Fri, 9 Nov 2007, Yossi Leybovich wrote:
>
> Did not help still the repository look for this object?
> Any one know how can I track this object and understand which file is it
So exactly *because* the SHA1 hash is cryptographically secure, the hash
itself doesn't actually tell you anything: in order to fix a corrupt
object you basically have to find the "original source" for it.
The easiest way to do that is almost always to have backups, and find the
same object somewhere else. Backups really are a good idea, and git makes
it pretty easy (if nothing else, just clone the repository somewhere else,
and make sure that you do *not* use a hard-linked clone, and preferably
not the same disk/machine).
But since you don't seem to have backups right now, the good news is that
especially with a single blob being corrupt, these things *are* somewhat
debuggable.
First off, move the corrupt object away, and *save* it. The most common
cause of corruption so far has been memory corruption, but even so, there
are people who would be interested in seeing the corruption - but it's
basically impossible to judge the corruption until we can also see the
original object, so right now the corrupt object is useless, but it's very
interesting for the future, in the hope that you can re-create a
non-corrupt version.
So:
> ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../
This is the right thing to do, although it's usually best to save it under
its full SHA1 name (you just dropped the "4b" from the result ;).
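For example, a small helper along these lines keeps the full name when moving the object aside. This is only a sketch: `quarantine_loose_object` is a hypothetical name, and it assumes the object is loose (not packed) and that the destination directory already exists.

```shell
# Move a loose object out of the object database, keeping its full
# 40-character name. Arguments: object ID, existing destination directory.
quarantine_loose_object() {
    sha=$1
    dest=$2
    dir=$(printf '%s' "$sha" | cut -c1-2)     # fan-out directory, e.g. "4b"
    file=$(printf '%s' "$sha" | cut -c3-)     # remaining 38 characters
    mv "$(git rev-parse --git-dir)/objects/$dir/$file" "$dest/$sha"
}
```

Called as `quarantine_loose_object 4b9458b3786228369c63936db65827de3cc06200 ..`, it would leave the saved copy in the parent directory under its full name.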
Let's see what that tells us:
> ib]$ git-fsck --full
> broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
> to blob 4b9458b3786228369c63936db65827de3cc06200
> missing blob 4b9458b3786228369c63936db65827de3cc06200
Ok, I removed the "dangling commit" messages, because they are just
messages about the fact that you probably have rebased etc, so they're not
at all interesting. But what remains is still very useful. In particular,
we now know which tree points to it!
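The same filtering can be done mechanically. The sketch below simply drops the "dangling" lines from the fsck report; `fsck_problems` is a made-up helper name, not a git command.

```shell
# Show only the fsck lines that point at real problems by dropping
# the "dangling ..." notices.
fsck_problems() {
    # fsck reports on both stdout and stderr; merge them before filtering
    git fsck --full 2>&1 | grep -v '^dangling '
}
```

In the case discussed here, the remaining lines would be the "broken link" pair and the "missing blob" line.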
Now you can do
git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
which will show something like
100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
100644 blob ee909f2cc49e54f0799a4739d24c4cb9151ae453 CREDITS
040000 tree 0f5f709c17ad89e72bdbbef6ea221c69807009f6 Documentation
100644 blob 1570d248ad9237e4fa6e4d079336b9da62d9ba32 Kbuild
100644 blob 1c7c229a092665b11cd46a25dbd40feeb31661d9 MAINTAINERS
...
and you should now have a line that looks like
100644 blob 4b9458b3786228369c63936db65827de3cc06200 my-magic-file
in the output. This already tells you a *lot*: it tells you what file the
corrupt blob came from!
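When the tree is large, picking out that line by eye is tedious. A possible shortcut, sketched under the assumption that the full blob ID is known; `paths_for_blob` is an illustrative name:

```shell
# Print the path(s) under which a blob appears in a tree-ish.
# Arguments: tree-ish (tree ID, commit, branch), full blob ID.
paths_for_blob() {
    # ls-tree -r prints "<mode> <type> <id><TAB><path>", one entry per line;
    # keep the matching entries and cut away everything before the TAB
    git ls-tree -r "$1" | grep "$2" | cut -f2-
}
```

`git ls-tree` reads only tree objects, so this should work even while the blob itself is missing. With `-r`, the printed path is relative to the tree that was given, not necessarily to the top of the repository.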
Now, it doesn't tell you quite enough, though: it doesn't tell what
*version* of the file didn't get correctly written! You might be really
lucky, and it may be the version that you already have checked out in your
working tree, in which case fixing this problem is really simple, just do
git hash-object -w my-magic-file
again, and if it outputs the missing SHA1 (4b945..) you're now all done!
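To avoid writing an unrelated object by accident, the hash can be computed first and written only on a match. A minimal sketch, with `restore_if_match` as a hypothetical helper name:

```shell
# Write the working-tree file into the object database only if its
# content hashes to the expected ID. Arguments: expected ID, file path.
restore_if_match() {
    actual=$(git hash-object "$2") || return 1    # no -w: compute only
    if [ "$actual" = "$1" ]; then
        git hash-object -w "$2" >/dev/null && echo "restored $1"
    else
        echo "mismatch: $2 hashes to $actual" >&2
        return 1
    fi
}
```

For the case in this thread, that would be `restore_if_match 4b9458b3786228369c63936db65827de3cc06200 my-magic-file`.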
But that's the really lucky case, so let's assume that it was some older
version that was broken. How do you tell which version it was?
The easiest way to do it is to do
git log --raw --all --full-history -- subdirectory/my-magic-file
and that will show you the whole log for that file (please realize that
the tree you had may not be the top-level tree, so you need to figure out
which subdirectory it was in on your own), and because you're asking for
raw output, you'll now get something like
commit abc
Author:
Date:
..
:100644 100644 4b9458b... newsha... M somedirectory/my-magic-file
commit xyz
Author:
Date:
..
:100644 100644 oldsha... 4b9458b... M somedirectory/my-magic-file
and this actually tells you what the *previous* and *subsequent* versions
of that file were! So now you can look at those ("oldsha" and "newsha"
respectively), and hopefully you have done commits often, and can
re-create the missing my-magic-file version by looking at those older and
newer versions!
If you can do that, you can now recreate the missing object with
git hash-object -w <recreated-file>
and your repository is good again!
(Btw, you could have ignored the fsck, and started with doing a
git log --raw --all
and just looked for the sha of the missing object (4b9458b..) in that
whole thing. It's up to you - git does *have* a lot of information, it is
just missing one particular blob version.)
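That search can be scripted. The sketch below asks for unabbreviated IDs and prints, for each raw diff line that mentions the blob, the commit it belongs to. `log_lines_for_blob` is a made-up name, and merge commits are not covered because `git log --raw` shows no diff for them by default.

```shell
# For every commit on any ref whose raw diff mentions the given blob ID
# (as old or new side), print "commit <id>" followed by the raw line.
log_lines_for_blob() {
    git log --raw --all --no-abbrev --format='commit %H' |
        awk -v id="$1" '
            /^commit /                { c = $2 }          # remember current commit
            /^:/ && index($0, id) > 0 { print "commit " c; print }
        '
}
```

A blob that shows up only on the new side, as a single hit, was introduced by that commit and never changed afterwards on any ref.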
Trying to recreate trees and especially commits is *much* harder. So you
were lucky that it's a blob. It's quite possible that you can recreate the
thing.
Linus
* Re: corrupt object on git-gc
2007-11-09 16:03 ` Nicolas Pitre
@ 2007-11-09 16:31 ` Yossi Leybovich
2007-11-09 16:52 ` Nicolas Pitre
0 siblings, 1 reply; 20+ messages in thread
From: Yossi Leybovich @ 2007-11-09 16:31 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Johannes Sixt, Andreas Ericsson, git, Yossi Leybovich
On Nov 9, 2007 11:03 AM, Nicolas Pitre <nico@cam.org> wrote:
> On Fri, 9 Nov 2007, Yossi Leybovich wrote:
>
> > On Nov 9, 2007 10:34 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> > > Yossi Leybovich schrieb:
> > > > [about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
> > >
> > > You can try to create a clone (after you have fixed up the artificial
> > > breakages that you made). If that goes well, then the bad object is
> > > referenced only from reflogs.
> > >
> >
> >
> > git clone ib ib-clone
> > Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
> > 0 blocks
>
> Please try "file://ib" instead. Otherwise the clone will only hardlink
> files to the original repository.
>
>
And the corruption pops up again, so the clone does not help:
[mellanox@mellanox-compile ib]$ git clone file://ib ib-clone
Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
remote: Generating pack...
remote: Counting objects: 276
Done counting 3288 objects.
remote: Deltifying 3288 objects...
remote: error: remote: corrupt loose object
'4b9458b3786228369c63936db65827de3cc06200'remote:
remote: fatal: remote: object 4b9458b3786228369c63936db65827de3cc06200
cannot be readremote:
error: git-upload-pack: git-pack-objects died with error.
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack died with error code 128
fetch-pack from 'file://ib' failed.
fatal: git-upload-pack: aborting due to possible repository corruption
on the remote side.
> Nicolas
>
* Re: corrupt object on git-gc
2007-11-09 16:31 ` Yossi Leybovich
@ 2007-11-09 16:52 ` Nicolas Pitre
0 siblings, 0 replies; 20+ messages in thread
From: Nicolas Pitre @ 2007-11-09 16:52 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: Johannes Sixt, Andreas Ericsson, git, Yossi Leybovich
On Fri, 9 Nov 2007, Yossi Leybovich wrote:
> On Nov 9, 2007 11:03 AM, Nicolas Pitre <nico@cam.org> wrote:
> > On Fri, 9 Nov 2007, Yossi Leybovich wrote:
> >
> > > On Nov 9, 2007 10:34 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> > > > Yossi Leybovich schrieb:
> > > > > [about corrupt loose object '4b9458b3786228369c63936db65827de3cc06200']
> > > >
> > > > You can try to create a clone (after you have fixed up the artificial
> > > > breakages that you made). If that goes well, then the bad object is
> > > > referenced only from reflogs.
> > > >
> > >
> > >
> > > git clone ib ib-clone
> > > Initialized empty Git repository in /home/mellanox/work/symm/ib-clone/.git/
> > > 0 blocks
> >
> > Please try "file://ib" instead. Otherwise the clone will only hardlink
> > files to the original repository.
> >
> >
>
> And agian the corruption pop up again , so clone does not help
OK that means that the object is really part of your active history.
Linus just posted a nice summary of your only option left. If you
manage to recreate the damaged object then it would be nice of you if
you could provide us with both the bad and the good one for analysis.
Nicolas
* [PATCH] add a howto document about corrupted blob recovery
2007-11-09 16:28 ` Linus Torvalds
@ 2007-11-09 17:28 ` Nicolas Pitre
2007-11-09 17:30 ` Johannes Schindelin
2007-11-26 2:12 ` J. Bruce Fields
2007-11-09 17:53 ` corrupt object on git-gc Yossi Leybovich
1 sibling, 2 replies; 20+ messages in thread
From: Nicolas Pitre @ 2007-11-09 17:28 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, git
Extracted from a post by Linus on the mailing list.
Signed-off-by: Nicolas Pitre <nico@cam.org>
---
On Fri, 9 Nov 2007, Linus Torvalds wrote:
> But since you don't seem to have backups right now, the good news is that
> especially with a single blob being corrupt, these things *are* somewhat
> debuggable.
I was in the process of writing a similar message, but Linus was quicker
and his version is actually much nicer. Certainly good howto material.
diff --git a/Documentation/howto/recover-corrupted-blob-object.txt b/Documentation/howto/recover-corrupted-blob-object.txt
new file mode 100644
index 0000000..9b6853c
--- /dev/null
+++ b/Documentation/howto/recover-corrupted-blob-object.txt
@@ -0,0 +1,134 @@
+Date: Fri, 9 Nov 2007 08:28:38 -0800 (PST)
+From: Linus Torvalds <torvalds@linux-foundation.org>
+Subject: corrupt object on git-gc
+Abstract: Some tricks to reconstruct blob objects in order to fix
+ a corrupted repository.
+
+On Fri, 9 Nov 2007, Yossi Leybovich wrote:
+>
+> Did not help still the repository look for this object?
+> Any one know how can I track this object and understand which file is it
+
+So exactly *because* the SHA1 hash is cryptographically secure, the hash
+itself doesn't actually tell you anything, in order to fix a corrupt
+object you basically have to find the "original source" for it.
+
+The easiest way to do that is almost always to have backups, and find the
+same object somewhere else. Backups really are a good idea, and git makes
+it pretty easy (if nothing else, just clone the repository somewhere else,
+and make sure that you do *not* use a hard-linked clone, and preferably
+not the same disk/machine).
+
+But since you don't seem to have backups right now, the good news is that
+especially with a single blob being corrupt, these things *are* somewhat
+debuggable.
+
+First off, move the corrupt object away, and *save* it. The most common
+cause of corruption so far has been memory corruption, but even so, there
+are people who would be interested in seeing the corruption - but it's
+basically impossible to judge the corruption until we can also see the
+original object, so right now the corrupt object is useless, but it's very
+interesting for the future, in the hope that you can re-create a
+non-corrupt version.
+
+So:
+
+> ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../
+
+This is the right thing to do, although it's usually best to save it under
+its full SHA1 name (you just dropped the "4b" from the result ;).
+
+Let's see what that tells us:
+
+> ib]$ git-fsck --full
+> broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+> to blob 4b9458b3786228369c63936db65827de3cc06200
+> missing blob 4b9458b3786228369c63936db65827de3cc06200
+
+Ok, I removed the "dangling commit" messages, because they are just
+messages about the fact that you probably have rebased etc, so they're not
+at all interesting. But what remains is still very useful. In particular,
+we now know which tree points to it!
+
+Now you can do
+
+ git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+
+which will show something like
+
+ 100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
+ 100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
+ 100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
+ 100644 blob ee909f2cc49e54f0799a4739d24c4cb9151ae453 CREDITS
+ 040000 tree 0f5f709c17ad89e72bdbbef6ea221c69807009f6 Documentation
+ 100644 blob 1570d248ad9237e4fa6e4d079336b9da62d9ba32 Kbuild
+ 100644 blob 1c7c229a092665b11cd46a25dbd40feeb31661d9 MAINTAINERS
+ ...
+
+and you should now have a line that looks like
+
+ 100644 blob 4b9458b3786228369c63936db65827de3cc06200 my-magic-file
+
+in the output. This already tells you a *lot*: it tells you what file the
+corrupt blob came from!
+
+Now, it doesn't tell you quite enough, though: it doesn't tell what
+*version* of the file didn't get correctly written! You might be really
+lucky, and it may be the version that you already have checked out in your
+working tree, in which case fixing this problem is really simple, just do
+
+ git hash-object -w my-magic-file
+
+again, and if it outputs the missing SHA1 (4b945..) you're now all done!
+
+But that's the really lucky case, so let's assume that it was some older
+version that was broken. How do you tell which version it was?
+
+The easiest way to do it is to do
+
+ git log --raw --all --full-history -- subdirectory/my-magic-file
+
+and that will show you the whole log for that file (please realize that
+the tree you had may not be the top-level tree, so you need to figure out
+which subdirectory it was in on your own), and because you're asking for
+raw output, you'll now get something like
+
+ commit abc
+ Author:
+ Date:
+ ..
+ :100644 100644 4b9458b... newsha... M somedirectory/my-magic-file
+
+
+ commit xyz
+ Author:
+ Date:
+
+ ..
+ :100644 100644 oldsha... 4b9458b... M somedirectory/my-magic-file
+
+and this actually tells you what the *previous* and *subsequent* versions
+of that file were! So now you can look at those ("oldsha" and "newsha"
+respectively), and hopefully you have done commits often, and can
+re-create the missing my-magic-file version by looking at those older and
+newer versions!
+
+If you can do that, you can now recreate the missing object with
+
+ git hash-object -w <recreated-file>
+
+and your repository is good again!
+
+(Btw, you could have ignored the fsck, and started with doing a
+
+ git log --raw --all
+
+and just looked for the sha of the missing object (4b9458b..) in that
+whole thing. It's up to you - git does *have* a lot of information, it is
+just missing one particular blob version.)
+
+Trying to recreate trees and especially commits is *much* harder. So you
+were lucky that it's a blob. It's quite possible that you can recreate the
+thing.
+
+ Linus
* Re: [PATCH] add a howto document about corrupted blob recovery
2007-11-09 17:28 ` [PATCH] add a howto document about corrupted blob recovery Nicolas Pitre
@ 2007-11-09 17:30 ` Johannes Schindelin
2007-11-26 2:12 ` J. Bruce Fields
1 sibling, 0 replies; 20+ messages in thread
From: Johannes Schindelin @ 2007-11-09 17:30 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, Linus Torvalds, git
Hi,
On Fri, 9 Nov 2007, Nicolas Pitre wrote:
> Extracted from a post by Linus on the mailing list.
Heh. I was hoping that somebody would be quicker than me...
Ciao,
Dscho
* Re: corrupt object on git-gc
2007-11-09 16:28 ` Linus Torvalds
2007-11-09 17:28 ` [PATCH] add a howto document about corrupted blob recovery Nicolas Pitre
@ 2007-11-09 17:53 ` Yossi Leybovich
2007-11-09 18:02 ` Linus Torvalds
1 sibling, 1 reply; 20+ messages in thread
From: Yossi Leybovich @ 2007-11-09 17:53 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git, ae, Yossi Leybovich
On Nov 9, 2007 11:28 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> and you should now have a line that looks like
>
> 100644 blob 4b9458b3786228369c63936db65827de3cc06200 my-magic-file
That works and now I know the file
>
> The easiest way to do it is to do
>
> git log --raw --all --full-history -- subdirectory/my-magic-file
>
> and that will show you the whole log for that file (please realize that
> the tree you had may not be the top-level tree, so you need to figure out
> which subdirectory it was in on your own), and because you're asking for
> raw output, you'll now get something like
>
> commit abc
> Author:
> Date:
> ..
> :100644 100644 4b9458b... newsha... M somedirectory/my-magic-file
>
>
> commit xyz
> Author:
> Date:
>
> ..
> :100644 100644 oldsha... 4b9458b... M somedirectory/my-magic-file
>
> and this actually tells you what the *previous* and *subsequent* versions
> of that file were! So now you can look at those ("oldsha" and "newsha"
> respectively), and hopefully you have done commits often, and can
> re-create the missing my-magic-file version by looking at those older and
> newer versions!
>
> If you can do that, you can now recreate the missing object with
OK, I tried that, and unfortunately the SHA1 appears only once:
[mellanox@mellanox-compile ib]$ git log --raw --all --full-history -- SymmK/St.c | grep 4b9
:100755 100755 308806c... 4b9458b3786228369c63936db65827de3cc06200 M SymmK/St.c
git log --raw --all --full-history -- SymmK/St.c
...
...
commit 597e70e7dc8e06a7cdbe4d9e9727411c964bd023
Author: sleybo <sleybo@mellanox.co.il>
Date: Fri Oct 5 10:41:43 2007 -0400
1. increase QPs parameters - QP is bigger than 4k
2. lock buffers use the dma key
3. add prints
:100755 100755 308806c... 4b9458b3786228369c63936db65827de3cc06200 M SymmK/St.c
What is interesting is that the SHA1 I am looking for appears only once
(only as the new SHA1).
So I checked out the version of the file that produces the old SHA1, 308806c...:
[mellanox@mellanox-compile ib-tmp]$ git checkout mlx4-start -- SymmK/St.c
[mellanox@mellanox-compile ib-tmp]$ git hash-object -w SymmK/St.c
308806cf3a864656a49d00edc35b9505abd627a2
Then I did
[mellanox@mellanox-compile ib-tmp]$ git diff-tree --stdin -p --pretty 597e70e7dc8e06a7cdbe4d9e9727411c964bd023 > commit-597e70e
(597e70e... is the commit SHA1)
[mellanox@mellanox-compile ib-tmp]$ git apply commit-597e70e
Adds trailing whitespace.
../ib/commit-597e70e:1622:
Adds trailing whitespace.
../ib/commit-597e70e:1646: (int)devif->lock_dma +
lockid*sizeof(u64),
warning: 2 lines add whitespace errors.
[mellanox@mellanox-compile ib-tmp]$ git hash-object -w SymmK/St.c
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
So applying the same commit actually leads to the wrong SHA1.
(I tried this flow on a different file and it works.)
I think I am close but not there yet. Any suggestions?
Thanks
Yossi
* Re: corrupt object on git-gc
2007-11-09 17:53 ` corrupt object on git-gc Yossi Leybovich
@ 2007-11-09 18:02 ` Linus Torvalds
2007-11-09 18:37 ` Yossi Leybovich
0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2007-11-09 18:02 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: git, ae, Yossi Leybovich
On Fri, 9 Nov 2007, Yossi Leybovich wrote:
>
> Ok, tried that and unfortuantly the SHA1 number is apear only one
>
> [mellanox@mellanox-compile ib]$ git log --raw --all --full-history --
> SymmK/St.c | grep 4b9
> :100755 100755 308806c... 4b9458b3786228369c63936db65827de3cc06200 M SymmK/St.c
Actually, that's not at all "unfortunately", because that implies that
it's the very *latest* version of that "SymmK/St.c" file. I really think
you already had it checked out, but didn't try my first suggestion of just
doing "git hash-object -w SymmK/St.c" which likely would have fixed it
already (unless you had changed it in your working tree, of course!)
Linus
* Re: corrupt object on git-gc
2007-11-09 18:02 ` Linus Torvalds
@ 2007-11-09 18:37 ` Yossi Leybovich
2007-11-09 18:55 ` Linus Torvalds
0 siblings, 1 reply; 20+ messages in thread
From: Yossi Leybovich @ 2007-11-09 18:37 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git, ae, Yossi Leybovich
On Nov 9, 2007 1:02 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Fri, 9 Nov 2007, Yossi Leybovich wrote:
> >
> > Ok, tried that and unfortuantly the SHA1 number is apear only one
> >
> > [mellanox@mellanox-compile ib]$ git log --raw --all --full-history --
> > SymmK/St.c | grep 4b9
> > :100755 100755 308806c... 4b9458b3786228369c63936db65827de3cc06200 M SymmK/St.c
>
> Actually, that's not at all "unfortunately", because that implies that
> it's the very *latest* version of that "SymmK/St.c" file. I really think
> you already had it checked out, but didn't try my first suggestion of just
> doing "git hash-object -w SymmK/St.c" which likely would have fixed it
> already (unless you had changed it in your working tree, of course!)
>
It's a very old version of the file.
What is interesting is the second part of the experiment:
I tried to apply the same commit to this file and it led to a different SHA1.
> Linus
>
* Re: corrupt object on git-gc
2007-11-09 18:37 ` Yossi Leybovich
@ 2007-11-09 18:55 ` Linus Torvalds
2007-11-09 19:07 ` Mike Hommey
0 siblings, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2007-11-09 18:55 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: git, ae, Yossi Leybovich
On Fri, 9 Nov 2007, Yossi Leybovich wrote:
>
> What is interesting is the second part of the experiment:
> I tried to apply the same commit to this file and it led to a different SHA1.
Eh. That commit was basically corrupt, because the blob had gotten
removed. I don't even understand how git diff-tree gave a diff with that
file at all (side note: I'd also suggest you just use "git show <commit>"
instead of that complex and _really_ old git-diff-tree incantation).
So no, you didn't "apply the same commit".
But if you have the diff somewhere (perhaps email archive? you sent it to
somebody?) or you can re-create it exactly, then..
Linus
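Why "exactly" matters: a blob's name is the SHA-1 of a short header
("blob", a space, the content length in decimal, a NUL byte) followed by
the raw file contents, so a single differing byte gives a different id.
A minimal Python sketch of that computation follows; the function name
git_blob_id is made up for this example, and unlike
"git hash-object <path>" it applies no line-ending or filter
conversions.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the object id git gives a blob with exactly these bytes."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def is_missing_blob(data: bytes, wanted: str) -> bool:
    """True if data hashes to the wanted id (full or abbreviated)."""
    return git_blob_id(data).startswith(wanted)


if __name__ == "__main__":
    # Same result as: echo hello | git hash-object --stdin
    print(git_blob_id(b"hello\n"))
```

Checking a re-created file this way does not write anything into the
repository; once the id matches, "git hash-object -w" on the file
stores the object.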
* Re: corrupt object on git-gc
2007-11-09 18:55 ` Linus Torvalds
@ 2007-11-09 19:07 ` Mike Hommey
2007-11-09 19:41 ` Yossi Leybovich
0 siblings, 1 reply; 20+ messages in thread
From: Mike Hommey @ 2007-11-09 19:07 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Yossi Leybovich, git, ae, Yossi Leybovich
On Fri, Nov 09, 2007 at 10:55:03AM -0800, Linus Torvalds wrote:
>
>
> On Fri, 9 Nov 2007, Yossi Leybovich wrote:
> >
> > What is interesting is the second part of the experiment:
> > I tried to apply the same commit to this file and it led to a different SHA1.
>
> Eh. That commit was basically corrupt, because the blob had gotten
> removed. I don't even understand how git diff-tree gave a diff with that
> file at all (side note: I'd also suggest you just use "git show <commit>"
> instead of that complex and _really_ old git-diff-tree incantation).
>
> So no, you didn't "apply the same commit".
>
> But if you have the diff somewhere (perhaps email archive? you sent it to
> somebody?) or you can re-create it exactly, then..
Or maybe just from memory, by looking at the diff between the previous version
and the next version of the file.
Mike
* Re: corrupt object on git-gc
2007-11-09 19:07 ` Mike Hommey
@ 2007-11-09 19:41 ` Yossi Leybovich
2007-11-09 19:52 ` Mike Hommey
0 siblings, 1 reply; 20+ messages in thread
From: Yossi Leybovich @ 2007-11-09 19:41 UTC (permalink / raw)
To: Mike Hommey; +Cc: Linus Torvalds, git, ae, Yossi Leybovich
What I do notice is that this commit involves a few files. For most of
the files the commit generates the right next SHA1;
only for one file does it generate a broken SHA1.
From the git show <commit> output I can see that the file which ends up
corrupted is actually being totally removed:
diff --git a/SymmK/St.c b/SymmK/St.c
index 308806c..4b9458b 100755
--- a/SymmK/St.c
+++ b/SymmK/St.c
@@ -1,1535 +0,0 @@
-MODULE_ALIAS(m_st);
-
-#include <errno.h>
-#include <string.h>
-#include <stdarg.h>
-#include <sys/types.h>
-#include <sys/time.h>
-#include "ib_global_init.h"
....
.....
....
When I tried to delete the whole file, I did not get the right SHA1.
Does this sound familiar to anyone?
Maybe it's related to some kind of whitespace character I can't see.
Yossi
* Re: corrupt object on git-gc
2007-11-09 19:41 ` Yossi Leybovich
@ 2007-11-09 19:52 ` Mike Hommey
0 siblings, 0 replies; 20+ messages in thread
From: Mike Hommey @ 2007-11-09 19:52 UTC (permalink / raw)
To: Yossi Leybovich; +Cc: Linus Torvalds, git, ae, Yossi Leybovich
On Fri, Nov 09, 2007 at 02:41:05PM -0500, Yossi Leybovich wrote:
> What I do notice is that this commit involves a few files. For most of
> the files the commit generates the right next SHA1;
> only for one file does it generate a broken SHA1.
>
> From the git show <commit> output I can see that the file which ends up
> corrupted is actually being totally removed:
>
> diff --git a/SymmK/St.c b/SymmK/St.c
> index 308806c..4b9458b 100755
> --- a/SymmK/St.c
> +++ b/SymmK/St.c
> @@ -1,1535 +0,0 @@
> -MODULE_ALIAS(m_st);
> -
> -#include <errno.h>
> -#include <string.h>
> -#include <stdarg.h>
> -#include <sys/types.h>
> -#include <sys/time.h>
> -#include "ib_global_init.h"
> ....
> .....
> ....
>
>
> When I tried to delete the whole file, I did not get the right SHA1.
> Does this sound familiar to anyone?
> Maybe it's related to some kind of whitespace character I can't see.
Because the blob is corrupted, git show can't display the correct diff.
You have to work it out yourself. The best you can do is look at the
diff for this file between its previous version and the one just after
the corrupted version.
Mike
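On the invisible-whitespace worry: once you have a guess at the
contents, a few mechanical variants are cheap to test. The sketch below
assumes the guess differs from the lost blob only in line endings,
trailing whitespace or the final newline, which may well not be true
here. The names git_blob_id, whitespace_variants and find_match are
invented for the example.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Object id of a blob holding exactly these bytes."""
    return hashlib.sha1(b"blob %d\x00" % len(data) + data).hexdigest()


def whitespace_variants(data: bytes):
    """Yield (label, bytes) candidates differing only in whitespace."""
    lf = data.replace(b"\r\n", b"\n")
    crlf = lf.replace(b"\n", b"\r\n")
    stripped = b"\n".join(line.rstrip(b" \t") for line in lf.split(b"\n"))
    yield "as-is", data
    yield "LF line endings", lf
    yield "CRLF line endings", crlf
    yield "trailing whitespace stripped", stripped
    yield "final newline added", lf if lf.endswith(b"\n") else lf + b"\n"
    yield "final newline removed", lf.rstrip(b"\n")


def find_match(data: bytes, wanted: str):
    """Return the label of the first variant hashing to wanted, else None."""
    for label, candidate in whitespace_variants(data):
        if git_blob_id(candidate).startswith(wanted):
            return label
    return None


if __name__ == "__main__":
    # The wanted id here is the blob id of b"hello\n".
    print(find_match(b"hello\r\n", "ce01362"))
```

As a sanity check on the experiment described above: an empty file
hashes to e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, so deleting the
whole contents cannot produce 4b9458b.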
* Re: [PATCH] add a howto document about corrupted blob recovery
2007-11-09 17:28 ` [PATCH] add a howto document about corrupted blob recovery Nicolas Pitre
2007-11-09 17:30 ` Johannes Schindelin
@ 2007-11-26 2:12 ` J. Bruce Fields
1 sibling, 0 replies; 20+ messages in thread
From: J. Bruce Fields @ 2007-11-26 2:12 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Junio C Hamano, Linus Torvalds, git
On Fri, Nov 09, 2007 at 12:28:19PM -0500, Nicolas Pitre wrote:
> Extracted from a post by Linus on the mailing list.
>
> Signed-off-by: Nicolas Pitre <nico@cam.org>
I rearranged this some more and added it to the manual, assuming that
makes sense to everyone.
I think there needs to be some discussion of pack objects and stuff too
some day. I added a few mail archive references to the "todo" section.
--b.
commit d6e199cb6ff911e8e3e39c8b7021512a14ea79a5
Author: J. Bruce Fields <bfields@citi.umich.edu>
Date: Sat Mar 3 22:53:37 2007 -0500
user-manual: recovering from corruption
Some instructions on dealing with corruption of the object database.
Most of this text is from an example by Linus, identified by Nicolas
Pitre <nico@cam.org> with a little further editing by me.
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index c027353..3166fb6 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -1554,6 +1554,11 @@ This may be time-consuming. Unlike most other git operations (including
git-gc when run without any options), it is not safe to prune while
other git operations are in progress in the same repository.
+If gitlink:git-fsck[1] complains about sha1 mismatches or missing
+objects, you may have a much more serious problem; your best option is
+probably restoring from backups. See
+<<recovering-from-repository-corruption>> for a detailed discussion.
+
[[recovering-lost-changes]]
Recovering lost changes
~~~~~~~~~~~~~~~~~~~~~~~
@@ -3172,6 +3177,127 @@ confusing and scary messages, but it won't actually do anything bad. In
contrast, running "git prune" while somebody is actively changing the
repository is a *BAD* idea).
+[[recovering-from-repository-corruption]]
+Recovering from repository corruption
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By design, git treats data trusted to it with caution. However, even in
+the absence of bugs in git itself, it is still possible that hardware or
+operating system errors could corrupt data.
+
+The first defense against such problems is backups. You can back up a
+git directory using clone, or just using cp, tar, or any other backup
+mechanism.
+
+As a last resort, you can search for the corrupted objects and attempt
+to replace them by hand. Back up your repository before attempting this
+in case you corrupt things even more in the process.
+
+We'll assume that the problem is a single missing or corrupted blob,
+which is sometimes a solvable problem. (Recovering missing trees and
+especially commits is *much* harder).
+
+Before starting, verify that there is corruption, and figure out where
+it is with gitlink:git-fsck[1]; this may be time-consuming.
+
+Assume the output looks like this:
+
+------------------------------------------------
+$ git-fsck --full
+broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+ to blob 4b9458b3786228369c63936db65827de3cc06200
+missing blob 4b9458b3786228369c63936db65827de3cc06200
+------------------------------------------------
+
+(Typically there will be some "dangling object" messages too, but they
+aren't interesting.)
+
+Now you know that blob 4b9458b3 is missing, and that the tree 2d9263c6
+points to it. If you could find just one copy of that missing blob
+object, possibly in some other repository, you could move it into
+.git/objects/4b/9458b3... and be done. Suppose you can't. You can
+still examine the tree that pointed to it with gitlink:git-ls-tree[1],
+which might output something like:
+
+------------------------------------------------
+$ git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
+100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
+100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
+100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
+...
+100644 blob 4b9458b3786228369c63936db65827de3cc06200 myfile
+...
+------------------------------------------------
+
+So now you know that the missing blob was the data for a file named
+"myfile". And chances are you can also identify the directory--let's
+say it's in "somedirectory". If you're lucky the missing copy might be
+the same as the copy you have checked out in your working tree at
+"somedirectory/myfile"; you can test whether that's right with
+gitlink:git-hash-object[1]:
+
+------------------------------------------------
+$ git hash-object -w somedirectory/myfile
+------------------------------------------------
+
+which will create and store a blob object with the contents of
+somedirectory/myfile, and output the sha1 of that object. If you're
+extremely lucky it might be 4b9458b3786228369c63936db65827de3cc06200, in
+which case you've guessed right, and the corruption is fixed!
+
+Otherwise, you need more information. How do you tell which version of
+the file has been lost?
+
+The easiest way to do this is with:
+
+------------------------------------------------
+$ git log --raw --all --full-history -- somedirectory/myfile
+------------------------------------------------
+
+Because you're asking for raw output, you'll now get something like
+
+------------------------------------------------
+commit abc
+Author:
+Date:
+...
+:100644 100644 4b9458b... newsha... M somedirectory/myfile
+
+
+commit xyz
+Author:
+Date:
+
+...
+:100644 100644 oldsha... 4b9458b... M somedirectory/myfile
+------------------------------------------------
+
+This tells you that the immediately preceding version of the file was
+"oldsha", and that the immediately following version was "newsha".
+You also know the commit messages that went with the change from oldsha
+to 4b9458b and with the change from 4b9458b to newsha.
+
+If you've been committing small enough changes, you may now have a good
+shot at reconstructing the contents of the in-between state 4b9458b.
+
+If you can do that, you can now recreate the missing object with
+
+------------------------------------------------
+$ git hash-object -w <recreated-file>
+------------------------------------------------
+
+and your repository is good again!
+
+(Btw, you could have ignored the fsck, and started with doing a
+
+------------------------------------------------
+$ git log --raw --all
+------------------------------------------------
+
+and just looked for the sha of the missing object (4b9458b..) in that
+whole thing. It's up to you - git does *have* a lot of information, it is
+just missing one particular blob version.)
+
[[the-index]]
The index
-----------
@@ -4381,4 +4507,7 @@ Write a chapter on using plumbing and writing scripts.
Alternates, clone -reference, etc.
-git unpack-objects -r for recovery
+More on recovery from repository corruption. See:
+ http://marc.theaimsgroup.com/?l=git&m=117263864820799&w=2
+ http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2
+ http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2
Thread overview: 20+ messages
2007-11-09 13:38 corrupt object on git-gc Yossi Leybovich
2007-11-09 13:46 ` Andreas Ericsson
2007-11-09 15:01 ` Yossi Leybovich
2007-11-09 15:34 ` Johannes Sixt
2007-11-09 15:53 ` Yossi Leybovich
2007-11-09 16:03 ` Johannes Sixt
2007-11-09 16:03 ` Nicolas Pitre
2007-11-09 16:31 ` Yossi Leybovich
2007-11-09 16:52 ` Nicolas Pitre
2007-11-09 16:28 ` Linus Torvalds
2007-11-09 17:28 ` [PATCH] add a howto document about corrupted blob recovery Nicolas Pitre
2007-11-09 17:30 ` Johannes Schindelin
2007-11-26 2:12 ` J. Bruce Fields
2007-11-09 17:53 ` corrupt object on git-gc Yossi Leybovich
2007-11-09 18:02 ` Linus Torvalds
2007-11-09 18:37 ` Yossi Leybovich
2007-11-09 18:55 ` Linus Torvalds
2007-11-09 19:07 ` Mike Hommey
2007-11-09 19:41 ` Yossi Leybovich
2007-11-09 19:52 ` Mike Hommey