* [HELP] Corrupted repository
@ 2013-06-21 10:49 Ramkumar Ramachandra
2013-06-21 16:22 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Ramkumar Ramachandra @ 2013-06-21 10:49 UTC
To: Git List
Hi,
Until now, my interest in corrupted repositories has been very
limited. Just now, the power went out for a second and my UPS failed
me. As a result, my ~/src/git is completely borked. For your
amusement, here's a quick session showing me bumbling around:
$ ~/src/git
error: object file
.git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
error: object file
.git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
fatal: loose object 8e6a6dda24b017915449897fcc1353a9b848fd2f (stored
in .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f) is corrupt
artagnon|remote-cruft*+:~/src/git$ rm .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f
artagnon|remote-cruft*+:~/src/git$ git prune
artagnon|remote-cruft*+:~/src/git$ git status
fatal: bad object HEAD
fatal: bad object HEAD
artagnon|remote-cruft*+:~/src/git$ git symbolic-ref HEAD refs/heads/master
artagnon|master*+=:~/src/git$ git status
## master
MM Documentation/git-ls-remote.txt
MM remote.c
MM t/t5505-remote.sh
MM t/t5510-fetch.sh
MM t/t5515-fetch-merge-logic.sh
MM t/t5516-fetch-push.sh
?? lib/
?? outgoing/
That status is completely bogus, by the way.
artagnon|master*+=:~/src/git$ git reset --hard
artagnon|master*+=:~/src/git$ git checkout remote-cruft
fatal: reference is not a tree: remote-cruft
artagnon|master=:~/src/git$ git reflog
21ff915 HEAD@{10 minutes ago}: rebase -i (finish): returning to
refs/heads/remote-cruft
What happened to the rest of my reflog?! Okay, I give up. Let's go
back to what's present on GitHub. I push often, so it's not a
problem.
artagnon|master=:~/src/git$ git branch -D remote-cruft
error: Couldn't look up commit object for 'refs/heads/remote-cruft'
artagnon|master=:~/src/git$ rm .git/refs/heads/remote-cruft
artagnon|master=:~/src/git$ git checkout -b remote-cruft
Switched to a new branch 'remote-cruft'
Huh? What happened to my upstream?
artagnon|remote-cruft:~/src/git$ git branch -u ram/remote-cruft
warning: ignoring broken ref refs/remotes/ram/remote-cruft.
Fine, let's fetch.
artagnon|remote-cruft:~/src/git$ git fetch ram
remote: Counting objects: 101, done.
remote: Compressing objects: 100% (24/24), done.
remote: Total 92 (delta 78), reused 82 (delta 68)
error: object file
.git/objects/08/2b069c11e8d4f372b963b038cbf5b71a676ef6 is empty
fatal: loose object 082b069c11e8d4f372b963b038cbf5b71a676ef6 (stored
in .git/objects/08/2b069c11e8d4f372b963b038cbf5b71a676ef6) is corrupt
fatal: unpack-objects failed
Fine, let's run an fsck and get rid of all the corrupted objects.
$ git fsck
error: object file
.git/objects/08/2b069c11e8d4f372b963b038cbf5b71a676ef6 is empty
error: object file
.git/objects/08/2b069c11e8d4f372b963b038cbf5b71a676ef6 is empty
fatal: loose object 082b069c11e8d4f372b963b038cbf5b71a676ef6 (stored
in .git/objects/08/2b069c11e8d4f372b963b038cbf5b71a676ef6) is corrupt
artagnon|remote-cruft:~/src/git$ rm .git/objects/08/2b069c11e8d4f372b963b038cbf5b71a676ef6
artagnon|remote-cruft:~/src/git$ git repack
artagnon|remote-cruft:~/src/git$ git fetch ram
remote: Counting objects: 101, done.
remote: Compressing objects: 100% (24/24), done.
remote: Total 92 (delta 78), reused 82 (delta 68)
Unpacking objects: 100% (92/92), done.
error: object file
.git/objects/64/fa33d706658278b871a6e2ca66694efcadacca is empty
fatal: loose object 64fa33d706658278b871a6e2ca66694efcadacca (stored
in .git/objects/64/fa33d706658278b871a6e2ca66694efcadacca) is corrupt
error: github.com:artagnon/git did not send all necessary objects
Fine, my packfiles are corrupt. Let's unpack-objects by hand.
artagnon|remote-cruft:~/src/git$ mv .git/objects/pack .git/objects/pack.old
artagnon|remote-cruft+:~/src/git$ for i in .git/objects/pack.old/*.pack; do git unpack-objects -r <"$i"; done
artagnon|remote-cruft:~/src/git$ git fetch ram
remote: Counting objects: 101, done.
remote: Compressing objects: 100% (24/24), done.
remote: Total 92 (delta 78), reused 82 (delta 68)
Unpacking objects: 100% (92/92), done.
error: object file
.git/objects/64/fa33d706658278b871a6e2ca66694efcadacca is empty
fatal: loose object 64fa33d706658278b871a6e2ca66694efcadacca (stored
in .git/objects/64/fa33d706658278b871a6e2ca66694efcadacca) is corrupt
error: github.com:artagnon/git did not send all necessary objects
Auto packing the repository for optimum performance. You may also
run "git gc" manually. See "git help gc" for more information.
error: bad ref for refs/remotes/ram/remote-cruft
error: bad ref for refs/remotes/ram/remote-cruft
Counting objects: 161917, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (159963/159963), done.
Writing objects: 100% (161917/161917), done.
Total 161917 (delta 117725), reused 0 (delta 0)
Removing duplicate objects: 100% (256/256), done.
error: bad ref for refs/remotes/ram/remote-cruft
Checking connectivity: 161917, done.
warning: There are too many unreachable loose objects; run 'git
prune' to remove them.
I'm assuming it just went back and fetched everything the second time.
Why didn't it do that in the first place?
artagnon|remote-cruft:~/src/git$ git log ram/remote-cruft
warning: ignoring broken ref refs/remotes/ram/remote-cruft.
Now what? Why didn't the fetch update this ref?
artagnon|remote-cruft:~/src/git$ rm .git/refs/remotes/ram/remote-cruft
artagnon|remote-cruft:~/src/git$ git fetch ram
remote: Counting objects: 101, done.
remote: Compressing objects: 100% (24/24), done.
remote: Total 92 (delta 78), reused 82 (delta 68)
Unpacking objects: 100% (92/92), done.
From github.com:artagnon/git
* [new branch] remote-cruft -> ram/remote-cruft
* [new branch] upstream-fix -> ram/upstream-fix
Yes! Everything finally works.
Was I being stupid, or is fixing corrupted repositories really this
non-trivial? Comments appreciated.
Thanks.
* Re: [HELP] Corrupted repository
From: Junio C Hamano @ 2013-06-21 16:22 UTC
To: Ramkumar Ramachandra; +Cc: Git List
Ramkumar Ramachandra <artagnon@gmail.com> writes:
> $ ~/src/git
> error: object file
> .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
> error: object file
> .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
> fatal: loose object 8e6a6dda24b017915449897fcc1353a9b848fd2f (stored
> in .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f) is corrupt
So fsync() and close() thought that the filesystem stored this loose
object safely, but it turns out that the data is not on disk.
> artagnon|remote-cruft*+:~/src/git$ rm .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f
As you know it is empty, removing (as long as you do not forget what
the object name was, which may later become useful when untangling
the mess further) does not hurt very much.
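A minimal sketch of that "note the name, then remove" step, assuming a POSIX shell and find. The function name list_empty_loose_objects is hypothetical and not a Git command; it only reports zero-length loose objects and deletes nothing.

```shell
#!/bin/sh
# list_empty_loose_objects GIT_DIR   (hypothetical helper, not part of Git)
# Prints "<object-name> <path>" for every zero-length file under
# GIT_DIR/objects/<two hex digits>/, so the names can be saved before
# the files are removed by hand.
list_empty_loose_objects() {
    find "$1/objects" -type f -size 0 -path '*/objects/[0-9a-f][0-9a-f]/*' |
    sort |
    while IFS= read -r path; do
        name=${path##*/}      # last path component: the rest of the name
        dir=${path%/*}        # strip the file name ...
        dir=${dir##*/}        # ... and keep the two-hex-digit directory
        printf '%s%s %s\n' "$dir" "$name" "$path"
    done
}
```

Something like `list_empty_loose_objects .git >empty-objects.txt` keeps a record of the object names before any rm. It only catches empty files; objects that are truncated but non-empty would still need `git fsck` to find.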
> artagnon|remote-cruft*+:~/src/git$ git prune
> artagnon|remote-cruft*+:~/src/git$ git status
> fatal: bad object HEAD
> fatal: bad object HEAD
And the value in the HEAD was???
> artagnon|remote-cruft*+:~/src/git$ git symbolic-ref HEAD refs/heads/master
> artagnon|master*+=:~/src/git$ git status
> ## master
> MM Documentation/git-ls-remote.txt
> MM remote.c
> MM t/t5505-remote.sh
> MM t/t5510-fetch.sh
> MM t/t5515-fetch-merge-logic.sh
> MM t/t5516-fetch-push.sh
> ?? lib/
> ?? outgoing/
>
> That status is completely bogus, by the way.
... which may suggest that your index file may have been corrupted
on the filesystem.
> artagnon|master*+=:~/src/git$ git reset --hard
... and using that known-to-be-corrupt index, the working tree state
is discarded.
> artagnon|master*+=:~/src/git$ git checkout remote-cruft
> fatal: reference is not a tree: remote-cruft
> artagnon|master=:~/src/git$ git reflog
> 21ff915 HEAD@{10 minutes ago}: rebase -i (finish): returning to
> refs/heads/remote-cruft
>
> What happened to the rest of my reflog?!
On the filesystem known to not record the last consistent state of
the repository, the answer to that question may be rather obvious,
no?
> artagnon|master=:~/src/git$ git branch -D remote-cruft
> error: Couldn't look up commit object for 'refs/heads/remote-cruft'
The command would want to report what was at the tip, so it is
understandable it may want to look up that commit before removing
the ref.
> artagnon|master=:~/src/git$ rm .git/refs/heads/remote-cruft
> artagnon|master=:~/src/git$ git checkout -b remote-cruft
> Switched to a new branch 'remote-cruft'
>
> Huh? What happened to my upstream?
>
> artagnon|remote-cruft:~/src/git$ git branch -u ram/remote-cruft
> warning: ignoring broken ref refs/remotes/ram/remote-cruft.
So remotes/ram/remote-cruft is also broken.
> Fine, let's fetch.
Why?
"fetch" walks the ancestry graph on both ends to minimize transfers.
It's not something you would expect to work when you know the refs at
your end do not even record what you do have. It _may_ appear to
work if your refs are intact but you are missing objects, as they
will not be transferred again from the good copy if you let your
repository's refs claim that you have _all_ objects behind them when
you actually don't.
What would have been a better starting point for untangling this is to
make a separate clone, pretending as if this repository did not even
exist, and copy the resulting packfile into this repository. That
would at least give you known-good copies of the objects that you
have already pushed out.
And the next step would have been (without doing any of the above
"remove this branch, recreate this one anew") to compare the tips
of refs in this broken repository and the clone. The ones that match
you can trust; the ones that differ you dig into further.
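The tip comparison could be sketched roughly as below, assuming a POSIX shell and awk. compare_ref_tips is a hypothetical helper, not a Git command, and it assumes both lists use the same ref names (for example refs/heads/* on both sides).

```shell
#!/bin/sh
# compare_ref_tips BROKEN_LIST GOOD_LIST   (hypothetical helper)
# Each file holds "<object-name> <refname>" lines, the shape printed by
# `git show-ref` or `git ls-remote`. Prints one verdict per ref found in
# BROKEN_LIST: same, differs, or only-local.
compare_ref_tips() {
    awk '
        FILENAME == ARGV[1] { good[$2] = $1; next }   # known-good tips
        !($2 in good)       { print "only-local " $2; next }
        good[$2] == $1      { print "same " $2; next }
                            { print "differs " $2 }
    ' "$2" "$1"
}

# Usage sketch; <url> and the /tmp paths are placeholders:
#   git clone --bare <url> /tmp/good.git
#   cp /tmp/good.git/objects/pack/pack-* .git/objects/pack/
#   git show-ref >/tmp/broken-tips
#   git --git-dir=/tmp/good.git show-ref >/tmp/good-tips
#   compare_ref_tips /tmp/broken-tips /tmp/good-tips
```

Refs reported as "same" point at objects the clone is known to have; "differs" and "only-local" are the ones to dig into. In a damaged repository `git show-ref` may itself skip or complain about broken refs, so reading the files under .git/refs by hand may be needed for those.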
> Was I being stupid, or is fixing corrupted repositories really this
> non-trivial? Comments appreciated.
I think "Let's fetch first" was the step that took you in a wrong
direction that requires unnecessary work.
* Re: [HELP] Corrupted repository
From: Ramkumar Ramachandra @ 2013-06-21 16:44 UTC
To: Junio C Hamano; +Cc: Git List
Junio C Hamano wrote:
>> $ ~/src/git
>> error: object file
>> .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
>> error: object file
>> .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
>> fatal: loose object 8e6a6dda24b017915449897fcc1353a9b848fd2f (stored
>> in .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f) is corrupt
>
> So fsync() and close() thought that the filesystem stored this loose
> object safely, but it turns out that the data is not on disk.
Where should I start digging if I want to fix this? Actually you just
need to tell me how to build reduced-case corruptions to test: I can
trace and figure out the rest.
>> artagnon|remote-cruft*+:~/src/git$ git prune
>> artagnon|remote-cruft*+:~/src/git$ git status
>> fatal: bad object HEAD
>> fatal: bad object HEAD
>
> And the value in the HEAD was???
ref: refs/heads/remote-cruft. That's why I included my prompt :)
>> artagnon|remote-cruft*+:~/src/git$ git symbolic-ref HEAD refs/heads/master
>> artagnon|master*+=:~/src/git$ git status
>> ## master
>> MM Documentation/git-ls-remote.txt
>> MM remote.c
>> MM t/t5505-remote.sh
>> MM t/t5510-fetch.sh
>> MM t/t5515-fetch-merge-logic.sh
>> MM t/t5516-fetch-push.sh
>> ?? lib/
>> ?? outgoing/
>>
>> That status is completely bogus, by the way.
>
> ... which may suggest that your index file may have been corrupted
> on the filesystem.
Yeah, my question pertains to why the index is half-corrupted. Is
there no checksum to say "index corrupted; do not display bogus
nonsense"?
>> artagnon|master*+=:~/src/git$ git checkout remote-cruft
>> fatal: reference is not a tree: remote-cruft
>> artagnon|master=:~/src/git$ git reflog
>> 21ff915 HEAD@{10 minutes ago}: rebase -i (finish): returning to
>> refs/heads/remote-cruft
>>
>> What happened to the rest of my reflog?!
>
> On the filesystem known to not record the last consistent state of
> the repository, the answer to that question may be rather obvious,
> no?
I didn't understand. What does .git/logs/HEAD have to do with any of
this? Why is it truncated?
>> artagnon|master=:~/src/git$ git branch -D remote-cruft
>> error: Couldn't look up commit object for 'refs/heads/remote-cruft'
>
> The command would want to report what was at the tip, so it is
> understandable it may want to look up that commit before removing
> the ref.
I would have expected it to display a warning and remove the ref
anyway. Or error out, and override with a force-flag?
>> Fine, let's fetch.
>
> Why?
>
> "fetch" walks the ancestry graph on both ends to minimize transfers.
> It's not something you would expect to work when you know the refs at
> your end do not even record what you do have. It _may_ appear to
> work if your refs are intact but you are missing objects, as they
> will not be transferred again from the good copy if you let your
> repository's refs claim that you have _all_ objects behind them when
> you actually don't.
Right. I expected it to figure out that I have a broken history and
fetch everything (which is what happened the second time, no?).
> What would have been a better starting point for untangling this is to
> make a separate clone, pretending as if this repository did not even
> exist, and copy the resulting packfile into this repository. That
> would at least give you known-good copies of the objects that you
> have already pushed out.
Yeah, I deliberately avoided doing that: apart from the config and
refs, I had no real unpushed work in ~/src/git anyway (I push _very_
frequently, which explains my "resolve HEAD early in current" patch).
The most important part of what I did was running unpack-objects by
hand: that fixed everything. I shouldn't have had to run that by hand
though: why isn't there an in-built way to unpack everything, remove
corruptions, and repack the good stuff?
> And the next step would have been (without doing any of the above
> "remove this branch, recreate this one anew") to compare the tips
> of refs in this broken repository and the clone. The ones that match
> you can trust; the ones that differ you dig into further.
Right. I didn't have local data in this case, so I didn't bother.
>> Was I being stupid, or is fixing corrupted repositories really this
>> non-trivial? Comments appreciated.
>
> I think "Let's fetch first" was the step that took you in a wrong
> direction that requires unnecessary work.
This was mainly a learning exercise for me: I wanted to see how good
git was at working with corrupted repositories. I did my surgery
fairly quickly, and avoided large network transfers (I have a slow
connection).
* Re: [HELP] Corrupted repository
From: Junio C Hamano @ 2013-06-21 19:00 UTC
To: Ramkumar Ramachandra; +Cc: Git List
Ramkumar Ramachandra <artagnon@gmail.com> writes:
> Junio C Hamano wrote:
>>> $ ~/src/git
>>> error: object file
>>> .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
>>> error: object file
>>> .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
>>> fatal: loose object 8e6a6dda24b017915449897fcc1353a9b848fd2f (stored
>>> in .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f) is corrupt
>>
>> So fsync() and close() thought that the filesystem stored this loose
>> object safely, but it turns out that the data is not on disk.
>
> Where should I start digging if I want to fix this? Actually you just
> need to tell me how to build reduced-case corruptions to test: I can
> trace and figure out the rest.
That is a trip in the kernel source, isn't it? I cannot be your
guide there.
>
>>> artagnon|remote-cruft*+:~/src/git$ git prune
>>> artagnon|remote-cruft*+:~/src/git$ git status
>>> fatal: bad object HEAD
>>> fatal: bad object HEAD
>>
>> And the value in the HEAD was???
>
> ref: refs/heads/remote-cruft. That's why I included my prompt :)
>
>>> artagnon|remote-cruft*+:~/src/git$ git symbolic-ref HEAD refs/heads/master
>>> artagnon|master*+=:~/src/git$ git status
>>> ## master
>>> MM Documentation/git-ls-remote.txt
>>> MM remote.c
>>> MM t/t5505-remote.sh
>>> MM t/t5510-fetch.sh
>>> MM t/t5515-fetch-merge-logic.sh
>>> MM t/t5516-fetch-push.sh
>>> ?? lib/
>>> ?? outgoing/
>>>
>>> That status is completely bogus, by the way.
>>
>> ... which may suggest that your index file may have been corrupted
>> on the filesystem.
>
> Yeah, my question pertains to why the index is half-corrupted. Is
> there no checksum to say "index corrupted; do not display bogus
> nonsense"?
Another possibility is that the objects referred to by the index
were missing or unreadable, while the index itself was intact.
>
>>> artagnon|master*+=:~/src/git$ git checkout remote-cruft
>>> fatal: reference is not a tree: remote-cruft
>>> artagnon|master=:~/src/git$ git reflog
>>> 21ff915 HEAD@{10 minutes ago}: rebase -i (finish): returning to
>>> refs/heads/remote-cruft
>>>
>>> What happened to the rest of my reflog?!
>>
>> On the filesystem known to not record the last consistent state of
>> the repository, the answer to that question may be rather obvious,
>> no?
>
> I didn't understand. What does .git/logs/HEAD have to do with any of
> this? Why is it truncated?
Explain why .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f
was truncated to yourself, and the same explanation would apply to
the .git/logs/HEAD file, I think.
> This was mainly a learning exercise for me: I wanted to see how good
> git was at working with corrupted repositories.
You could have just asked ;-).
A tl;dr is that we _trust_ our refs and everything reachable from
them has to be complete. If that is not the case, things will not
work, and it is not a priority to add workarounds in the normal
codepath to slow things down.
That does not forbid an addition of "git recover-corrupted-repo"
command, whose "assume everything might be broken" code is not
shared with the fastpath of other commands.
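The "refs must be complete" assumption can be probed by hand with a rough sketch like the following. check_ref_closure is a hypothetical helper name; the sketch assumes a Git new enough to accept `git -C`, and it is not a substitute for `git fsck`.

```shell
#!/bin/sh
# check_ref_closure REPO   (hypothetical helper, not part of Git)
# For every ref Git is able to list in REPO, walk the history reachable
# from it and print "ok <ref>" or "incomplete <ref>". rev-list exits
# non-zero when it cannot find or read an object it needs for the walk;
# this is a reachability probe, not a full integrity check.
check_ref_closure() {
    git -C "$1" for-each-ref --format='%(refname)' |
    while IFS= read -r ref; do
        if git -C "$1" rev-list --objects "$ref" >/dev/null 2>&1; then
            printf 'ok %s\n' "$ref"
        else
            printf 'incomplete %s\n' "$ref"
        fi
    done
}
```

Refs whose files are themselves damaged may not be listed by for-each-ref at all, so a ref missing from the output deserves as much suspicion as one marked "incomplete".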
* Re: [HELP] Corrupted repository
From: Ramkumar Ramachandra @ 2013-06-21 19:15 UTC
To: Junio C Hamano; +Cc: Git List
Junio C Hamano wrote:
> A tl;dr is that we _trust_ our refs and everything reachable from
> them has to be complete. If that is not the case, things will not
> work, and it is not a priority to add workarounds in the normal
> codepath to slow things down.
Makes sense.
> That does not forbid an addition of "git recover-corrupted-repo"
> command, whose "assume everything might be broken" code is not
> shared with the fastpath of other commands.
I'm not looking for a kitchen-sink command: I'm looking for a
well-documented toolset to precisely fix corruptions. We have some
corruption tests in our testsuite already: I think I'll start digging
there.
Thanks.
* Re: [HELP] Corrupted repository
From: Matthieu Moy @ 2013-06-23 17:48 UTC
To: Junio C Hamano; +Cc: Ramkumar Ramachandra, Git List
Junio C Hamano <gitster@pobox.com> writes:
> Ramkumar Ramachandra <artagnon@gmail.com> writes:
>
>> Junio C Hamano wrote:
>>>> $ ~/src/git
>>>> error: object file
>>>> .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
>>>> error: object file
>>>> .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f is empty
>>>> fatal: loose object 8e6a6dda24b017915449897fcc1353a9b848fd2f (stored
>>>> in .git/objects/8e/6a6dda24b017915449897fcc1353a9b848fd2f) is corrupt
>>>
>>> So fsync() and close() thought that the filesystem stored this loose
>>> object safely, but it turns out that the data is not on disk.
>>
>> Where should I start digging if I want to fix this? Actually you just
>> need to tell me how to build reduced-case corruptions to test: I can
>> trace and figure out the rest.
>
> That is a trip in the kernel source, isn't it? I cannot be your
> guide there.
Not necessarily. AFAICT, Git won't fsync object files by default, but
does for pack files (to make sure the pack is written before unlinking
the object files being packed):
core.fsyncobjectfiles
This boolean will enable fsync() when writing object files.
This is a total waste of time and effort on a filesystem that
orders data writes properly, but can be useful for
filesystems that do not use journalling (traditional UNIX
filesystems) or that only journal metadata and not file
contents (OS X’s HFS+, or Linux ext3 with "data=writeback").
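Turning the setting on for one repository might look like the sketch below. The wrapper name enable_loose_object_fsync is hypothetical; the configuration key is the one quoted above. Newer Git releases document a more general core.fsync setting, so git-config(1) for the installed version is the place to check.

```shell
#!/bin/sh
# enable_loose_object_fsync REPO   (hypothetical wrapper)
# Sets core.fsyncObjectFiles in REPO's own config, so loose object
# writes are fsync()ed at some cost in write speed.
enable_loose_object_fsync() {
    git -C "$1" config core.fsyncObjectFiles true
}

# Usage sketch:
#   enable_loose_object_fsync ~/src/git
#   git -C ~/src/git config --get core.fsyncObjectFiles
```

This only affects objects written after the setting is in place; it does nothing for files already truncated by an earlier power loss.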
--
Matthieu Moy
http://www-verimag.imag.fr/~moy/