Recovering from repository corruption

git.vger.kernel.org archive mirror

* Recovering from repository corruption
@ 2008-06-10 17:26 Denis Bueno
  2008-06-10 17:55 ` Jakub Narebski
  2008-06-10 19:40 ` Recovering from repository corruption Nicolas Pitre
  0 siblings, 2 replies; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 17:26 UTC (permalink / raw)
  To: Git Mailing List

I started a thread a while back about repository corruption.  It
manifested as a clone error and the thread is here:

    http://kerneltrap.org/mailarchive/git/2007/7/31/253475

I just ran, again, into corruption after my laptop kernel-panic'd.
(Ironically, at the moment I ran into the corruption I was trying to
push my repo to a backup location.)  Since that thread took place it
seems a section about recovering from repo corruption was added to the
manual --- but it assumes you can (or care to painstakingly) recreate
each corrupted version.

I made several changes to one file, home.html, and now have the
following corruption:

    identity.corrupt[56] > git fsck --full
    error: 320bd6e82267b71dd2ca7043ea3f61dbbca16109: object corrupt or missing
    error: 4d0be2816d5eea5ae2b40990235e2225c1715927: object corrupt or missing
    missing blob 320bd6e82267b71dd2ca7043ea3f61dbbca16109
    missing blob 4d0be2816d5eea5ae2b40990235e2225c1715927

I know which commits these hashes correspond to, and I know roughly
what I did in those commits, but, I really don't care that much, and
anyway it will be painful to recreate them because of
whitespace/formatting issues.  Here are the commits, in case it is
relevant:

    commit 163a93df14d246dee91c3a503e6372b8313f337d
    Author: Denis Bueno <dbueno@gmail.com>
    Date:   Tue Jun 10 09:45:41 2008 -0400

        Add lambda-the-ultimate link

    :100644 100644 320bd6e... 2ab4775... M  home.html

    [... intervening commits ...]

    commit 4737fea59fdc8325e09b5206cc7a6ac593446ce3
    Author: Denis Bueno <dbueno@gmail.com>
    Date:   Tue Jun 10 09:37:12 2008 -0400

        Hoogle up top too

    :100644 100644 4d0be28... c6fe111... M  home.html

Assuming I can't recreate the hashed files, what are my options?
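
Why the whitespace matters: Git names a blob by the SHA-1 of a short header plus the file's exact bytes, so a recreated home.html fills the hole only if it is byte-for-byte identical to the lost version. A minimal Python sketch of that check, assuming a SHA-1 repository (the function names are illustrative and not part of Git):

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object name Git assigns to a blob with these exact bytes."""
    # A blob is hashed as the header "blob <size>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def matches_missing_blob(candidate: bytes, missing_sha1: str) -> bool:
    """True only if the candidate reproduces the missing blob byte for byte."""
    return git_blob_sha1(candidate) == missing_sha1.lower()


if __name__ == "__main__":
    # A single extra space produces an unrelated object name.
    print(git_blob_sha1(b"<p>hi</p>\n"))
    print(git_blob_sha1(b"<p>hi </p>\n"))
```

Feeding each attempted reconstruction to matches_missing_blob() with one of the IDs reported by fsck tells you whether that attempt would satisfy the missing object; anything short of an exact match does not.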

I was told in the thread above that I could use grafts and "git
filter-branch" to create a new repository that simply got rid of the
offending object.  That case was simpler, as it was the initial import
of a file that had only two commits total that was corrupted.
However, in this case there are changes between the initial and latest
version of the file, and commits between the corrupted versions, so, I
can imagine that it would be hard to get rid of in-between commits.

The thing that makes sense intuitively (read: not as a Git expert, but
as a user) is to just let me replace the commits associated with the
problematic objects with new versions of those commits (e.g. make the
change described in the commit message, which is different from the
actual change that was recorded, due to whitespace/formatting issues).
 Is this what I should do?  And to do so, should I be reading chapter
5 of the manual?

Thanks.

-- 
                              Denis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Recovering from repository corruption
  2008-06-10 17:26 Recovering from repository corruption Denis Bueno
@ 2008-06-10 17:55 ` Jakub Narebski
  2008-06-10 19:38   ` Denis Bueno
  2008-06-10 19:40 ` Recovering from repository corruption Nicolas Pitre
  1 sibling, 1 reply; 31+ messages in thread
From: Jakub Narebski @ 2008-06-10 17:55 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List

"Denis Bueno" <dbueno@gmail.com> writes:

> I was told in the thread above that I could use grafts and "git
> filter-branch" to create a new repository that simply got rid of the
> offending object.  That case was simpler, as it was the initial import
> of a file that had only two commits total that was corrupted.
> However, in this case there are changes between the initial and latest
> version of the file, and commits between the corrupted versions, so, I
> can imagine that it would be hard to get rid of in-between commits.
> 
> The thing that makes sense intuitively (read: not as a Git expert, but
> as a user) is to just let me replace the commits associated with the
> problematic objects with new versions of those commits (e.g. make the
> change described in the commit message, which is different from the
> actual change that was recorded, due to whitespace/formatting issues).
>  Is this what I should do?  And to do so, should I be reading chapter
> 5 of the manual?

Without checking the Git User's Manual, I think the solution could go as
follows.

Assume that history looks like this

    ...---.---a---*---b---.---...

where '*' marks the corrupted commit (a commit whose tree contains
corrupted blobs).

First, you can check the commit message for '*' using git-cat-file or
git-show, and you can get the difference between 'a' and 'b' using
"git diff a b".  Once you know what the repaired commit 'X' should look
like, do something like:

  $ git checkout -b <temp-branch> 'a'
  $ <edit edit edit>
  $ git commit

Then history would look like this

    ...---.---a---*---b---.---...
               \
                \-X

Now with grafts make 'b' be a child of 'X', i.e. modify parent of 'b'
for history to look like below:

    ...---.---a---*   b---.---...
               \     /
                \-X-/

Examine the history using git-log and git-show, check the tree with
git-ls-tree and by examining files, or use a graphical history browser
like gitk.

Then, if possible, use git-filter-branch to make the history recorded in
the grafts file permanent...

HTH
-- 
Jakub Narebski
Poland
ShadeHawk on #git

* Re: Recovering from repository corruption
  2008-06-10 17:55 ` Jakub Narebski
@ 2008-06-10 19:38   ` Denis Bueno
  2008-06-10 19:59     ` Jakub Narebski
  0 siblings, 1 reply; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 19:38 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Git Mailing List

On Tue, Jun 10, 2008 at 13:55, Jakub Narebski <jnareb@gmail.com> wrote:
> Assume that history looks like this
>
>    ...---.---a---*---b---.---...
>
> where '*' marks the corrupted commit (a commit whose tree contains
> corrupted blobs).
>
> First, you can check the commit message for '*' using git-cat-file or
> git-show, and you can get the difference between 'a' and 'b' using
> "git diff a b".  Once you know what the repaired commit 'X' should look
> like, do something like:
>
>  $ git checkout -b <temp-branch> 'a'
>  $ <edit edit edit>
>  $ git commit
>
> Then history would look like this
>
>    ...---.---a---*---b---.---...
>               \
>                \-X
>
> Now with grafts make 'b' be a child of 'X', i.e. modify parent of 'b'
> for history to look like below:
>
>    ...---.---a---*   b---.---...
>               \     /
>                \-X-/
>
> Examine the history using git-log and git-show, check the tree with
> git-ls-tree and by examining files, or use a graphical history browser
> like gitk.
>
> Then, if possible, use git-filter-branch to make the history recorded in
> the grafts file permanent...
>
> HTH
> --
> Jakub Narebski
> Poland
> ShadeHawk on #git
>

Thanks for the help.

My situation was:

    ...---a---*---b---c---d---*---e---...

Following your example, I believe I got this to:

    ...---a---*   b---c---d---*   e---...
           \     /         \     /
            \-X-/           \---/

That is, I replaced the first problematic commit and deleted the
second, since I forgot how I changed 'd' to get that commit.  I put
the following in .git/info/grafts:

    'b' X
    'e' 'd'

(which I gathered from here:
http://thread.gmane.org/gmane.comp.version-control.git/66398/focus=66402.
 I've never used grafts before.  A bit about them should be put in the
manual, if it's not there already. =])
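
For reference, each line of .git/info/grafts names a commit followed by the parents Git should pretend it has, and every ID must be a full 40-character SHA-1; 'b', X, 'd' and 'e' above are stand-ins for the real IDs. A small Python sketch that sanity-checks a grafts file before filter-branch is run. The helper name and the sample IDs are hypothetical, and it assumes blank lines and lines starting with '#' are ignored:

```python
import re
from typing import Dict, List

FULL_SHA1 = re.compile(r"^[0-9a-f]{40}$")


def parse_grafts(text: str) -> Dict[str, List[str]]:
    """Map each grafted commit to the parents that override its recorded ones."""
    grafts: Dict[str, List[str]] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumed to be skipped, as described in the lead-in
        ids = line.split()
        for object_id in ids:
            if not FULL_SHA1.match(object_id):
                raise ValueError(
                    "line %d: not a full SHA-1: %r" % (lineno, object_id)
                )
        # First ID is the commit; the rest (possibly none) are its new parents.
        grafts[ids[0]] = ids[1:]
    return grafts


if __name__ == "__main__":
    # Made-up IDs standing in for 'b' and X.
    sample = "b" * 40 + " " + "a" * 40 + "\n"
    print(parse_grafts(sample))
```

A line with no parents after the commit ID makes that commit look like a root, which is why the sketch accepts an empty parent list rather than rejecting it.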

Then I ran:

    git-filter-branch HEAD ^X ^'d'

Now "git log --raw --all" doesn't show any of the problematic SHA-1
hashes anymore!

However:

    identity.fb[173] > git fsck --full
    error: 320bd6e82267b71dd2ca7043ea3f61dbbca16109: object corrupt or missing
    error: 4d0be2816d5eea5ae2b40990235e2225c1715927: object corrupt or missing
    missing blob 320bd6e82267b71dd2ca7043ea3f61dbbca16109
    missing blob 4d0be2816d5eea5ae2b40990235e2225c1715927

Shouldn't these be unreferenced now that I've run filter-branch?

-- 
                              Denis

* Re: Recovering from repository corruption
  2008-06-10 19:38   ` Denis Bueno
@ 2008-06-10 19:59     ` Jakub Narebski
  2008-06-10 20:03       ` Denis Bueno
  0 siblings, 1 reply; 31+ messages in thread
From: Jakub Narebski @ 2008-06-10 19:59 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List

On Tue, 10 Jun 2008, Denis Bueno wrote:

> However:
> 
>     identity.fb[173] > git fsck --full
>     error: 320bd6e82267b71dd2ca7043ea3f61dbbca16109: object corrupt or missing
>     error: 4d0be2816d5eea5ae2b40990235e2225c1715927: object corrupt or missing
>     missing blob 320bd6e82267b71dd2ca7043ea3f61dbbca16109
>     missing blob 4d0be2816d5eea5ae2b40990235e2225c1715927
> 
> Shouldn't these be unreferenced now that I've run filter-branch?

Try to clone this repository (using file:/// pseudo-protocol to force
transfer of objects instead of hardlinking them), and check if the
problem persists in the clone too.  If not, error/missing might be
in "garbage".

But I'm not sure...
-- 
Jakub Narebski
Poland

* Re: Recovering from repository corruption
  2008-06-10 19:59     ` Jakub Narebski
@ 2008-06-10 20:03       ` Denis Bueno
  2008-06-10 20:14         ` Jakub Narebski
  2008-06-10 20:23         ` Linus Torvalds
  0 siblings, 2 replies; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 20:03 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Git Mailing List

On Tue, Jun 10, 2008 at 15:59, Jakub Narebski <jnareb@gmail.com> wrote:
>> Shouldn't these be unreferenced now that I've run filter-branch?
>
> Try to clone this repository (using file:/// pseudo-protocol to force
> transfer of objects instead of hardlinking them), and check if the
> problem persists in the clone too.  If not, error/missing might be
> in "garbage".
>
> But I'm not sure...

You're onto something:

[dorothy.local /tmp <Tue Jun 10> <16:02:08>]
tmp[176] > git clone file:///Volumes/work/identity.fb/
Initialized empty Git repository in /tmp/identity.fb/.git/
remote: Counting objects: 401, done.
remote: Compressing objects: 100% (364/364), done.
remote: Total 401 (delta 170), reused 0 (delta 0)
Receiving objects: 100% (401/401), 233.76 KiB, done.
Resolving deltas: 100% (170/170), done.

[dorothy.local /tmp <Tue Jun 10> <16:02:22>]
tmp[177] > cd identity.fb/
/tmp/identity.fb

[dorothy.local /tmp/identity.fb <Tue Jun 10> <16:02:24>]
identity.fb[178] > git fsck --full
broken link from  commit 4737fea59fdc8325e09b5206cc7a6ac593446ce3
              to  commit fe431b4b69453ad9207a5528cf9b9d12ef69c988
dangling commit 28aa69aafc8ae901e588f6d341b3e6d3558c6d26
dangling commit 884a8024fbcb9367726abb25f8bb6ac539712d46
missing commit fe431b4b69453ad9207a5528cf9b9d12ef69c988

But I've just substituted one error for another.  Are these errors
easier to fix?


-- 
                              Denis

* Re: Recovering from repository corruption
  2008-06-10 20:03       ` Denis Bueno
@ 2008-06-10 20:14         ` Jakub Narebski
  2008-06-10 20:35           ` Denis Bueno
  2008-06-10 20:23         ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Jakub Narebski @ 2008-06-10 20:14 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List

On Tue, 10 Jun 2008, Denis Bueno wrote:
> On Tue, Jun 10, 2008, Jakub Narebski <jnareb@gmail.com> wrote: 
>> Denis Bueno wrote:
>>>
>>> Shouldn't these be unreferenced now that I've run filter-branch?
>>
>> Try to clone this repository (using file:/// pseudo-protocol to force 
>> transfer of objects instead of hardlinking them), and check if the
>> problem persists in the clone too.  If not, error/missing might be
>> in "garbage".
>>
>> But I'm not sure...
> 
> You're onto something:
> 
> [dorothy.local /tmp <Tue Jun 10> <16:02:08>]
> tmp[176] > git clone file:///Volumes/work/identity.fb/
> Initialized empty Git repository in /tmp/identity.fb/.git/
> remote: Counting objects: 401, done.
> remote: Compressing objects: 100% (364/364), done.
> remote: Total 401 (delta 170), reused 0 (delta 0)
> Receiving objects: 100% (401/401), 233.76 KiB, done.
> Resolving deltas: 100% (170/170), done.
> 
> [dorothy.local /tmp <Tue Jun 10> <16:02:22>]
> tmp[177] > cd identity.fb/
> /tmp/identity.fb
> 
> [dorothy.local /tmp/identity.fb <Tue Jun 10> <16:02:24>]
> identity.fb[178] > git fsck --full
> broken link from  commit 4737fea59fdc8325e09b5206cc7a6ac593446ce3
>               to  commit fe431b4b69453ad9207a5528cf9b9d12ef69c988
> dangling commit 28aa69aafc8ae901e588f6d341b3e6d3558c6d26
> dangling commit 884a8024fbcb9367726abb25f8bb6ac539712d46
> missing commit fe431b4b69453ad9207a5528cf9b9d12ef69c988
> 
> But I've just substituted one error for another.  Are these errors
> easier to fix?

Please remember that in such clone you _don't_ have grafts info (unless
you copy it manually), so it is a good test if you correctly rewrote 
history using git-filter-branch.  So take a look at history in your 
clone using gitk or some similar tool.

In the history you mentioned:

    ...---a---*   b---c---d---*   e---...
           \     /         \     /
            \-X-/           \---/

you should rewrite from 'a'=='X^' up to and including 'e' (and not only
from 'd').


But if it is not the case I'm afraid I wouldn't be able to offer any 
further insight...

-- 
Jakub Narebski
Poland

* Re: Recovering from repository corruption
  2008-06-10 20:14         ` Jakub Narebski
@ 2008-06-10 20:35           ` Denis Bueno
  0 siblings, 0 replies; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 20:35 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Git Mailing List

On Tue, Jun 10, 2008 at 16:14, Jakub Narebski <jnareb@gmail.com> wrote:
> Please remember that in such clone you _don't_ have grafts info (unless
> you copy it manually), so it is a good test if you correctly rewrote
> history using git-filter-branch.  So take a look at history in your
> clone using gitk or some similar tool.
>
> In the history you mentioned:
>
>    ...---a---*   b---c---d---*   e---...
>           \     /         \     /
>            \-X-/           \---/
>
> you should rewrite from 'a'=='X^' up to and including 'e' (and not only
> from 'd').

So I re-did the filter-branch as:

    git-filter-branch HEAD \
        28aa69aafc8ae901e588f6d341b3e6d3558c6d26^..163a93df14d246dee91c3a503e6372b8313f337d

Now cloning still works and only shows dangling commits --- no errors!

-- 
                              Denis

* Re: Recovering from repository corruption
  2008-06-10 20:03       ` Denis Bueno
  2008-06-10 20:14         ` Jakub Narebski
@ 2008-06-10 20:23         ` Linus Torvalds
  2008-06-10 20:28           ` Denis Bueno
  1 sibling, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2008-06-10 20:23 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Jakub Narebski, Git Mailing List

On Tue, 10 Jun 2008, Denis Bueno wrote:
>
> You're onto something:
> 
> [dorothy.local /tmp <Tue Jun 10> <16:02:08>]
> tmp[176] > git clone file:///Volumes/work/identity.fb/

[ successful ]

Hmm. Scary. That should *not* have been successful with a corrupt repo.

Unless you have done a .grafts file to hide the corruption, or something 
like that?

Have you saved away the original corrupt repo (the whole .git directory as 
a tar-ball, for example)? And is the data public and non-embarrassing 
enough so that you could make it available for some post-corruption 
analysis? Even if we cannot help recover it, real-life corruption is 
always interesting to see if only as a test-case to make sure that git 
notices it as quickly as possible.

			Linus

* Re: Recovering from repository corruption
  2008-06-10 20:23         ` Linus Torvalds
@ 2008-06-10 20:28           ` Denis Bueno
  2008-06-10 21:09             ` Linus Torvalds
  0 siblings, 1 reply; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 20:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

On Tue, Jun 10, 2008 at 16:23, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Tue, 10 Jun 2008, Denis Bueno wrote:
>>
>> You're onto something:
>>
>> [dorothy.local /tmp <Tue Jun 10> <16:02:08>]
>> tmp[176] > git clone file:///Volumes/work/identity.fb/
>
> [ successful ]
>
> Hmm. Scary. That should *not* have been successful with a corrupt repo.
>
> Unless you have done a .grafts file to hide the corruption, or something
> like that?

I intended to do that, yes, and I think I was successful.  (I only say
I "intended to" --- instead of "I did" --- because I read the
documentation for the grafts file elsewhere on this list, and not in
some more "blessed" location.)

> Have you saved away the original corrupt repo (the whole .git directory as
> a tar-ball, for example)? And is the data public and non-embarrassing
> enough so that you could make it available for some post-corruption
> analysis? Even if we cannot help recover it, real-life corruption is
> always interesting to see if only as a test-case to make sure that git
> notices it as quickly as possible.

I do have bunches of personal information in the repo, unfortunately.
The particular *file* involved in the corruption, however, is fine for
all to view.  Is that useful?


-- 
                              Denis

* Re: Recovering from repository corruption
  2008-06-10 20:28           ` Denis Bueno
@ 2008-06-10 21:09             ` Linus Torvalds
  2008-06-10 21:22               ` Denis Bueno
                                 ` (3 more replies)
  0 siblings, 4 replies; 31+ messages in thread
From: Linus Torvalds @ 2008-06-10 21:09 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List

On Tue, 10 Jun 2008, Denis Bueno wrote:
> >
> > Hmm. Scary. That should *not* have been successful with a corrupt repo.
> >
> > Unless you have done a .grafts file to hide the corruption, or something
> > like that?
> 
> I intended to do that, yes, and I think I was successful.

Ahh, ok. Yes, we should probably re-think our 'grafts' file thing, or at 
least not document it, because it's actually a wonderful way to just cause 
more corruption by hiding things (ie if you clone a repo with a grafts 
file, the result will now have neither the grafts file _nor_ the state 
that was hidden by it, so the result is guaranteed to be corrupt).

But that explains why your clone worked, and why the resulting repo had 
different corruption - it avoided the original corruption, but because of 
the grafts file it avoided it by just not having those commits at all..

> I do have bunches of personal information in the repo, unfortunately.
> The particular *file* involved in the corruption, however, is fine for
> all to view.  Is that useful?

No, almost all the interest is basically in how the whole repo ties 
together. The individual corrupt files may be interesting, though, ie from 
your original report:

    error: 320bd6e82267b71dd2ca7043ea3f61dbbca16109: object corrupt or missing
    error: 4d0be2816d5eea5ae2b40990235e2225c1715927: object corrupt or missing

then *if* you have the files

	.git/objects/32/0bd6e82267b71dd2ca7043ea3f61dbbca16109
	.git/objects/4d/0be2816d5eea5ae2b40990235e2225c1715927

then those two files are interesting in themselves (most likely they are 
not there at all, or are zero-sized, but if you have them, please post 
them).
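
The mapping from an object ID to those file names is mechanical: the first two hex digits name the directory and the remaining 38 name the file. A minimal Python sketch, assuming the objects are loose (not packed) and using hypothetical helper names:

```python
import os


def loose_object_path(git_dir: str, sha1: str) -> str:
    """Path of a loose object: objects/<first 2 hex digits>/<remaining 38>."""
    sha1 = sha1.lower()
    if len(sha1) != 40 or any(c not in "0123456789abcdef" for c in sha1):
        raise ValueError("expected a full 40-character SHA-1")
    return os.path.join(git_dir, "objects", sha1[:2], sha1[2:])


def loose_object_state(git_dir: str, sha1: str) -> str:
    """Classify the file by existence and size only: missing, empty or present."""
    path = loose_object_path(git_dir, sha1)
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    return "present"


if __name__ == "__main__":
    for object_id in (
        "320bd6e82267b71dd2ca7043ea3f61dbbca16109",
        "4d0be2816d5eea5ae2b40990235e2225c1715927",
    ):
        print(loose_object_path(".git", object_id),
              loose_object_state(".git", object_id))
```

"present" says nothing about whether the contents are intact; it only separates the three cases mentioned above (not there, zero-sized, or something worth posting).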

And as this was a result of a real filesystem crash, it *is* possible that 
you have something in the /lost+found directory for that filesystem. If 
so, those missing files may be found there.

		Linus

* Re: Recovering from repository corruption
  2008-06-10 21:09             ` Linus Torvalds
@ 2008-06-10 21:22               ` Denis Bueno
  2008-06-10 21:48                 ` Linus Torvalds
  2008-06-10 21:27               ` Denis Bueno
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 21:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

[-- Attachment #1: Type: text/plain, Size: 1054 bytes --]

On Tue, Jun 10, 2008 at 17:09, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> No, almost all the interest is basically in how the whole repo ties
> together. The individual corrupt files may be interesting, though, ie from
> your original report:
>
>    error: 320bd6e82267b71dd2ca7043ea3f61dbbca16109: object corrupt or missing
>    error: 4d0be2816d5eea5ae2b40990235e2225c1715927: object corrupt or missing
>
> then *if* you have the files
>
>        .git/objects/32/0bd6e82267b71dd2ca7043ea3f61dbbca16109
>        .git/objects/4d/0be2816d5eea5ae2b40990235e2225c1715927
>
> then those two files are interesting in themselves (most likely they are
> not there at all, or are zero-sized, but if you have them, please post
> them).

They are attached, and they are not zero-sized.

> And as this was a result of a real filesystem crash, it *is* possible that
> you have something in the /lost+found directory for that filesystem. If
> so, those missing files may be found there.

I checked; no such luck.

-- 
                              Denis

[-- Attachment #2: 0bd6e82267b71dd2ca7043ea3f61dbbca16109 --]
[-- Type: application/octet-stream, Size: 2145 bytes --]

[-- Attachment #3: 0be2816d5eea5ae2b40990235e2225c1715927 --]
[-- Type: application/octet-stream, Size: 2110 bytes --]

* Re: Recovering from repository corruption
  2008-06-10 21:22               ` Denis Bueno
@ 2008-06-10 21:48                 ` Linus Torvalds
  2008-06-10 22:09                   ` Denis Bueno
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2008-06-10 21:48 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List

On Tue, 10 Jun 2008, Denis Bueno wrote:
> >
> > then *if* you have the files
> >
> >        .git/objects/32/0bd6e82267b71dd2ca7043ea3f61dbbca16109
> >        .git/objects/4d/0be2816d5eea5ae2b40990235e2225c1715927
> >
> > then those two files are interesting in themselves (most likely they are
> > not there at all, or are zero-sized, but if you have them, please post
> > them).
> 
> They are attached, and they are not zero-sized.

Very interesting.

Both of them look fairly sane as objects (ie random - it's supposed to be 
zlib-compressed), but both of them have the first 512 bytes *identically* 
corrupted:

	0000000 6564 626e 6575 406e 6f64 6f72 6874 2e79
	          d   e   n   b   u   e   n   @   d   o   r   o   t   h   y   .
	0000020 6f6c 6163 2e6c 3634 0033 0000 0000 0000
	          l   o   c   a   l   .   4   6   3  \0  \0  \0  \0  \0  \0  \0
	0000040 0000 0000 0000 0000 0000 0000 0000 0000
	         \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
	*

ie it's an all-zero block, except for that email-looking thing at the 
head. 
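
A file that starts with "de" cannot be a valid loose object, because a zlib stream opens with a two-byte header that those bytes do not satisfy. A rough Python sketch of the check, assuming the standard loose-object layout (a zlib stream whose inflated bytes hash to the object's name); the function names are made up for illustration:

```python
import hashlib
import zlib


def has_zlib_header(data: bytes) -> bool:
    """Check the two-byte zlib header described in RFC 1950."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    method_is_deflate = (cmf & 0x0F) == 8      # compression method 8 = deflate
    window_ok = (cmf >> 4) <= 7                # window size field at most 7
    check_ok = ((cmf << 8) | flg) % 31 == 0    # header check bits
    return method_is_deflate and window_ok and check_ok


def verify_loose_object(raw: bytes, expected_sha1: str) -> str:
    """Return 'ok' or a short description of the first problem found."""
    if not has_zlib_header(raw):
        return "bad zlib header"
    try:
        inflated = zlib.decompress(raw)
    except zlib.error:
        return "zlib stream damaged"
    # The object name is the SHA-1 of the inflated "<type> <size>\0<content>".
    if hashlib.sha1(inflated).hexdigest() != expected_sha1.lower():
        return "SHA-1 mismatch"
    return "ok"
```

With the opening block overwritten, the header test fails at once and the rest of the stream cannot be inflated, which fits the observation that the remaining bytes look plausible but cannot be verified.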

Sadly, I don't think there is any way to find the missing block that got 
overwritten. And quite frankly, there's no way to really know whether the 
rest was really fine either - it just looks more likely, but quite 
frankly, it could have been random old contents on your disk too that just 
happens to look like the expected random pattern (which you'll get with 
any compression format - compression by definition removes patterns).

One thing that strikes me is that you seem to be really prone to this 
problem, since it happened to you a year ago too. I cannot swear to this, 
but I literally suspect your last case (July-2007) was the previous time 
we had a corruption issue. Why does it seem to happen to you, but not 
others?

Do you have some odd filesystem in play? Was the current corruption in a 
similar environment as the old one? IOW, I'm trying to find a pattern 
here, to see if there might be something we can do about it..

But it *sounds* like the objects you lost were literally old ones, no? Ie 
the lost stuff wasn't something you had committed in the last five minutes 
or so? If so, then you really do seem to have a filesystem that corrupts 
*old* files when it crashes. That's fairly scary. What FS is it?

		Linus

* Re: Recovering from repository corruption
  2008-06-10 21:48                 ` Linus Torvalds
@ 2008-06-10 22:09                   ` Denis Bueno
  2008-06-10 22:25                     ` Tarmigan
  2008-06-10 22:45                     ` Linus Torvalds
  0 siblings, 2 replies; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 22:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

On Tue, Jun 10, 2008 at 17:48, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Tue, 10 Jun 2008, Denis Bueno wrote:
>> >
>> > then *if* you have the files
>> >
>> >        .git/objects/32/0bd6e82267b71dd2ca7043ea3f61dbbca16109
>> >        .git/objects/4d/0be2816d5eea5ae2b40990235e2225c1715927
>> >
>> > then those two files are interesting in themselves (most likely they are
>> > not there at all, or are zero-sized, but if you have them, please post
>> > them).
>>
>> They are attached, and they are not zero-sized.
>
> Very interesting.
>
> Both of them look fairly sane as objects (ie random - it's supposed to be
> zlib-compressed), but both of them have the first 512 bytes *identically*
> corrupted:
>
>        0000000 6564 626e 6575 406e 6f64 6f72 6874 2e79
>                  d   e   n   b   u   e   n   @   d   o   r   o   t   h   y   .
>        0000020 6f6c 6163 2e6c 3634 0033 0000 0000 0000
>                  l   o   c   a   l   .   4   6   3  \0  \0  \0  \0  \0  \0  \0
>        0000040 0000 0000 0000 0000 0000 0000 0000 0000
>                 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
>        *
>
> ie it's an all-zero block, except for that email-looking thing at the
> head.

Right --- that's my username and computer's hostname... for some
reason.  [You are not expected to understand this.  My computer's name
mysteriously changed.  It should not be "dorothy.local" but it is.  I
will have to find out why....]

> One thing that strikes me is that you seem to be really prone to this
> problem, since it happened to you a year ago too. I cannot swear to this,
> but I literally suspect your last case (July-2007) was the previous time
> we had a corruption issue. Why does it seem to happen to you, but not
> others?

It is the same computer on which the problem occurred last time.  It's
an OS X 10.4 macbook pro.  I haven't noticed corruption in other
places, but it's fair to assume it's occurring.  I'll have to boot off
my install disk and fsck the drive....

> Do you have some odd filesystem in play? Was the current corruption in a
> similar environment as the old one? IOW, I'm trying to find a pattern
> here, to see if there might be something we can do about it..

I can't remember if the old one happened after a panic or not, but I'd
bet it did.  The filesystem is HFS+, as indeed most OS X 10.4
installations are.  Maybe the HD has been going south?  However, that
doesn't seem likely, since when I got the computer it was new, and
that was around Jun 2007.

> But it *sounds* like the objects you lost were literally old ones, no? Ie
> the lost stuff wasn't something you had committed in the last five minutes
> or so? If so, then you really do seem to have a filesystem that corrupts
> *old* files when it crashes. That's fairly scary. What FS is it?

No, in fact I had just committed those changes not 10 minutes before
the panic.  Last time they were also fresh changes, although perhaps
older than 10 minutes.  I can't remember.


-- 
 Denis

* Re: Recovering from repository corruption
  2008-06-10 22:09                   ` Denis Bueno
@ 2008-06-10 22:25                     ` Tarmigan
  2008-06-10 22:41                       ` Denis Bueno
  2008-06-10 22:45                     ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Tarmigan @ 2008-06-10 22:25 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Linus Torvalds, Git Mailing List

On Tue, Jun 10, 2008 at 3:09 PM, Denis Bueno <dbueno@gmail.com> wrote:
> It is the same computer on which the problem occurred last time.  It's
> an OS X 10.4 macbook pro.  I haven't noticed corruption in other
> places, but it's fair to assume it's occurring.  I'll have to boot off
> my install disk and fsck the drive....

Do you have fink installed?  Do you have the openssl fink package
installed?  Vger seems to have swallowed my original reply, but see
this thread:
http://marc.info/?l=git&m=120787191106549&w=2
If so, try removing the fink openssl packages and reinstalling git.

Do you push from this machine often?  If you do, then this probably is
not your problem as you would have seen it earlier.

-Tarmigan

* Re: Recovering from repository corruption
  2008-06-10 22:25                     ` Tarmigan
@ 2008-06-10 22:41                       ` Denis Bueno
  0 siblings, 0 replies; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 22:41 UTC (permalink / raw)
  To: Tarmigan; +Cc: Linus Torvalds, Git Mailing List

On Tue, Jun 10, 2008 at 18:25, Tarmigan <tarmigan+git@gmail.com> wrote:
> Do you have fink installed?  Do you have the openssl fink package
> installed?  Vger seems to have swallowed my original reply, but see
> this thread:
> http://marc.info/?l=git&m=120787191106549&w=2
> If so, try removing the fink openssl packages and reinstalling git.

I use macports.

> Do you push from this machine often?  If you do, then this probably is
> not your problem as you would have seen it earlier.

Yes, almost exclusively.  ... That is an odd problem.  Thanks for the
suggestion.

-- 
 Denis

* Re: Recovering from repository corruption
  2008-06-10 22:09                   ` Denis Bueno
  2008-06-10 22:25                     ` Tarmigan
@ 2008-06-10 22:45                     ` Linus Torvalds
  2008-06-10 23:00                       ` Linus Torvalds
  2008-06-11  0:43                       ` Nicolas Pitre
  1 sibling, 2 replies; 31+ messages in thread
From: Linus Torvalds @ 2008-06-10 22:45 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List

On Tue, 10 Jun 2008, Denis Bueno wrote:
> 
> > Do you have some odd filesystem in play? Was the current corruption in a
> > similar environment as the old one? IOW, I'm trying to find a pattern
> > here, to see if there might be something we can do about it..
> 
> I can't remember if the old one happened after a panic or not, but I'd
> bet it did.  The filesystem is HFS+, as indeed most OS X 10.4
> installations are.  Maybe the HD has been going south?  However, that
> doesn't seem likely, since when I got the computer it was new, and
> that was around Jun 2007.

Yeah, it's almost certainly not the disk. Disks do go bad, but the 
behavior tends to be rather different when they do (usually you will get 
read errors with uncorrectable CRC failures, and you'd know that _very_ 
clearly).

Sure, I could imagine something like the sector remapping could be flaking 
out on you, but that sounds really unlikely. Especially since:

> > But it *sounds* like the objects you lost were literally old ones, no? Ie
> > the lost stuff wasn't something you had committed in the last five minutes
> > or so? If so, then you really do seem to have a filesystem that corrupts
> > *old* files when it crashes. That's fairly scary. What FS is it?
> 
> No, in fact I had just committed those changes not 10 minutes before
> the panic.  Last time they were also fresh changes, although perhaps
> older than 10 minutes.  I can't remember.

Oh, ok. If so, then this is much less worrisome, and is in fact almost 
"normal" HFS+ behaviour. It is a journaling filesystem, but it only 
journals metadata, so the filenames and inodes will be fine after a crash, 
but the contents will be random.

[ Yeah, yeah, I know - it sounds rather stupid, but it's a common kind of 
  stupidity. The journaling essentially protects the only thing that fsck 
  can find. Ext3 does similar things in "writeback" mode - but you should 
  use "data=ordered" which writes out the data before metadata.

  Basically, such journaling doesn't help data integrity per se, but it 
  does mean that the metadata is ok, and that in turn means that while the 
  file contents won't be dependable, at least things like free block 
  bitmaps etc hopefully are.

  That in turn hopefully means that new file allocations won't be 
  crapping out all over old ones etc due to bad resource allocations, so 
  while it doesn't mean that the data is trust-worthy, it at least means 
  that you can trust _some_ things ]

If your machine crashes often, you could trivially add a "sync" to your 
commit hook. That would make things better. And maybe we should have a 
"safe mode" that does these things more carefully. You would definitely 
want to turn it on on that machine.

Are you doing something special to make the machine crash so much? Or do 
OS X machines always crash, and Apple PR is just so good that people 
aren't aware of it?

Anyway, I'll think about sane ways to add a "safe" mode without making it 
_too_ painful. In the meantime, here's a trial patch that you should 
probably use. It does slow things down, but hopefully not too much.

(I really don't much like it - but I think this is a good change, and I 
just need to come up with a better way to do the fsync() than to be 
totally synchronous about it.)

It's going to make big "git add" calls *much* slower, so I'm not very 
happy about it (especially since we don't actually care that deeply about 
the files really being there until much later, so doing something 
asynchronous would be perfectly acceptable), but for you this is 
definitely worth-while.

			Linus

---
 sha1_file.c |   17 +++++++++++------
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index adcf37c..86a653b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2105,6 +2105,15 @@ int hash_sha1_file(const void *buf, unsigned long len, const char *type,
 	return 0;
 }

+/* Finalize a file on disk, and close it. */
+static void close_sha1_file(int fd)
+{
+	fsync_or_die(fd, "sha1 file");
+	fchmod(fd, 0444);
+	if (close(fd) != 0)
+		die("unable to write sha1 file");
+}
+
 static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 			      void *buf, unsigned long len, time_t mtime)
 {
@@ -2170,9 +2179,7 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,

 	if (write_buffer(fd, compressed, size) < 0)
 		die("unable to write sha1 file");
-	fchmod(fd, 0444);
-	if (close(fd))
-		die("unable to write sha1 file");
+	close_sha1_file(fd);
 	free(compressed);

 	if (mtime) {
@@ -2350,9 +2357,7 @@ int write_sha1_from_fd(const unsigned char *sha1, int fd, char *buffer,
 	} while (1);
 	inflateEnd(&stream);

-	fchmod(local, 0444);
-	if (close(local) != 0)
-		die("unable to write sha1 file");
+	close_sha1_file(local);
 	SHA1_Final(real_sha1, &c);
 	if (ret != Z_STREAM_END) {
 		unlink(tmpfile);

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: Recovering from repository corruption
  2008-06-10 22:45                     ` Linus Torvalds
@ 2008-06-10 23:00                       ` Linus Torvalds
  2008-06-11  0:43                       ` Nicolas Pitre
  1 sibling, 0 replies; 31+ messages in thread
From: Linus Torvalds @ 2008-06-10 23:00 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List



On Tue, 10 Jun 2008, Linus Torvalds wrote:
> 
> It's going to make big "git add" calls *much* slower, so I'm not very 
> happy about it (especially since we don't actually care that deeply about 
> the files really being there until much later, so doing something 
> asynchronous would be perfectly acceptable), but for you this is 
> definitely worth-while.

For me, on the whole kernel, on a pretty good system:

 - before:

	[torvalds@woody test-it-out]$ time git add .

	real    0m7.986s
	user    0m6.404s
	sys     0m1.456s

 - after:

	[torvalds@woody test-it-out]$ time ~/git/git-add .

	real    0m52.693s
	user    0m7.416s
	sys     0m2.516s

so it's definitely quite noticeable in that simplistic form. 

A more interesting patch would use aio_fsync(), and then just wait for 
them at the end with aio_return(). Not that I love AIO, but this is 
definitely a case where it would make sense to do (of course, systems 
without AIO support would then fall back to regular fsync()).
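
A rough sketch of that asynchronous variant, assuming POSIX AIO is 
available. start_sync() and finish_sync() are hypothetical helper names 
made up for this illustration, not anything in git. Note that aio_return() 
only collects the result of a request that has already finished, so the 
sketch does its waiting with aio_suspend(). On older glibc versions this 
may need -lrt at link time.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one outstanding sync request. */
struct pending_sync {
	struct aiocb cb;
	int queued;	/* 1 while an aio_fsync() request is in flight */
};

/*
 * Queue an asynchronous fsync for fd. If the request cannot be queued,
 * fall back to a plain synchronous fsync(). Returns 0 on success.
 */
static int start_sync(struct pending_sync *ps, int fd)
{
	assert(ps != NULL);
	memset(ps, 0, sizeof(*ps));
	ps->cb.aio_fildes = fd;
	ps->cb.aio_sigevent.sigev_notify = SIGEV_NONE;

	if (aio_fsync(O_SYNC, &ps->cb) == 0) {
		ps->queued = 1;
		return 0;
	}
	return fsync(fd);
}

/*
 * Wait for a request queued by start_sync() and collect its result.
 * Returns 0 if the data reached stable storage, -1 otherwise.
 */
static int finish_sync(struct pending_sync *ps)
{
	const struct aiocb *const list[1] = { &ps->cb };
	int err;
	ssize_t ret;

	if (!ps->queued)
		return 0;	/* the synchronous fallback already ran */

	while ((err = aio_error(&ps->cb)) == EINPROGRESS) {
		if (aio_suspend(list, 1, NULL) != 0 && errno != EINTR)
			return -1;
	}

	ret = aio_return(&ps->cb);	/* releases the request's resources */
	ps->queued = 0;
	if (err != 0) {
		errno = err;
		return -1;
	}
	return ret == 0 ? 0 : -1;
}
```

A caller would invoke start_sync() once per object file as it is written 
and only call finish_sync() on each of them at the very end, so the disk 
can work on the flushes while the remaining objects are still being 
hashed and compressed.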

I will have to think about this.

			Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Recovering from repository corruption
  2008-06-10 22:45                     ` Linus Torvalds
  2008-06-10 23:00                       ` Linus Torvalds
@ 2008-06-11  0:43                       ` Nicolas Pitre
  2008-06-11  1:39                         ` Linus Torvalds
  1 sibling, 1 reply; 31+ messages in thread
From: Nicolas Pitre @ 2008-06-11  0:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Denis Bueno, Git Mailing List

On Tue, 10 Jun 2008, Linus Torvalds wrote:

> Anyway, I'll think about sane ways to add a "safe" mode without making it 
> _too_ painful. In the meantime, here's a trial patch that you should 
> probably use. It does slow things down, but hopefully not too much.
> 
> (I really don't much like it - but I think this is a good change, and I 
> just need to come up with a better way to do the fsync() than to be 
> totally synchronous about it.)
> 
> It's going to make big "git add" calls *much* slower, so I'm not very 
> happy about it (especially since we don't actually care that deeply about 
> the files really being there until much later, so doing something 
> asynchronous would be perfectly acceptable), but for you this is 
> definitely worth-while.

I don't like it at all.

I think this only gives a false sense of security with a huge 
performance cost.  If the machine crashes at the right moment, the 
object will still be half written/fsync'd and you'll be in the same 
situation again.

And because we don't overwrite existing objects (again for performance 
reasons), then a corrupted blob object will remain corrupted even if you 
reattempt the commit later.  So doing the fsync only when the commit 
object is written isn't a good solution either.

I wonder if supporting crashy systems is worth that cost.  If Denis' 
laptop is the odd case then a sync in the commit hook might be plenty 
sufficient.  Personally I'd simply replace the OS or the machine with 
something more reliable.

Nicolas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Recovering from repository corruption
  2008-06-11  0:43                       ` Nicolas Pitre
@ 2008-06-11  1:39                         ` Linus Torvalds
  2008-06-11  1:47                           ` Nicolas Pitre
  0 siblings, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2008-06-11  1:39 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Denis Bueno, Git Mailing List



On Tue, 10 Jun 2008, Nicolas Pitre wrote:
> 
> I think this only gives a false sense of security with a huge 
> performance cost.  If the machine crashes at the right moment, the 
> object will still be half written/fsync'd and you'll be in the same 
> situation again.

No you wouldn't.

We do the write and the fsync() of the write to a _temporary_ filename. We 
do the rename _after_ the fsync.

So you'd never have a half-written object file.
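
The ordering described here (write under a temporary name, fsync(), and 
only then rename()) can be sketched as follows. This is a minimal 
illustration and not git's actual code: write_file_durably() and the 
".tmp_XXXXXX" suffix are hypothetical names chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write buf to a temporary file next to final_path,
 * flush it to disk, and only then give it its final name.
 * Returns 0 on success, -1 on any error.
 */
static int write_file_durably(const char *final_path, const void *buf, size_t len)
{
	char tmp[4096];
	const char *p = buf;
	int fd, n;

	assert(final_path != NULL && buf != NULL);

	/* mkstemp() requires the template to end in "XXXXXX". */
	n = snprintf(tmp, sizeof(tmp), "%s.tmp_XXXXXX", final_path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	fd = mkstemp(tmp);
	if (fd < 0)
		return -1;

	/* write() may be short, so loop until everything is out. */
	while (len > 0) {
		ssize_t written = write(fd, p, len);
		if (written < 0)
			goto fail;
		p += written;
		len -= (size_t)written;
	}

	/* The contents must be on disk before the final name exists. */
	if (fsync(fd) != 0)
		goto fail;
	if (close(fd) != 0) {
		fd = -1;
		goto fail;
	}
	fd = -1;

	if (rename(tmp, final_path) != 0)
		goto fail;
	return 0;

fail:
	if (fd >= 0)
		close(fd);
	unlink(tmp);
	return -1;
}
```

Because the rename() happens only after fsync() has succeeded, a crash 
should leave either no file under the final name or a complete one. A 
stricter variant might also fsync() the containing directory so that the 
rename itself is durable; that step is left out of this sketch.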

That said, I do agree that the bigger problem is that Denis' machine is 
simply so unreliable.

			Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Recovering from repository corruption
  2008-06-11  1:39                         ` Linus Torvalds
@ 2008-06-11  1:47                           ` Nicolas Pitre
  0 siblings, 0 replies; 31+ messages in thread
From: Nicolas Pitre @ 2008-06-11  1:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Denis Bueno, Git Mailing List

On Tue, 10 Jun 2008, Linus Torvalds wrote:

> 
> 
> On Tue, 10 Jun 2008, Nicolas Pitre wrote:
> > 
> > I think this only gives a false sense of security with a huge 
> > performance cost.  If the machine crashes at the right moment, the 
> > object will still be half written/fsync'd and you'll be in the same 
> > situation again.
> 
> No you wouldn't.
> 
> We do the write and the fsync() of the write to a _temporary_ filename. We 
> do the rename _after_ the fsync.

Ah, true.  That part somehow evaded my mind.


Nicolas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Recovering from repository corruption
  2008-06-10 21:09             ` Linus Torvalds
  2008-06-10 21:22               ` Denis Bueno
@ 2008-06-10 21:27               ` Denis Bueno
  2008-06-10 22:52               ` Junio C Hamano
  2008-06-11 23:21               ` To graft or not to graft... (Re: Recovering from repository corruption) Stephen R. van den Berg
  3 siblings, 0 replies; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 21:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List

On Tue, Jun 10, 2008 at 17:09, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Ahh, ok. Yes, we should probably re-think our 'grafts' file thing, or at
> least not document it, because it's actually a wonderful way to just cause
> more corruption by hiding things (ie if you clone a repo with a grafts
> file, the result will now have neither the grafts file _nor_ the state
> that was hidden by it, so the result is guaranteed to be corrupt).

I'd argue in favor of documenting it, even if it's dangerous, unless
there's some other mechanism (rebase?) that would let me do what I
did?  That is, to recover from corruption in a way that lets me
regenerate or ignore inexact, corrupted commits.

-- 
 Denis

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Recovering from repository corruption
  2008-06-10 21:09             ` Linus Torvalds
  2008-06-10 21:22               ` Denis Bueno
  2008-06-10 21:27               ` Denis Bueno
@ 2008-06-10 22:52               ` Junio C Hamano
  2008-06-11 23:21               ` To graft or not to graft... (Re: Recovering from repository corruption) Stephen R. van den Berg
  3 siblings, 0 replies; 31+ messages in thread
From: Junio C Hamano @ 2008-06-10 22:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Denis Bueno, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> Ahh, ok. Yes, we should probably re-think our 'grafts' file thing, or at 
> least not document it, because it's actually a wonderful way to just cause 
> more corruption by hiding things (ie if you clone a repo with a grafts 
> file, the result will now have neither the grafts file _nor_ the state 
> that was hidden by it, so the result is guaranteed to be corrupt).

"Graft and then clone" will not make the copied repository Ok.  You need
to propagate the graft in some other way.

However, "Graft and then filter-branch" is a way to hide and get rid of
the broken thing in history etched in the objects.  After that the
repository itself and a clone from it will not need the graft.  So I'd
rather argue we should document it _differently_ (or just _better_) than
not document it.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* To graft or not to graft... (Re: Recovering from repository corruption)
  2008-06-10 21:09             ` Linus Torvalds
                                 ` (2 preceding siblings ...)
  2008-06-10 22:52               ` Junio C Hamano
@ 2008-06-11 23:21               ` Stephen R. van den Berg
  2008-06-11 23:34                 ` Jakub Narebski
  2008-06-11 23:39                 ` Linus Torvalds
  3 siblings, 2 replies; 31+ messages in thread
From: Stephen R. van den Berg @ 2008-06-11 23:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Denis Bueno, Git Mailing List

Linus Torvalds wrote:
>more corruption by hiding things (ie if you clone a repo with a grafts 
>file, the result will now have neither the grafts file _nor_ the state 
>that was hidden by it, so the result is guaranteed to be corrupt).

This is kind of confusing.
As I understood it from the few shreds of documentation that actually
mention the grafts file, the grafts file is *not* being cloned.
Therefore, my assumption was that cloning a repository that has a grafts
file gives an identical result to cloning the same repository *without*
the grafts file present.

As I understand it now, the cloning process actually peeks at the grafts
file while cloning, and then doesn't copy it.  This results in a rather
confusingly corrupt clone.

I suggest two things:
a. That during the cloning process, the grafts file is completely
   disregarded in any case at first.
b. Preferably the grafts file is copied as well (after cloning).  I
   never really understood why the file is not being copied in the first
   place (anyone care to explain that?).
-- 
Sincerely,
           Stephen R. van den Berg.

Differentiation is an integral part of calculus.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: To graft or not to graft... (Re: Recovering from repository corruption)
  2008-06-11 23:21               ` To graft or not to graft... (Re: Recovering from repository corruption) Stephen R. van den Berg
@ 2008-06-11 23:34                 ` Jakub Narebski
  2008-06-11 23:39                 ` Linus Torvalds
  1 sibling, 0 replies; 31+ messages in thread
From: Jakub Narebski @ 2008-06-11 23:34 UTC (permalink / raw)
  To: Stephen R. van den Berg; +Cc: Linus Torvalds, Denis Bueno, Git Mailing List

"Stephen R. van den Berg" <srb@cuci.nl> writes:

> This is kind of confusing.
>
> As I understood it from the few shreds of documentation that actually
> mention the grafts file, the grafts file is *not* being cloned.
> Therefore, my assumption was that cloning a repository that has a grafts
> file gives an identical result to cloning the same repository *without*
> the grafts file present.
> 
> As I understand it now, the cloning process actually peeks at the grafts
> file while cloning, and then doesn't copy it.  This results in a rather
> confusingly corrupt clone.
> 
> I suggest two things:
> a. That during the cloning process, the grafts file is completely
>    disregarded in any case at first.
> b. Preferably the grafts file is copied as well (after cloning).  I
>    never really understood why the file is not being copied in the first
>    place (anyone care to explain that?).

A bit of explanation: initially, I think, grafts were created as a means
to "graft" the historical repository (conversion from BitKeeper and from
patches) onto the current work repository (from when git was deemed
suitable as the SCM for Linux kernel development).  Nevertheless the
mechanism is generic enough to change history _locally_ in many strange
ways (for example, shallow clone uses a kind of graft).

Because a grafts file can be used to alter history, it totally
_bypasses_ the check given by the sha1 of a commit and by
cryptographically signed tags.  It negates the security given by sha-1
signing.  That's why using grafts must be a _conscious_ decision -
therefore they are purely local and not propagated.

(Also there was no place for grafts in the "smart" transports, i.e. the
git and ssh protocols.  Think about what happens if both sides have
grafts files which differ...)

On the other hand history _without_ grafts might not validate.  I
think that is why we have the current confusing behavior...

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: To graft or not to graft... (Re: Recovering from repository corruption)
  2008-06-11 23:21               ` To graft or not to graft... (Re: Recovering from repository corruption) Stephen R. van den Berg
  2008-06-11 23:34                 ` Jakub Narebski
@ 2008-06-11 23:39                 ` Linus Torvalds
  2008-06-12  7:14                   ` Johan Herland
  1 sibling, 1 reply; 31+ messages in thread
From: Linus Torvalds @ 2008-06-11 23:39 UTC (permalink / raw)
  To: Stephen R. van den Berg; +Cc: Denis Bueno, Git Mailing List



On Thu, 12 Jun 2008, Stephen R. van den Berg wrote:
>
> As I understood it from the few shreds of documentation that actually
> mention the grafts file, the grafts file is *not* being cloned.
> Therefore, my assumption was that cloning a repository that has a grafts
> file gives an identical result to cloning the same repository *without*
> the grafts file present.

That would probably be the right behaviour, but no - all our commit 
walkers honor the grafts file.

Including the ones used for creating pack-files and thus a clone.

> As I understand it now, the cloning process actually peeks at the grafts
> file while cloning, and then doesn't copy it.  This results in a rather
> confusingly corrupt clone.

Yes. The grafts-file was a mistake, but it's just barely useful to some 
people that it's stayed alive. Sadly, those "some people" don't tend to 
care enough about the problems it can cause.

> I suggest two things:
> a. That during the cloning process, the grafts file is completely
>    disregarded in any case at first.

Yes.

And (a'): git-fsck and repacking should just consider it to be an 
_additional_ source of parenthood rather than a _replacement_ source.

> b. Preferably the grafts file is copied as well (after cloning).  I
>    never really understood why the file is not being copied in the first
>    place (anyone care to explain that?).

The grafts file isn't part of the object stream and refs, and clones (and 
fetches) very much just copy the object database.

		Linus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: To graft or not to graft... (Re: Recovering from repository corruption)
  2008-06-11 23:39                 ` Linus Torvalds
@ 2008-06-12  7:14                   ` Johan Herland
  2008-06-12  7:47                     ` Jeff King
  0 siblings, 1 reply; 31+ messages in thread
From: Johan Herland @ 2008-06-12  7:14 UTC (permalink / raw)
  To: git; +Cc: Linus Torvalds, Stephen R. van den Berg, Denis Bueno

On Thursday 12 June 2008, Linus Torvalds wrote:
> On Thu, 12 Jun 2008, Stephen R. van den Berg wrote:
> > As I understood it from the few shreds of documentation that actually
> > mention the grafts file, the grafts file is *not* being cloned.
> > Therefore, my assumption was that cloning a repository that has a
> > grafts file gives an identical result to cloning the same repository
> > *without* the grafts file present.
>
> That would probably be the right behaviour, but no - all our commit
> walkers honor the grafts file.
>
> Including the ones used for creating pack-files and thus a clone.
>
> > As I understand it now, the cloning process actually peeks at the
> > grafts file while cloning, and then doesn't copy it.  This results in a
> > rather confusingly corrupt clone.
>
> Yes. The grafts-file was a mistake, but it's just barely useful to some
> people that it's stayed alive. Sadly, those "some people" don't tend to
> care enough about the problems it can cause.
>
> > I suggest two things:
> > a. That during the cloning process, the grafts file is completely
> >    disregarded in any case at first.
>
> Yes.
>
> And (a'): git-fsck and repacking should just consider it to be an
> _additional_ source of parenthood rather than a _replacement_ source.
>
> > b. Preferably the grafts file is copied as well (after cloning).  I
> >    never really understood why the file is not being copied in the
> > first place (anyone care to explain that?).
>
> The grafts file isn't part of the object stream and refs, and clones (and
> fetches) very much just copy the object database.

AFAICS, there's already a perfectly fine way to distribute grafted history:
1. Add a grafts file
2. Run git-filter-branch
3. Remove grafts file
4. Distribute repo
5. Profit!

Since git-filter-branch turns grafted parentage into _real_ parentage,
there's no point in ever having a grafts file at all (except transiently
for telling git-filter-branch what to do).

I suggest we make commit walkers NOT obey the grafts file by default, but
instead require a --follow-grafts option to restore the current behaviour.
Then, we teach git-filter-branch to obey the grafts file (probably by
employing said --follow-grafts option).

For those who want to hang on to the current behaviour, they can create
some config option that is equivalent to always running with
--follow-grafts.


The following is ugly, untested, undocumented, and obviously unfit for
inclusion:


diff --git a/commit.c b/commit.c
index 94d5b3d..3e9ebf7 100644
--- a/commit.c
+++ b/commit.c
@@ -7,6 +7,7 @@
 #include "revision.h"
 
 int save_commit_buffer = 1;
+int use_grafts = 0;
 
 const char *commit_type = "commit";
 
@@ -242,7 +243,7 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
 	char *bufptr = buffer;
 	unsigned char parent[20];
 	struct commit_list **pptr;
-	struct commit_graft *graft;
+	struct commit_graft *graft = NULL;
 	unsigned n_refs = 0;
 
 	if (item->object.parsed)
@@ -260,7 +261,8 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
 	bufptr += 46; /* "tree " + "hex sha1" + "\n" */
 	pptr = &item->parents;
 
-	graft = lookup_commit_graft(item->object.sha1);
+	if (use_grafts)
+		graft = lookup_commit_graft(item->object.sha1);
 	while (bufptr + 48 < tail && !memcmp(bufptr, "parent ", 7)) {
 		struct commit *new_parent;
 
diff --git a/commit.h b/commit.h
index 2d94d41..3e30aa0 100644
--- a/commit.h
+++ b/commit.h
@@ -22,6 +22,7 @@ struct commit {
 };
 
 extern int save_commit_buffer;
+extern int use_grafts;
 extern const char *commit_type;
 
 /* While we can decorate any object with a name, it's only used for commits.. */
diff --git a/git-filter-branch.sh b/git-filter-branch.sh
index d04c346..5ebe7cd 100755
--- a/git-filter-branch.sh
+++ b/git-filter-branch.sh
@@ -230,11 +230,11 @@ mkdir ../map || die "Could not create map/ directory"
 case "$filter_subdir" in
 "")
 	git rev-list --reverse --topo-order --default HEAD \
-		--parents "$@"
+		--follow-grafts --parents "$@"
 	;;
 *)
 	git rev-list --reverse --topo-order --default HEAD \
-		--parents "$@" -- "$filter_subdir"
+		--follow-grafts --parents "$@" -- "$filter_subdir"
 esac > ../revs || die "Could not get the commits"
 commits=$(wc -l <../revs | tr -d " ")
 
diff --git a/revision.c b/revision.c
index 5a1a948..ca98815 100644
--- a/revision.c
+++ b/revision.c
@@ -1059,6 +1059,10 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, const ch
 				revs->first_parent_only = 1;
 				continue;
 			}
+			if (!strcmp(arg, "--follow-grafts")) {
+				use_grafts = 1;
+				continue;
+			}
 			if (!strcmp(arg, "--reflog")) {
 				handle_reflog(revs, flags);
 				continue;
-- 
1.5.6.rc2.128.gf64ae


Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: To graft or not to graft... (Re: Recovering from repository corruption)
  2008-06-12  7:14                   ` Johan Herland
@ 2008-06-12  7:47                     ` Jeff King
  2008-06-12 10:21                       ` Johan Herland
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff King @ 2008-06-12  7:47 UTC (permalink / raw)
  To: Johan Herland; +Cc: git, Linus Torvalds, Stephen R. van den Berg, Denis Bueno

On Thu, Jun 12, 2008 at 09:14:21AM +0200, Johan Herland wrote:

> > The grafts file isn't part of the object stream and refs, and clones (and
> > fetches) very much just copy the object database.
> 
> AFAICS, there's already a perfectly fine way to distribute grafted history:
> 1. Add a grafts file
> 2. Run git-filter-branch
> 3. Remove grafts file
> 4. Distribute repo
> 5. Profit!
> 
> Since git-filter-branch turns grafted parentage into _real_ parentage,
> there's no point in ever having a grafts file at all (except transiently
> for telling git-filter-branch what to do).

But then you have rewritten all of the later commits, so you can no
longer talk to other people about them.

The kernel repo is split into "historical" and active repos. You can
graft the historical repo and get more far-reaching answers to things
like "git log" and "git blame". But if you run filter-branch, you can't
share development on that repo via push / pull to people who _don't_ use
the graft, since they don't share your history (and they probably don't
want to, because of the extra resources required to pull in the
historical chunk).

That being said, I don't know how common such a setup is. And you did
mention a "follow-grafts" config option for such people.

-Peff

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: To graft or not to graft... (Re: Recovering from repository corruption)
  2008-06-12  7:47                     ` Jeff King
@ 2008-06-12 10:21                       ` Johan Herland
  2008-06-12 12:20                         ` Stephen R. van den Berg
  0 siblings, 1 reply; 31+ messages in thread
From: Johan Herland @ 2008-06-12 10:21 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Linus Torvalds, Stephen R. van den Berg, Denis Bueno

On Thursday 12 June 2008, Jeff King wrote:
> On Thu, Jun 12, 2008 at 09:14:21AM +0200, Johan Herland wrote:
> > > The grafts file isn't part of the object stream and refs, and
> > > clones (and fetches) very much just copy the object database.
> >
> > AFAICS, there's already a perfectly fine way to distribute grafted
> > history: 1. Add a grafts file
> > 2. Run git-filter-branch
> > 3. Remove grafts file
> > 4. Distribute repo
> > 5. Profit!
> >
> > Since git-filter-branch turns grafted parentage into _real_
> > parentage, there's no point in ever having a grafts file at all
> > (except transiently for telling git-filter-branch what to do).
>
> But then you have rewritten all of the later commits, so you can no
> longer talk to other people about them.

Correct. My point is that if you want to talk to people about revisions, 
you'd better do it from a repo where people agree on the entire 
history. On the other hand, if you want to do archaeology with grafts, 
you should be aware that you are subverting one of the core guarantees 
provided by Git (i.e. a commit id verifies full ancestry of a commit), 
and therefore shouldn't communicate with other repos _at_ _all_, as 
other repos can easily be confused (see [1]).

> The kernel repo is split into "historical" and active repos. You can
> graft the historical repo and get more far-reaching answers to things
> like "git log" and "git blame". But if you run filter-branch, you
> can't share development on that repo via push / pull to people who
> _don't_ use the graft, since they don't share your history (and they
> probably don't want to, because of the extra resources required to
> pull in the historical chunk).

Yes, by forcing git-filter-branch, you can no longer push/pull to/from 
such a historical repo. But as this thread has already demonstrated, 
with grafts you can't clone from such a repo today (nor pull in certain 
circumstances, see [1]); so the way I see it, communication with this 
repo is _already_ limited. By disallowing grafts and forcing a rewrite 
of the entire repo, we force these communication problems to be more 
explicit/visible.

> That being said, I don't know how common such a setup is. And you did
> mention a "follow-grafts" config option for such people.

Indeed. :)

AFAICS, there are two use cases for grafts:
1. As a preparation for rewriting the history with git-filter-branch.
2. For providing historical repos (like you mention above).

My suggestion only makes life harder for people in the second use case.
If there are many people in the second use case, and they deem 
the "follow-grafts" config option unacceptable, I expect them to flame 
my suggestion to a crisp, and we'll have to think of something else...

Have fun! :)

...Johan

[1]: Consider the following:

### Create a repo with one commit, A
$ mkdir foo
$ cd foo
$ git init
Initialized empty Git repository in /path/to/foo/.git/
$ echo foo > foo
$ git add foo
$ git commit -mA
Created initial commit fe2ec02: A
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 foo
### Clone the repo
$ cd ..
$ git clone /path/to/foo bar
Initialize bar/.git
Initialized empty Git repository in /path/to/bar/.git/
### Create 3 more commits in the original repo: A---B---C---D
$ cd foo
$ echo bar >> foo && git commit -a -mB
Created commit ad10f00: B
 1 files changed, 1 insertions(+), 0 deletions(-)
$ echo baz >> foo && git commit -a -mC
Created commit be96559: C
 1 files changed, 1 insertions(+), 0 deletions(-)
$ echo xyzzy >> foo && git commit -a -mD
Created commit f2bafe5: D
 1 files changed, 1 insertions(+), 0 deletions(-)
### Create a graft removing C from the history: A---B---D
$ echo "f2bafe58175e132077285e7fbbcec30859101d2e \
ad10f005205f61429dccda95e1442dabe31fbfbe" > .git/info/grafts
### Pull the recent changes into the clone
$ cd ../bar
$ git pull
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (2/2), done.
Unpacking objects: 100% (6/6), done.
remote: Total 6 (delta 0), reused 0 (delta 0)
error: Could not read be965599d99192f624b8d8bbf3cab412872586fc
From /path/to/foo/
 + fe2ec02...f2bafe5 master     -> origin/master  (forced update)
error: Could not read be965599d99192f624b8d8bbf3cab412872586fc
error: Could not read be965599d99192f624b8d8bbf3cab412872586fc
Auto-merged foo
CONFLICT (add/add): Merge conflict in foo
Automatic merge failed; fix conflicts and then commit the result.

AFAICS, git-pull can easily become just as confused by grafts as 
git-clone. I wouldn't be surprised by a similar example for git-push.

I can only draw the conclusion that with current versions of Git, repos 
with grafts should _never_ be made public.

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: To graft or not to graft... (Re: Recovering from repository corruption)
  2008-06-12 10:21                       ` Johan Herland
@ 2008-06-12 12:20                         ` Stephen R. van den Berg
  0 siblings, 0 replies; 31+ messages in thread
From: Stephen R. van den Berg @ 2008-06-12 12:20 UTC (permalink / raw)
  To: Johan Herland; +Cc: Jeff King, git, Linus Torvalds, Denis Bueno

Johan Herland wrote:
>I can only draw the conclusion that with current versions of Git, repos 
>with grafts should _never_ be made public.

Correct.

I still prefer my original suggestion, i.e. allow repos with grafts to
be cloned, yet disregard the grafts during the cloning process.
The trouble is that with your suggestion, it becomes a bit convoluted
when grafts are being used and when not.  It already is complicated as
it is, so I suggest we try and keep git honest so that it does exactly
what one would expect (instead of documenting awkward behaviour).

As soon as time permits, I'll submit appropriate patches to implement
this, as well as some other sanity check patches which I've been
contemplating to help the grafter detect "bad" grafts as early as
possible.
-- 
Sincerely,
           Stephen R. van den Berg.

"Always look on the bright side of life!"

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Recovering from repository corruption
  2008-06-10 17:26 Recovering from repository corruption Denis Bueno
  2008-06-10 17:55 ` Jakub Narebski
@ 2008-06-10 19:40 ` Nicolas Pitre
  2008-06-10 19:42   ` Denis Bueno
  1 sibling, 1 reply; 31+ messages in thread
From: Nicolas Pitre @ 2008-06-10 19:40 UTC (permalink / raw)
  To: Denis Bueno; +Cc: Git Mailing List

On Tue, 10 Jun 2008, Denis Bueno wrote:

> I started a thread a while back about repository corruption.  It
> manifested as a clone error and the thread is here:
> 
>     http://kerneltrap.org/mailarchive/git/2007/7/31/253475
> 
> I just ran, again, into corruption after my laptop kernel-panic'd.
> (Ironically, at the moment I ran into the corruption I was trying to
> push my repo to a backup location.)  Since that thread took place it
> seems a section about recovering from repo corruption was added to the
> manual --- but it assumes you can (or care to painstakingly) recreate
> each corrupted version.

Would you happen, by chance, to have another instance of that repository 
somewhere else with the concerned objects in it?
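
One way to answer that question mechanically is sketched below, under
stated assumptions: the "missing <type> <id>" lines are taken from the
`git fsck --full` output quoted at the start of this thread, and the
helper names and the idea of a second repository path are hypothetical.
The first function only parses text; the second runs
`git cat-file -e`, which exits with status 0 when the object is present
in the repository it is run in.

```python
import subprocess


def missing_objects(fsck_output):
    """Extract (type, id) pairs from 'missing <type> <id>' lines."""
    found = []
    for line in fsck_output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "missing":
            found.append((parts[1], parts[2]))
    return found


def has_object(repo_path, object_id):
    """True if `git cat-file -e` finds object_id in the repo at repo_path."""
    result = subprocess.run(
        ["git", "cat-file", "-e", object_id],
        cwd=repo_path,  # run git inside the other copy of the repository
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Feeding the fsck output to missing_objects() and calling has_object()
for each id against every other copy of the repository shows which
copy, if any, still holds the objects in question.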


Nicolas


* Re: Recovering from repository corruption
  2008-06-10 19:40 ` Recovering from repository corruption Nicolas Pitre
@ 2008-06-10 19:42   ` Denis Bueno
  0 siblings, 0 replies; 31+ messages in thread
From: Denis Bueno @ 2008-06-10 19:42 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Git Mailing List

On Tue, Jun 10, 2008 at 15:40, Nicolas Pitre <nico@cam.org> wrote:
>> (Ironically, at the moment I ran into the corruption I was trying to
>> push my repo to a backup location.)
>
> Would you happen, by chance, to have another instance of that repository
> somewhere else with the concerned objects in it?

Nope.  I was *just* about to back it up.

-- 
                              Denis

