git.vger.kernel.org archive mirror
From: Sergio Callegari <scallegari@arces.unibo.it>
To: git@vger.kernel.org
Subject: Re: Problematic git pack
Date: Thu, 31 Aug 2006 10:45:12 +0200
Message-ID: <44F6A198.4040902@arces.unibo.it>

What can I say... I have never before seen such rapid action following 
the report of a potential problem.
Thanks to Linus, Junio and everybody else who may have contributed.
>   Junio could then generate a new pack with the one corrupted object 
>   fixed, which obviously meant that all the deltas now worked too.
>   
Excellent news...
>   This is my (probably final) analysis of the resulting differences.. ]
>
> On Wed, 30 Aug 2006, Junio C Hamano wrote:
> > 
> > Ok, I was going to attach the resurrected pack that should
> > contain everything your corrupt pack had, but it is a bit too
> > large, so I'll place it here [*1*].  Drop me a note when you
> > retrieved it, so that I can remove it.
>   
Junio, could you please send me the details of [*1*] privately so that I 
can retrieve the pack too?

I also have another question... (maybe it was answered in some previous 
thread on this list, in which case a pointer would be enough).
I am now going to have the fixed archive and also a new archive, which I 
restarted from the latest working copy I had of my project.
Is there any way to automatically do real "surgery" to attach one to the 
other and get a single archive with all the history?
Obviously, if I change a commit object to modify its parents, its 
signature (its SHA-1 name) changes, so I need to modify its children 
too, and so on down the history. Is this correct?
Alternatively, I believe that grafts could be the way to go... I have 
never used them before: do all git tools support them? In particular, do 
they get pushed and pulled correctly?
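
As a rough illustration of the point about parents and signatures: a 
commit's name is the SHA-1 of a short header plus the commit text, and 
the commit text lists the parent names, so changing a parent changes the 
name of that commit and of every descendant. The sketch below is only 
that, a sketch: the identity line, the timestamps, the messages and the 
`old_tip` value are made-up placeholders, and the empty tree is used 
just to keep the example self-contained.

```python
import hashlib

def git_object_id(obj_type, body):
    # A git object name is the SHA-1 of "<type> <size>\0" followed by the body.
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

# Hypothetical identity and timestamp, for illustration only.
IDENT = "A U Thor <author@example.com> 1157013912 +0200"

def commit_body(tree, parents, message):
    # Commit text: tree line, zero or more parent lines, author, committer,
    # a blank line, then the message.
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {IDENT}", f"committer {IDENT}"]
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()

EMPTY_TREE = git_object_id("tree", b"")

# The restarted archive: a root commit and one child on top of it.
root = git_object_id("commit", commit_body(EMPTY_TREE, [], "restart"))
child_before = git_object_id("commit", commit_body(EMPTY_TREE, [root], "next"))

# Rewrite the root so that it points at the tip of the recovered history.
old_tip = "1" * 40  # placeholder, not a real commit name
regrafted_root = git_object_id(
    "commit", commit_body(EMPTY_TREE, [old_tip], "restart"))
child_after = git_object_id(
    "commit", commit_body(EMPTY_TREE, [regrafted_root], "next"))

print(root != regrafted_root)       # the rewritten commit gets a new name
print(child_before != child_after)  # and so does its otherwise unchanged child
```

The child's own content is identical in both cases; only the parent line 
differs, which is enough to give it a new name.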
> So the _real_ difference is literally just the one byte at offset 0151000 
> (decimal 53760) which in the fixed pack is 0x96, and in the corrupt pack 
> it is 0x94. That's a single-bit difference (bit #1 has been cleared).
>
>   
So the alpha particle theory may turn out to be the plausible one in 
the end...
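
The arithmetic in the quoted analysis is easy to check by hand. The 
sketch below verifies the octal offset and the flipped bit from the 
numbers quoted above, and adds a small helper (`differing_bytes`, a 
hypothetical name) that lists the positions where two byte strings of 
equal length differ. The demo data is synthetic, not the real pack.

```python
# Offset given in the quoted analysis: octal 0151000.
offset = 0o151000
print(offset)  # 53760

# Byte values in the fixed and in the corrupt pack.
fixed_byte, corrupt_byte = 0x96, 0x94
diff = fixed_byte ^ corrupt_byte
flipped_bits = [bit for bit in range(8) if (diff >> bit) & 1]
print(flipped_bits)  # [1]

def differing_bytes(a, b):
    """Return (offset, byte_in_a, byte_in_b) for every differing position."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Synthetic stand-ins for the two files: zeros everywhere except the one
# byte of interest.
good = bytes(offset) + bytes([fixed_byte]) + bytes(16)
bad = bytearray(good)
bad[offset] = corrupt_byte

print(differing_bytes(good, bytes(bad)))  # [(53760, 150, 148)]
```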
> Now, that makes me feel happy on one level, because it's almost certainly 
> a hardware problem - subtle memory corruption, or disk corruption that 
> happened when either reading or writing the image. Sergio may not be that 
> happy about it, of course.
>   
The bad thing is that I don't know which of my two machines (the laptop 
or the desktop) caused the issue!

> Finally, this also points out that the corrupted packs _can_ be fixed, but 
> I think Sergio was a bit lucky (to offset all the bad luck). Sergio still 
> had access to the original file that had had its object corrupted. 
Actually, this may not be such a rare case... In my tree I had the 
development of some LaTeX documents and packages (code-like, the really 
"precious" files) and a few binary objects (mainly images and openoffice 
files, far less precious).
Since the binary objects far outweighed the text ones in size, assuming 
a single error, the probability of having it in a non-code object was 
much larger than that of having it in a precious code object. Commit and 
tree objects should also be much smaller than data objects.
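
A back-of-the-envelope version of this argument, assuming that a single 
corrupted byte is equally likely to land anywhere in the pack: the 
chance that it hits a given class of objects is simply that class's 
share of the total bytes. The sizes below are invented for illustration 
and are not measurements of the actual repository; the model also 
ignores the fact that a damaged delta base affects every object stored 
as a delta against it.

```python
def hit_probability(sizes):
    """Share of the total bytes per category, i.e. the chance that one
    uniformly placed error lands in that category."""
    total = sum(sizes.values())
    return {name: size / total for name, size in sizes.items()}

# Hypothetical packed sizes in bytes.
sizes = {
    "text (LaTeX sources, packages)": 2_000_000,
    "binary (images, office files)": 48_000_000,
}

for name, p in hit_probability(sizes).items():
    print(f"{name}: {p:.0%}")
```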
This assumption is what initially pushed me to ask for help in unpacking 
at least all the correct objects (one of my first questions was: does 
git unpack-objects die on the first error, or is there a way to convince 
it to simply skip the wrong object, or the delta against a wrong 
object?).
If git unpack-objects could gain an option like --continue-on-errors, 
and if checkout/reset could also get an option to do the same (i.e. in a 
tree with missing objects, check out everything that can be found), I 
believe that one would already be at a good point...
Finally, having a command to create an object out of a single file 
(the opposite of git cat-file) could help in re-creating the missing 
objects...
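
For what it is worth, the git-hash-object command with its -w switch 
looks like the plumbing for exactly this job. The sketch below shows 
what creating a blob from raw file content amounts to: the object name 
is the SHA-1 of a "blob <size>\0" header plus the content, and the loose 
object file is the zlib-compressed form of the same bytes, stored under 
objects/<first two hex digits>/<remaining 38>. The function names are 
made up for this example, and it is not meant as a replacement for the 
real tool.

```python
import hashlib
import zlib
from pathlib import Path

def blob_object(content):
    """Return (object name, loose-object bytes) for a blob with this content."""
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)

def write_loose_blob(git_dir, content):
    """Write the blob as a loose object under git_dir/objects, return its name."""
    oid, compressed = blob_object(content)
    path = Path(git_dir) / "objects" / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():  # objects are immutable, never overwrite one
        path.write_bytes(compressed)
    return oid

# The empty blob has a well-known name.
print(blob_object(b"")[0])  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

If the re-created file is byte-for-byte identical to the original, the 
resulting name matches the one that the trees in the history refer to, 
which is what makes this kind of repair possible at all.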
> And it 
> took a fair amount of work, and some git hacking by somebody who really 
> understood git (Junio).
>
> Maybe we'll end up having some of that effort being useful and checked in, 
> and we'll eventually have more infrastructure for fixing these things, but 
> I suspect that in most cases, even a _single_ bit of corruption will 
> generally result in so much havoc that nobody should depend on that. It's 
> a lot better to have backups.
>
> 			Linus
