git.vger.kernel.org archive mirror
From: "R. Tyler Ballance" <tyler@slide.com>
To: "Jan Krüger" <jk@jk.gs>
Cc: Git ML <git@vger.kernel.org>
Subject: Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
Date: Tue, 06 Jan 2009 14:52:00 -0800
Message-ID: <1231282320.8870.52.camel@starfruit>
In-Reply-To: <20081209093627.77039a1f@perceptron>

On Tue, 2008-12-09 at 09:36 +0100, Jan Krüger wrote:
> For fixing a corrupted repository by using backup copies of individual
> files, allow write_sha1_file() to write loose files even if the object
> already exists in a pack file, but only if the existing entry is marked
> as corrupted.

I figured I'd reply to this again, since the issue cropped up again.

We started experiencing *large* numbers of corruptions like the ones
that started the thread (one developer was receiving them once or twice
a day) with v1.6.0.4.

We went ahead and upgraded to a custom build of v1.6.1 with Jan's patch
(below) and the issues /seem/ to have resolved themselves. I'm not
certain whether Jan's patch was really responsible, or if there was
another issue that caused this to correct itself in v1.6.1. 

As it stands, I think it's safe to assume, given the frequency of the
occurrences, that they were not tied to a memory or disk error
(otherwise other levels of the machine's stack would be suffering as
well). The only thing I can think of is that /some/ developers who've
experienced the issue are using Samba mount points: they change files
in Mac OS X but use Git on the mounted share (i.e. TextMate changes a
file hosted on Samba, and the changes are committed in an SSH session
on that machine). That doesn't account for everything, though.

If something else included in the v1.6.1 release is what fixed this,
please let me know so I can back Jan's patch out.


Cheers


> 
> Signed-off-by: Jan Krüger <jk@jk.gs>
> ---
> 
> On IRC I talked to rtyler who had a corrupted pack file and plenty of
> object backups by way of cloned repositories. We decided to try
> extracting the corrupted objects from the other object database and
> injecting them into the broken repo as loose objects, but this failed
> because write_sha1_file() refuses to write loose objects that are
> already present in a pack file.
> 
> This patch expands the check to see if the pack entry has been marked
> as corrupted and, if so, allows writing a loose object with the same
> ID. Unfortunately, when Tyler tried a merge while using this patch,
> something we didn't manage to track down happened and now git doesn't
> consider the object corrupted anymore. I'm not sure enough that it
> wasn't caused by the patch to submit this patch without hesitation.
> 
> Apart from that, I think the change is not all too great since it makes
> write_sha1_file() walk the list of pack entries twice. That's a bit of
> a waste.
> 
> So those are the reasons why I wanted a few opinions first. Another
> reason is that there might be a way smarter method to fix this kind of
> problem, in which case I'd love hearing about it for future reference.
> 
>  sha1_file.c |    9 +++++----
>  1 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/sha1_file.c b/sha1_file.c
> index 6c0e251..17085cc 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2373,14 +2373,17 @@ int write_sha1_file(void *buf, unsigned long len, const char *type, unsigned cha
>  	char hdr[32];
>  	int hdrlen;
>  
> -	/* Normally if we have it in the pack then we do not bother writing
> -	 * it out into .git/objects/??/?{38} file.
> -	 */
>  	write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen);
>  	if (returnsha1)
>  		hashcpy(returnsha1, sha1);
> -	if (has_sha1_file(sha1))
> -		return 0;
> +	/* Normally if we have it in the pack then we do not bother writing
> +	 * it out into .git/objects/??/?{38} file. We do, though, if there
> +	 * is no chance that we have an uncorrupted version of the object.
> +	 */
> +	if (has_sha1_file(sha1)) {
> +		if (has_loose_object(sha1) || !has_packed_and_bad(sha1))
> +			return 0;
> +	}
>  	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
>  }
>  
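
For anyone reading along, the condition in the patched hunk can be read
as a small truth table. What follows is only a sketch of that logic, not
git's code: the struct and the model_* names are hypothetical stand-ins
for has_sha1_file(), has_loose_object() and has_packed_and_bad(), and I
am assuming that "packed and bad" means the object is listed in a pack
and that entry has been marked corrupted.

```c
#include <stdbool.h>

/* Hypothetical snapshot of what the object database knows about one ID. */
struct object_state {
	bool loose;       /* a loose file exists under .git/objects/ */
	bool packed;      /* some pack index lists the object */
	bool packed_bad;  /* that pack entry has been marked corrupted */
};

/* Stand-in for has_sha1_file(): the object is known, packed or loose. */
bool model_has_object(const struct object_state *s)
{
	return s->loose || s->packed;
}

/* Stand-in for has_packed_and_bad(): packed, and the entry is marked bad. */
bool model_packed_and_bad(const struct object_state *s)
{
	return s->packed && s->packed_bad;
}

/*
 * Returns true when the patched check would fall through to
 * write_loose_object(), false when it would return early.
 */
bool model_should_write_loose(const struct object_state *s)
{
	if (model_has_object(s)) {
		if (s->loose || !model_packed_and_bad(s))
			return false;
	}
	return true;
}
```

Under these assumptions, the only case that changes relative to the old
check is "packed, marked bad, no loose copy": the old code returned
early there, while the patched code writes the loose object.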
-- 
-R. Tyler Ballance
Slide, Inc.
