From: "R. Tyler Ballance" <tyler@slide.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Nicolas Pitre" <nico@cam.org>, "Jan Krüger" <jk@jk.gs>,
"Git ML" <git@vger.kernel.org>
Subject: Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
Date: Wed, 07 Jan 2009 14:55:35 -0800
Message-ID: <1231368935.8870.584.camel@starfruit>
In-Reply-To: <alpine.LFD.2.00.0901070743070.3057@localhost.localdomain>
On Wed, 2009-01-07 at 08:07 -0800, Linus Torvalds wrote:
> Well, that's not necessarily "unfortunate". It does actually end up
> showing that the objects themselves were apparently never really corrupt.
>
> So there is no fundamental data structure corruption - because when you
> copy the repository, it's all good again!
> - it could be some _temporary_ git corruption caused internally inside a
> git process - ie a wild pointer, or perhaps a race condition (but we
> don't really use threading in 1.6.0.4 unless you ask for it, and even
> then just for pack-file generation)
I have a feeling it's something like this. One of our operations guys
did some research while I was looking at code, and he came across this:
On Wed, 2009-01-07 at 14:17 -0800, Ken Brownfield wrote:
> git-merge is using too much RAM, and failing to malloc() but NOT
> reporting it. This is all sorts of bad:
>
> A) using an unscalable amount of RAM
> B) failing to detect malloc() failure
> C) reporting file corruption instead
> I was able to reproduce this.
>
> limit ~1.5GB -> corrupt file
> limit ~3GB -> magically no longer corrupt.
>
> The false fail may be limited to git-merge, but git status also
> allocates the same amount of RAM.
>
> To temporarily work around this problem, issue this once you log in to
> a dev box:
>
> tcsh:
> limit vmemoryuse 3000000
> bash:
> ulimit -v 3000000
>
> Be gentle.
> And quite frankly, since the corruption seems to be site-specific, I
> really do suspect the second case. Although it's possible, of course, that
> it could be some compiler issue that makes _your_ binaries have issues
> even when nobody else sees it.
I think you're correct insofar as our major site-specific alteration
has come up on the mailing list before (okay maybe two site-specific
things).
* Our Git repo is ~7.1GB
* ulimit -v is set to ~1.5G
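
For anyone who wants to try a command under the same kind of cap without
changing their login shell, something like the following might do. It is
only a minimal sketch, assuming Linux and Python 3.7+; the helper name
run_with_vm_limit is made up for illustration, and it relies on
`ulimit -v` in bash mapping to RLIMIT_AS counted in 1024-byte units.

```python
import resource
import subprocess


def _apply_limit(limit_bytes):
    """Lower the soft address-space limit in the child before exec."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # Never ask for more than the existing hard limit allows.
    if hard != resource.RLIM_INFINITY:
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))


def run_with_vm_limit(cmd, limit_kb):
    """Run cmd with its virtual memory capped at limit_kb kilobytes.

    Hypothetical helper: roughly what `ulimit -v limit_kb` followed by
    the command would do in bash, but scoped to a single child process.
    """
    limit_bytes = limit_kb * 1024
    return subprocess.run(
        cmd,
        # Runs in the child between fork and exec (avoid in threaded code).
        preexec_fn=lambda: _apply_limit(limit_bytes),
        capture_output=True,
        text=True,
    )
```

Calling run_with_vm_limit(["git", "status"], 1500000) inside an affected
repository and then again with 3000000 would be one way to compare the
two limits from the report above; whether it triggers the same failure
is untested here.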
I think I know how this could be failing and corrupting things (assuming
it's malloc(3) related).
What I'm thinking is that in xmalloc() or one of the other x*()
functions, the malloc(size) is failing because of the ulimits, and then
potentially somewhere it's silently failing or maybe even
accidentally returning one of those "malloc(1)" pointers?
I've got two new tarred repositories from two developers who hit the
issue today, so I'm flush with sample repositories to try stuff
on :)
>
> Hmm. That's actually _normal_ under some circumstances. At least with
> older git versions, or if your .git/index file couldn't be rewritten for
> some reason - your existing index file contains all the old stat
> information, and if git cannot (or, in the case of older git version, just
> will not) refresh it automatically, it will show all the files as changed,
> even if it's just the inode number that really changed.
>
> A _normal_ git install should have auto-refreshed the index, though.
> Unless the tar archive only contained the ".git" directory, and not the
> checkout?
I believe the issues I noticed when untarring the repo were a red
herring. I did the `git diff` after untarring and noticed that only a
certain set of files were changed; I'm willing to go so far as to guess
that they were the files affected in the corrupted packs. Of the 32k
files in our repository, 98 were actually different after untarring
(according to git-diff(1)).
> And nobody else saw it than this one person, and it was a total mystery to
> everybody until we realized that he used this one feature that nobody else
> was using. So as you're on OS X, I assume you don't have CRLF conversion,
> but maybe you use some other feature that we support but nobody really
> actually uses. Like keyword expansion or something?
The two new folks this happened to today had nothing "special" about
them other than the ulimit.
I've got the script(1) output of performing git-ls-files(1) and some
other commands that I tried. Nothing they output was particularly
informative or interesting, and I don't think it will help if this
really is a memory-related issue. That said, I'd be more than happy to
send it to a couple of you (Junio, Linus, Nico).
I'm *so* ready for this bug to die >=\
Cheers
--
-R. Tyler Ballance
Slide, Inc.