From: "R. Tyler Ballance" <tyler@slide.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Nicolas Pitre" <nico@cam.org>, "Jan Krüger" <jk@jk.gs>,
"Git ML" <git@vger.kernel.org>
Subject: Re: [PATCH/RFC] Allow writing loose objects that are corrupted in a pack file
Date: Tue, 06 Jan 2009 23:41:39 -0800
Message-ID: <1231314099.8870.415.camel@starfruit>
In-Reply-To: <alpine.LFD.2.00.0901062026500.3057@localhost.localdomain>
On Tue, 2009-01-06 at 20:54 -0800, Linus Torvalds wrote:
>
> On Tue, 6 Jan 2009, R. Tyler Ballance wrote:
> >
> > I'll back the patch out and redeploy, it's worth mentioning that a
> > coworker of mine just got the issue as well (on 1.6.1). He was able to
> > `git pull` and the error went away, but I doubt that it "magically fixed
> > itself"
>
> Quite frankly, that behaviour sounds like a disk _cache_ corruption issue.
> The fact that some corruption "comes and goes" and sometimes magically
> heals itself sounds very much like some disk cache problem, and then that
> particular part of the cache gets replaced and then when re-populated it
> is magically correct.
>
> We had that in one case with a Linux NFS client, where a rename across
> directories caused problems.
>
> This was a networked filesystem on OS X, right? File caching is much more
> "interesting" in networked filesystems than it is in normal private
> on-disk ones.
Not quite. What I meant was that some (not all) of the users who've hit
this issue use Samba to copy files directly into the Git repository. I
mentioned it in case something between Finder, Samba, ext3 and Git, such
as file system change events, was pissing Git off and causing it. I don't
think that's the case, though: neither the coworker I mentioned earlier
nor I use Samba, and we both experienced the issue today (mine
disappeared after upgrading to 1.6.1, his after a `git pull`).
>
> > I've tarred one of the repositories that had it in a reproducible state
> > so I can create a build and extract the tar and run against that to
> > verify any patches anybody might have, but unfortunately at 7GB of
> > company code and assets, I can't exactly share ;)
>
> The thing to do is
>
> - untar it on some trusted machine with a local disk and a known-good
> filesystem.
>
> IOW, not that networked samba share.
>
> - verify that it really does happen on that machine, with that untarred
> image. Because maybe it doesn't.
Unfortunately it doesn't. What I did notice, when I ran `git status` in
the directory right after untarring, was this:
tyler@grapefruit:~/jburgess_main> git status
#
# ---impressive amount of file names fly by---
# ----snip---
#
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# artwork/
# bt/
# flash/
tyler@grapefruit:~/jburgess_main>
Basically, somehow Git thinks that *every* file in the repository is
deleted at this point. I went ahead and performed a `git reset --hard`
to see if the issue would manifest itself thereafter, but it did not.
I did try to do a git-fsck(1), and this is what I got:
tyler@grapefruit:~/jburgess_main> /usr/local/bin/git fsck --full
[1] 19381 segmentation fault /usr/local/bin/git fsck --full
tyler@grapefruit:~/jburgess_main>
>
> The hope is that you caught the corruption in the cache, and it
> actually got written out to the tar-file. But if it _is_ a disk cache
> (well, network cache) issue, maybe the IO required to tar everything up
> was enough to flush it, and the tar-file actually _works_ because it
> got repopulated correctly.
When I was working through this with Jan, one of the things that we did
was move the actual object files in .git/objects. They existed, so maybe
I could look into those to check?
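Checking a loose object by hand could look roughly like the sketch below. It assumes the usual loose-object layout (a zlib-compressed "<type> <size>\0<payload>" stream, named by the SHA-1 of the uncompressed bytes) and a path ending in objects/<2 hex>/<38 hex>. The helper name check_loose_object and its expected parameter are made up for illustration; if the file has been moved or renamed, the expected object name would have to be passed in explicitly.

```python
import hashlib
import zlib
from pathlib import Path


def check_loose_object(path, expected=None):
    """Return (ok, detail) for a single loose object file.

    Hypothetical helper: assumes SHA-1 object names and, when `expected`
    is not given, a path of the form .../objects/<2 hex>/<38 hex>.
    """
    path = Path(path)
    if expected is None:
        # The object name is the directory name plus the file name.
        expected = path.parent.name + path.name
    try:
        raw = zlib.decompress(path.read_bytes())
    except zlib.error as exc:
        return False, f"zlib error: {exc}"
    # The header and the payload are separated by a single NUL byte.
    header, sep, body = raw.partition(b"\x00")
    if not sep:
        return False, "no NUL byte after header"
    try:
        obj_type, size = header.split(b" ")
        size = int(size)
    except ValueError:
        return False, "malformed header"
    if size != len(body):
        return False, f"size mismatch: header says {size}, payload is {len(body)}"
    # The name is the SHA-1 of header + NUL + payload, i.e. the whole stream.
    actual = hashlib.sha1(raw).hexdigest()
    if actual != expected:
        return False, f"hash mismatch: computed {actual}"
    return True, obj_type.decode("ascii", "replace")
```

A file that fails to inflate or hashes to a different name would point at corruption in the file itself rather than in a pack.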
>
> So that's why you should double-check that it really ends up being
> corrupt after being untarred again.
>
> - go back and test the original git repo on the network share, preferably
> on another client. See if the error has gone away.
Unfortunately the original developer I made the tar from is still using
that repository with our 1.6.1 build. He hasn't reported any issues, but
I can't exactly steal it back (that's why I made the tar).
> The fact that you seem to get a _lot_ of these errors really does make it
> sound like something in your environment. It's actually really hard to get
> git to corrupt anything. Especially objects that got packed. They've been
> quiescent for a long time, they got repacked in a very simple way, they
> are totally read-only.
I checked with our operations team, and contrary to my suspicion (your
NFS comment piqued my curiosity), these disks that are actually on the
machines are not NFS mounts but rather local disk arrays.
--> is it NFSd? or all local storage
<== all local
<== df -h
<== mount
<== /dev/sda5 705G 247G 423G 37% /nail
--> hm, there goes that theory
<== git corruption?
--> yeah, looking into it
<== sucks
--> Linus had a theory about NFS/etc corruption of the disk cache
<== when the company folds we can all blame you... and your silly git games
<== (think positive, joel)
--> thanks
;)
Anything else I can do to help debug this? :-/
Cheers
--
-R. Tyler Ballance
Slide, Inc.