From: Jeff King <peff@peff.net>
To: Earl Gresh <egresh@codeaurora.org>
Cc: git@vger.kernel.org
Subject: Re: Missing Refs after Garbage Collection
Date: Sat, 22 Dec 2012 20:04:27 -0500
Message-ID: <20121223010427.GA2878@sigill.intra.peff.net>
In-Reply-To: <C0A16EC8-D05A-41D0-BF2A-34BF3B1B839E@codeaurora.org>
On Fri, Dec 21, 2012 at 05:41:43PM -0800, Earl Gresh wrote:
> I have observed that after running GC, one particular git repository
> ended up with some missing refs in the refs/changes/* namespace that
> Gerrit uses for storing patch sets. The refs were valid and should not
> have been pruned. Concerned about losing data, GC is still enabled
> but ref packing is turned off. Now the number of refs has grown to the
> point that it's causing performance problems when cloning the project.
>
> Is anyone familiar with git gc deleting valid references? I'm running
> git version 1.7.8. Have there been any patches in later git releases
> that might address this issue ( if it is a git problem )?
I have never seen deletion, but I did recently find a race condition
with ref packing that caused rewinds, where:
  1. Two processes simultaneously repack the refs.
  2. At least one process is using an "old" version of the packed-refs
     file. That is, it cached the packed refs list earlier in the
     process and is now rewriting it based on that cached notion.
  3. The first process takes the lock, packs refs, drops the
     lock, and then deletes the loose versions. The simultaneous packer
     then takes the lock, overwrites the packed-refs file with a stale
     copy from its memory, and then releases the lock. We're left with
     the stale copy in packed-refs, and deleted loose refs.
In my case, it looked like a rewind, because the stale, memory-cached
refs had the old version. But if you have a ref which was not previously
packed, it would appear to have been deleted.
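
To make the interleaving concrete, here is a toy model of it in Python.
It is only a sketch: the two dicts standing in for the packed-refs file
and the loose ref files, and the names pack_refs,
delete_ref_with_stale_cache, and resolve, are made up for illustration
and are not git's actual code. Locking is not modeled; each function
call stands for one complete lock/write/unlock cycle.

```python
# Toy model (assumption): "packed" is the packed-refs file, "loose" is the
# set of loose ref files. These are not git's real data structures.

def pack_refs(repo):
    """A fresh packer: read packed-refs, fold in loose refs, write, prune."""
    merged = dict(repo["packed"])
    merged.update(repo["loose"])   # a loose ref overrides its packed entry
    repo["packed"] = merged        # packed-refs rewritten under the lock
    repo["loose"] = {}             # lock dropped, loose versions deleted


def delete_ref_with_stale_cache(repo, cache, name):
    """A deleter that rewrites packed-refs from a copy it cached earlier."""
    repo["loose"].pop(name, None)
    stale = {k: v for k, v in cache.items() if k != name}
    repo["packed"] = stale         # overwrites whatever is on disk now


def resolve(repo, name):
    """Loose refs win over packed refs; None means the ref is gone."""
    return repo["loose"].get(name, repo["packed"].get(name))


repo = {
    "packed": {"refs/heads/master": "old", "refs/heads/tmp": "t"},
    "loose": {"refs/heads/master": "new", "refs/changes/01/1/1": "c1"},
}

cache = dict(repo["packed"])       # long-lived process caches packed-refs
pack_refs(repo)                    # first process packs and prunes
delete_ref_with_stale_cache(repo, cache, "refs/heads/tmp")

print(resolve(repo, "refs/heads/master"))    # old  (a rewind)
print(resolve(repo, "refs/changes/01/1/1"))  # None (looks deleted)
```

The ref that existed in the cached copy rewinds to its old value, while
the ref that was only ever loose vanishes entirely.
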
The tricky thing about triggering this race is that step (2) needs a
process which has previously read and cached the packed-refs, and then
decided to pack the refs. The "git pack-refs" command does not do this,
because it starts, packs the refs, and exits. But processes which delete
a ref need to rewrite the packed-refs file (omitting the deleted ref),
and depending on the process, may have previously read and cached the
packed refs file. The obvious candidate is "receive-pack".
So this may be your culprit if:
  1. This is a repo people are pushing into via C git.
  2. You simultaneously run "git pack-refs" (or "git gc") while people
     may be pushing.
You mentioned Gerrit, so I wonder if people are actually pushing via C
git (I thought it used JGit entirely). Or perhaps JGit has the same bug.
My fix (which is not yet released in any git version) is here:
http://article.gmane.org/gmane.comp.version-control.git/211956
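
As a sketch of the general technique for closing this kind of race (not
a description of what the linked patch does), a writer can distrust its
cached copy and re-read the file once it holds the lock. The names
rewrite_without, scenario, and reread_under_lock below are hypothetical,
and the dict is a stand-in for the on-disk packed-refs file:

```python
# Toy model (assumption): disk["packed"] stands in for the packed-refs file.

def rewrite_without(disk, cached, name, reread_under_lock):
    """Rewrite packed-refs omitting `name`.

    Real code would hold the packed-refs lock for this whole function.
    """
    source = disk["packed"] if reread_under_lock else cached
    disk["packed"] = {k: v for k, v in source.items() if k != name}


def scenario(reread_under_lock):
    disk = {"packed": {"refs/heads/tmp": "t"}}
    cached = dict(disk["packed"])                 # read early, kept in memory
    disk["packed"]["refs/changes/01/1/1"] = "c1"  # another process packs a ref
    rewrite_without(disk, cached, "refs/heads/tmp", reread_under_lock)
    return disk["packed"]


print(scenario(reread_under_lock=False))  # {}
print(scenario(reread_under_lock=True))   # {'refs/changes/01/1/1': 'c1'}
```

With the stale cache the newly packed ref is lost; re-reading under the
lock keeps it.
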
-Peff