git.vger.kernel.org archive mirror
From: Siddharth Agarwal <sid0@fb.com>
To: <git@vger.kernel.org>
Subject: fetches with bitmaps enabled can cause accesses to already GC'd objects
Date: Tue, 25 Mar 2014 19:22:37 -0700
Message-ID: <533239ED.5040503@fb.com>

Hi,

We're still experimenting with bitmaps, and we've run into an issue:
fetching from a repository with bitmaps enabled can make the server
access objects that used to be present there but have since been GC'd,
and git pack-objects on the server then fails.

I can consistently reproduce this with a particular pair of repos, and 
tip of git master (3f09db0) with no patches on top running on both ends. 
git fetch fails with

remote: error: Could not read be7cbe440a7b9a34f53515af4075e971c811cfb2
remote: fatal: bad tree object be7cbe440a7b9a34f53515af4075e971c811cfb2
error: git upload-pack: git-pack-objects died with error.
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: aborting due to possible repository corruption on the remote side.
fatal: protocol error: bad pack header

Removing the bitmap fixes this.
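
For readers who want to try the same workaround: a pack's bitmap index is stored next to the packfile as `pack-<hash>.bitmap` under `objects/pack`. The message does not say how the bitmap was removed, so the following is only a minimal sketch that lists candidate files without deleting anything. The helper name `find_bitmap_files` is hypothetical.

```python
from pathlib import Path


def find_bitmap_files(git_dir):
    """Return the pack bitmap index files under <git_dir>/objects/pack, sorted.

    Hypothetical helper for illustration; it only lists files and never
    deletes them.
    """
    pack_dir = Path(git_dir) / "objects" / "pack"
    if not pack_dir.is_dir():
        return []
    return sorted(pack_dir.glob("pack-*.bitmap"))
```

For a bare repository, pass the repository directory itself; for a non-bare one, pass its `.git` directory.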

be7cbe440a7b9a34f53515af4075e971c811cfb2 is a tree object that is 
present on the client but not on the server. It used to be present on 
the server, but any refs that it was reachable from have been 
removed and the object has since been garbage collected. One ref that 
this object was reachable from and that used to be on the server is 
still present on the client though, under refs/remotes/origin/.
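
The "present on the client but not on the server" state can be checked on each side with `git cat-file -e`, which exits with status zero only when the object exists. A minimal Python sketch, assuming `git` is on the PATH; the helper name `object_exists` and its injectable `run` parameter are illustrative and not part of git:

```python
import subprocess


def object_exists(repo_path, object_id, run=subprocess.run):
    """Return True if `git cat-file -e` reports the object as present.

    `run` is injectable (an assumption of this sketch) so the function
    can be exercised without a real repository.
    """
    result = run(
        ["git", "-C", str(repo_path), "cat-file", "-e", object_id],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

In the situation described above, one would expect this to return True for be7cbe440a7b9a34f53515af4075e971c811cfb2 in the client repository and False in the server repository.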

This tree object seems to be reachable from exactly one other tree 
object, and so on, until I reach a commit object. Note that the commit 
and root tree leading to be7cbe440a7b9a34f53515af4075e971c811cfb2 are 
still present as loose objects in the repo.

I dug into this a bit, and it looks like the bad access is inside 
https://github.com/git/git/blob/3f09db0/pack-bitmap.c#L730, and from 
there inside https://github.com/git/git/blob/3f09db0/pack-bitmap.c#L575. 
This ultimately calls traverse_commit_list at 
https://github.com/git/git/blob/3f09db0/list-objects.c#L195, which adds 
the tree that transitively points to 
be7cbe440a7b9a34f53515af4075e971c811cfb2 as pending. (Note again that 
the commit and root tree objects still exist in the repo as loose 
objects.) Further down in that function, process_tree is called, which 
traverses the tree and ultimately dies at 
https://github.com/git/git/blob/3f09db0/list-objects.c#L85.
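
The failure described above can be illustrated with a toy model. This is not git's code: the object store is a plain dictionary, the ids are made up, and `walk` and `MissingObject` are hypothetical names. The `ignore_missing_links` flag name is borrowed from the subject of the follow-up patch in this thread, but nothing here is taken from that patch. The sketch only shows why a walk that starts from a surviving commit and root tree dies once it reaches a subtree that has been garbage collected.

```python
class MissingObject(Exception):
    """Raised when a referenced object is absent from the store."""


def walk(store, commit_id, ignore_missing_links=False):
    """Yield the ids of a commit, its root tree, and everything below it.

    `store` maps id -> ("commit", root_tree_id), ("tree", [child ids])
    or ("blob",). This is a toy model, not git's object format.
    """
    pending = [commit_id]
    seen = set()
    while pending:
        oid = pending.pop()
        if oid in seen:
            continue
        seen.add(oid)
        entry = store.get(oid)
        if entry is None:
            if ignore_missing_links:
                continue  # tolerate the dangling link and keep walking
            raise MissingObject(f"bad object {oid}")
        yield oid
        if entry[0] == "commit":
            pending.append(entry[1])
        elif entry[0] == "tree":
            pending.extend(entry[1])


# The commit and root tree survive, but subtree "t_sub" has been collected.
store = {
    "c1": ("commit", "t_root"),
    "t_root": ("tree", ["t_sub", "b1"]),
    "b1": ("blob",),
}
```

With this store, `list(walk(store, "c1"))` raises `MissingObject` when it reaches `t_sub`, while `list(walk(store, "c1", ignore_missing_links=True))` returns `["c1", "t_root", "b1"]`.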

Unfortunately, as before, I can't share the repo this is happening in.


Thread overview: 4+ messages
2014-03-26  2:22 Siddharth Agarwal [this message]
2014-03-28 10:00 ` [PATCH] add `ignore_missing_links` mode to revwalk Jeff King
2014-03-31 21:48   ` Siddharth Agarwal
2014-04-01  7:54     ` Jeff King
