From: Siddharth Agarwal <sid0@fb.com>
To: <git@vger.kernel.org>
Subject: fetches with bitmaps enabled can cause accesses to already GC'd objects
Date: Tue, 25 Mar 2014 19:22:37 -0700
Message-ID: <533239ED.5040503@fb.com>
Hi,
We're still experimenting with bitmaps, and we've run into issues
where fetching from a repository with bitmaps enabled makes the server
access objects that used to be present there but have since been GC'd,
which in turn causes git pack-objects on the server to fail.
I can consistently reproduce this with a particular pair of repos, and
tip of git master (3f09db0) with no patches on top running on both ends.
git fetch fails with

    remote: error: Could not read be7cbe440a7b9a34f53515af4075e971c811cfb2
    remote: fatal: bad tree object be7cbe440a7b9a34f53515af4075e971c811cfb2
    error: git upload-pack: git-pack-objects died with error.
    fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
    remote: aborting due to possible repository corruption on the remote side.
    fatal: protocol error: bad pack header

Removing the bitmap fixes this.
be7cbe440a7b9a34f53515af4075e971c811cfb2 is a tree object that is
present on the client but not on the server. It used to be present on
the server, but any refs that it was reachable from have been
removed and the object has since been garbage collected. One ref that
this object was reachable from and that used to be on the server is
still present on the client though, under refs/remotes/origin/.
This tree object seems to be reachable from exactly one other tree
object, and so on, until I reach a commit object. Note that the commit
and root tree pointing to be7cbe440a7b9a34f53515af4075e971c811cfb2 are
still present as loose objects in the repo.
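
To make the "exactly one referrer, and so on, until a commit" chain concrete, here is a toy Python model. It is only an illustration, not git code: the object ids, the dictionary layout, and the helper names `referrers` and `chain_to_commit` are all made up for this sketch.

```python
# Toy model: each object id maps to (type, list of ids it points to).
# All ids are hypothetical placeholders, not real object names.
CLIENT_OBJECTS = {
    "commit-1": ("commit", ["root-tree"]),
    "root-tree": ("tree", ["mid-tree"]),
    "mid-tree": ("tree", ["gone-tree"]),
    "gone-tree": ("tree", []),
}


def referrers(objects, target):
    """Return the sorted ids of objects that point directly at target."""
    return sorted(
        oid for oid, (_type, children) in objects.items() if target in children
    )


def chain_to_commit(objects, target):
    """Follow unique referrers upward from target until a commit is reached."""
    chain = [target]
    current = target
    while objects[current][0] != "commit":
        parents = referrers(objects, current)
        if len(parents) != 1:
            raise ValueError(f"{current} has {len(parents)} referrers, expected 1")
        current = parents[0]
        chain.append(current)
    return chain
```

Calling `chain_to_commit(CLIENT_OBJECTS, "gone-tree")` walks from the tree up through each single referrer to the commit, which mirrors the manual inspection described above.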
I dug into this a bit, and it looks like the bad access is inside
https://github.com/git/git/blob/3f09db0/pack-bitmap.c#L730, and from
there inside https://github.com/git/git/blob/3f09db0/pack-bitmap.c#L575.
This ultimately calls traverse_commit_list at
https://github.com/git/git/blob/3f09db0/list-objects.c#L195, which adds
the tree that transitively points to
be7cbe440a7b9a34f53515af4075e971c811cfb2 as pending. (Note again that
the commit and root tree objects still exist in the repo as loose
objects.) Further down in that function, process_tree is called, which
traverses the tree and ultimately dies at
https://github.com/git/git/blob/3f09db0/list-objects.c#L85.
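
The failure mode can be sketched with a small Python model of the walk. This is a simplification under stated assumptions, not the code in list-objects.c: the store layout, the ids, and the `walk` function are hypothetical. The `ignore_missing_links` flag borrows its name from the subject of a follow-up patch in this thread, and the skip-on-missing behavior shown here is my assumption about what such a mode would do.

```python
class MissingObjectError(Exception):
    """Raised when the walk reaches an object the store no longer has."""


# Toy server-side store: the commit and root tree survive,
# but a tree further down has been garbage collected.
SERVER_OBJECTS = {
    "commit-1": ("commit", ["root-tree"]),
    "root-tree": ("tree", ["mid-tree"]),
    "mid-tree": ("tree", ["gone-tree"]),
    # "gone-tree" is deliberately absent.
}


def walk(objects, start, ignore_missing_links=False):
    """Depth-first walk from start, returning the ids visited in order."""
    seen = []
    pending = [start]
    while pending:
        oid = pending.pop()
        if oid in seen:
            continue
        if oid not in objects:
            if ignore_missing_links:
                continue  # skip the dangling link instead of aborting
            raise MissingObjectError(f"bad tree object {oid}")
        seen.append(oid)
        _type, children = objects[oid]
        # Reverse so that children are popped in their listed order.
        pending.extend(reversed(children))
    return seen
```

With the default arguments, `walk(SERVER_OBJECTS, "commit-1")` raises once it reaches the collected tree, much like the "bad tree object" error above. Passing `ignore_missing_links=True` lets the walk finish with only the objects that still exist.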
Unfortunately, as before, I can't share the repo this is happening in.
Thread overview: 4+ messages
2014-03-26 2:22 Siddharth Agarwal [this message]
2014-03-28 10:00 ` [PATCH] add `ignore_missing_links` mode to revwalk Jeff King
2014-03-31 21:48 ` Siddharth Agarwal
2014-04-01 7:54 ` Jeff King