From: Jeff King <peff@peff.net>
To: Siddharth Agarwal <sid0@fb.com>
Cc: git@vger.kernel.org, bmaurer@fb.com, Aaron Kushner <akushner@fb.com>
Subject: Re: with reuse-delta patches, fetching with bitmaps segfaults due to possibly incomplete bitmap traverse
Date: Sat, 22 Mar 2014 08:56:27 -0400
Message-ID: <20140322125626.GA22890@sigill.intra.peff.net>
In-Reply-To: <532CFC6F.8000008@fb.com>

On Fri, Mar 21, 2014 at 07:58:55PM -0700, Siddharth Agarwal wrote:

> At Facebook we've found that fetch speed is a bottleneck for our Git repos,
> so we've been looking to deploy bitmaps to speed up fetches. We've been
> trying out git-next with the top two patches from
> https://github.com/peff/git/commits/jk/bitmap-reuse-delta, but the following
> is reproducible with tip of that branch, currently 81cdec2.

Is it also reproducible just with the tip of "next"? Note that the
patches in jk/bitmap-reuse-delta have not been widely deployed (in
particular, we are not yet using them at GitHub, and we track segfaults
on our servers closely and have not seen any related to this).

Those patches allocate extra "fake" entries in the entry->delta fields,
which are not accounted for in to_pack.nr_objects. It's entirely
possible that those entries are related to the bug you are seeing.

> I dug into this a bit and it looks like at this point:
> 
> https://github.com/peff/git/blob/81cdec28fa24fdc613ab7c3406c1c67975dbf22f/builtin/pack-objects.c#L700
> 
> at some object that add_family_to_write_order is called for, wo_end exceeds
> to_pack.nr_objects by over 1000 objects. More precisely, at the point it
> crashes, wo_end is 218081 while to_pack.nr_objects is 201614. (This means
> wo_end overshot to_pack.nr_objects some time ago.)

Hmm, yeah, that confirms my suspicion. In the earlier loops, we call
add_to_write_order, which only adds the object in question, and can
never exceed to_pack.nr_objects. In this final loop, we call
add_family_to_write_order, which is going to add any deltas that were
not already included.

The patch below may fix your problem, but I have a feeling it is not the
right thing to do. The point of 81cdec28 is to try to point to a delta
entry as if it were a "preferred base" (i.e., something we know that the
other side has already). We perhaps want to add these entries to the
actual packing list, and skip them as we do with normal preferred_base
objects.

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 9fc5321..ca1b0f7 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1437,6 +1437,7 @@ static void check_object(struct object_entry *entry)
 			entry->delta = xcalloc(1, sizeof(*entry->delta));
 			hashcpy(entry->delta->idx.sha1, base_ref);
 			entry->delta->preferred_base = 1;
+			entry->delta->filled = 1;
 			unuse_pack(&w_curs);
 			return;
 		}
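
To make the overshoot concrete, here is a toy model of the write-order
bookkeeping. It is only a sketch and not git's code: struct entry, add_one,
add_family and count_write_order are hypothetical names, and the real
add_family_to_write_order does more work than this. The model keeps just
the two ingredients discussed above: delta bases allocated outside the
counted array, and a "filled" flag that stops an entry from being placed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an object entry; only the fields used here. */
struct entry {
	struct entry *delta;	/* delta base, or NULL */
	int filled;		/* already placed in the write order? */
};

/* Append one entry unless it is already marked as placed. */
static void add_one(struct entry **wo, unsigned *endp, struct entry *e)
{
	if (e->filled)
		return;
	wo[(*endp)++] = e;
	e->filled = 1;
}

/* Simplified "family" add: place the base chain first, then e itself. */
static void add_family(struct entry **wo, unsigned *endp, struct entry *e)
{
	if (e->delta)
		add_family(wo, endp, e->delta);
	add_one(wo, endp, e);
}

/*
 * Create nr counted objects and give the first nr_fake of them a base
 * that lives outside the counted array. Return how many write-order
 * slots end up being used. The scratch array is oversized on purpose,
 * so the model counts the overshoot instead of writing out of bounds.
 */
static unsigned count_write_order(unsigned nr, unsigned nr_fake, int fake_filled)
{
	struct entry *objects, *fakes, **wo;
	unsigned i, wo_end = 0;

	assert(nr_fake <= nr);
	objects = calloc(nr + 1, sizeof(*objects));
	fakes = calloc(nr_fake + 1, sizeof(*fakes));
	wo = calloc(nr + nr_fake + 1, sizeof(*wo));
	assert(objects && fakes && wo);

	for (i = 0; i < nr_fake; i++) {
		fakes[i].filled = fake_filled;	/* the one-line change above */
		objects[i].delta = &fakes[i];
	}
	for (i = 0; i < nr; i++)
		add_family(wo, &wo_end, &objects[i]);

	free(wo);
	free(fakes);
	free(objects);
	return wo_end;
}
```

With 5 counted objects and 2 fake bases, count_write_order(5, 2, 0) comes
out at 7, two more than the counted total, while count_write_order(5, 2, 1)
stays at 5 because the pre-filled bases are skipped.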

-Peff
