All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1
Date: Wed, 22 Nov 2017 17:36:14 -0500	[thread overview]
Message-ID: <20171122223613.GA1405@sigill> (raw)
In-Reply-To: <xmqqwp2jxf5l.fsf@gitster.mtv.corp.google.com>

On Wed, Nov 22, 2017 at 10:42:30AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I'm not sure what the right behavior is, but I'm pretty sure that's not
> > it. Probably one of:
> >
> >   - skip updating the ref when we see the breakage
> >
> >   - ditto, but terminate the whole operation, since we might be deleting
> >     other refs and in a broken repo we're probably best to make as few
> >     changes as possible
> >
> >   - behave as if it was a non-ff, which would allow "--force" to
> >     overwrite the broken ref. Maybe convenient for fixing things, but
> >     possibly surprising (and it's not that hard to just delete the
> >     broken refs manually before proceeding).
> 
> Perhaps the last one would be the ideal endgame, but the second one
> may be a good stopping point in the shorter term.

This turns out to be a lot trickier than I expected. The crux of the
matter is that the case we care about is hidden inside
lookup_commit_reference_gently(), which doesn't distinguish between
corruption and "not a commit".

So there are four cases we care about for this call in fetch:

  1. We fed a real sha1 and got a commit (or peeled to one).

  2. We fed a real sha1 which resolved to a non-commit, and we got NULL.

  3. We fed a real sha1 and the object was missing or corrupted, and we
     got NULL.

  4. We fed a null sha1 and got NULL.

Right now we lump cases 2-4 together as "do not do a fast-forward
check". That's fine for 2 and 4, but probably not for 3. We can easily
catch case 4 ourselves (if we care to), but distinguishing case 3 from
the others is hard. How should lookup_commit_reference_gently() signal
it to us?

Or should lookup_commit_reference_gently() die on corruption? That's not
very "gentle", but I think the "gently" here is really about "it might
not be a commit", not "the repo might be corrupted". But I think even
that may be the tip of the iceberg. The next thing we do is feed the
commits to in_merge_bases(), which will happily return "nope" if the old
commit cannot be parsed (because it has only a boolean return value).

So I dunno. Maybe it is a losing battle to try to pass this kind of
corruption information up the stack.  I'm tempted to say that there
should just be a "paranoid" flag to globally die() whenever we see a
corruption (and you could run with it normally, but relax it whenever
you're investigating a broken repo). But I doubt even that works. Not
having the "old_oid" object at all would be a repo corruption here, but
how are the low-level routines supposed to know when a missing object is
a corruption and when it is not?

-Peff

  reply	other threads:[~2017-11-22 22:36 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-20 20:26 [PATCH 0/5] avoiding pointless pack-directory re-scans Jeff King
2017-11-20 20:26 ` [PATCH 1/5] p5550: factor our nonsense-pack creation Jeff King
2017-11-20 23:55   ` Eric Sunshine
2017-11-21 15:58     ` Jeff King
2017-11-22  0:32       ` Stefan Beller
2017-11-22 22:38         ` Jeff King
2017-11-23  2:41           ` Junio C Hamano
2017-11-23  5:02             ` Jeff King
2017-11-20 20:27 ` [PATCH 2/5] t/perf/lib-pack: use fast-import checkpoint to create packs Jeff King
2017-11-20 20:28 ` [PATCH 3/5] p5551: add a script to test fetch pack-dir rescans Jeff King
2017-11-20 20:29 ` [PATCH 4/5] everything_local: use "quick" object existence check Jeff King
2017-11-20 20:35 ` [PATCH 5/5] sha1_file: don't re-scan pack directory for null sha1 Jeff King
2017-11-20 20:47   ` Stefan Beller
2017-11-20 20:58     ` Jeff King
2017-11-21  2:37   ` Junio C Hamano
2017-11-21 22:57     ` Jeff King
2017-11-22  1:42       ` Junio C Hamano
2017-11-22 22:36         ` Jeff King [this message]
2017-11-23  2:35           ` Junio C Hamano
2017-11-24 17:32             ` Jeff King
2017-11-25  3:20               ` Junio C Hamano
2017-11-21  5:20   ` Junio C Hamano
2017-11-21 23:17     ` Jeff King
2017-11-22  1:49       ` Junio C Hamano
2017-11-22  3:17         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171122223613.GA1405@sigill \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.