From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [BUG] diffcore-rename with duplicate tree entries can segfault
Date: Tue, 24 Feb 2015 17:49:18 -0500
Message-ID: <20150224224918.GA24749@peff.net>
In-Reply-To: <xmqqh9uborrx.fsf@gitster.dls.corp.google.com>

On Tue, Feb 24, 2015 at 02:42:42PM -0800, Junio C Hamano wrote:

> > That does fix this problem, and it doesn't break any other tests. But
> > frankly, I don't know what I'm doing and this feels like a giant hack.
> >
> > Given that this is tangentially related to the "-B -M" stuff you've been
> > looking at (and it's your code in the first place :) ), I thought you
> > might have some insight.
> 
> Indeed.
> 
> Honestly, I'd rather see us diagnose duplicate filepairs as an error
> and drop them on the floor upon entry to the diffcore_std(), even
> before we go into the rename codepath.

Yeah, I had a similar thought. Just saying "your diff is broken, we
can't do rename detection" is totally reasonable to me.

My main concern with that approach is that we would waste time finding
the duplicate paths, for something that comes up only rarely. At the
time of locate_rename_dst, we've already created a mapping, and it's
very easy to detect the duplicates. But before that, we have only the
linear list of queued items.

In theory they're sorted and we could do an O(n) pass to find
duplicates. But I'm not sure if the sorting holds in the face of other
breakages (like unsorted trees; they also shouldn't happen, but the
whole point here is to gracefully handle things that shouldn't).

I dunno. Maybe we could do an O(n) pass to check sort order and
uniqueness. If either fails (which should be rare), then we sort and
re-check uniqueness.
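
A rough sketch of that "scan first, sort only on failure" idea, not code
from diffcore itself. It assumes the queue has been reduced to a plain
array of path strings and that plain strcmp() ordering is acceptable;
scan_paths(), cmp_paths(), and has_duplicate_paths() are hypothetical
names made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Outcome of one linear scan over the path list. */
enum path_check {
	PATHS_OK,        /* strictly ascending: sorted and unique */
	PATHS_UNSORTED,  /* some entry sorts before its predecessor */
	PATHS_DUPLICATE  /* in order, but two neighbors are equal */
};

/* O(n): compare each entry with its predecessor (assumes strcmp order). */
enum path_check scan_paths(const char **paths, size_t nr)
{
	enum path_check ret = PATHS_OK;
	size_t i;

	for (i = 1; i < nr; i++) {
		int cmp = strcmp(paths[i - 1], paths[i]);
		if (cmp > 0)
			return PATHS_UNSORTED; /* equal neighbors prove nothing yet */
		if (!cmp)
			ret = PATHS_DUPLICATE;
	}
	return ret;
}

/* qsort() comparator for an array of C strings. */
int cmp_paths(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Returns 1 if any path appears twice, 0 otherwise. The common case is
 * a single O(n) scan; only unsorted input pays for the O(n log n) sort
 * and a second scan. Note that this may reorder the array in place.
 */
int has_duplicate_paths(const char **paths, size_t nr)
{
	enum path_check res = scan_paths(paths, nr);

	if (res == PATHS_UNSORTED) {
		qsort(paths, nr, sizeof(*paths), cmp_paths);
		res = scan_paths(paths, nr);
	}
	return res == PATHS_DUPLICATE;
}
```

Sorting a copy instead of the original array would avoid disturbing the
queue order, at the cost of an allocation in the (rare) broken case.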

I'm assuming there _is_ a sane sort order. We have two halves of a
filepair, but I think before any of the rename or break detection kicks
in, each pair should either:

  1. Have a name in pair->one, and an invalid filespec in pair->two
     (i.e., a deletion).

  2. Have a name in pair->two, and an invalid filespec in pair->one
     (i.e., an addition).

  3. Have the same name in both pair->one and pair->two.
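
If those three cases hold, each pair has exactly one name to sort by.
Here is a sketch of picking that name. The struct spec and struct pair
types are simplified stand-ins invented for this example, not the real
diff_filespec and diff_filepair, and pair_sort_key() is a hypothetical
helper.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for one side of a filepair (an assumption). */
struct spec {
	const char *path;
	int valid;       /* 0 means "no such file" on this side */
};

/* Simplified stand-in for a queued filepair (an assumption). */
struct pair {
	struct spec one; /* preimage */
	struct spec two; /* postimage */
};

/*
 * Return the single name a pair should sort under, or NULL if the pair
 * fits none of the three expected shapes (for example, both sides are
 * valid but carry different names, as after rename detection).
 */
const char *pair_sort_key(const struct pair *p)
{
	if (p->one.valid && !p->two.valid)
		return p->one.path; /* case 1: deletion */
	if (!p->one.valid && p->two.valid)
		return p->two.path; /* case 2: addition */
	if (p->one.valid && p->two.valid &&
	    !strcmp(p->one.path, p->two.path))
		return p->two.path; /* case 3: same name on both sides */
	return NULL;
}
```

A NULL return would be the signal that the "sane sort order" assumption
does not hold for this queue.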

-Peff


Thread overview: 12+ messages
2015-02-24 21:43 [BUG] diffcore-rename with duplicate tree entries can segfault Jeff King
2015-02-24 22:42 ` Junio C Hamano
2015-02-24 22:49   ` Jeff King [this message]
2015-02-24 23:11     ` Junio C Hamano
2015-02-24 23:47       ` Jeff King
2015-02-25  5:00         ` Junio C Hamano
2015-02-25 21:40           ` Jeff King
2015-02-25 21:50             ` Junio C Hamano
2015-02-27  1:38               ` [PATCH 0/2] " Jeff King
2015-02-27  1:39                 ` [PATCH 1/2] diffcore-rename: split locate_rename_dst into two functions Jeff King
2015-02-27  1:42                 ` [PATCH 2/2] diffcore-rename: avoid processing duplicate destinations Jeff King
2015-02-27 21:48                   ` Junio C Hamano
