From: Jon Seymour <jon.seymour@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Anders Kaseorg <andersk@mit.edu>, git@vger.kernel.org
Subject: Re: Bugs with detection of renames that are also overwrites
Date: Tue, 23 Feb 2010 23:53:13 +1100 [thread overview]
Message-ID: <2cfc40321002230453q3a593fa6l43850207ebddfc99@mail.gmail.com> (raw)
In-Reply-To: <20100223122257.GA1735@coredump.intra.peff.net>
I came across a similar case recently while testing my rebasing ideas
which lead me to wonder - is there a way to suppress the rename and
overwrite detection? For my purposes, I don't care if the diff is
bulky, I just want it to apply reliably.
jon
On Tue, Feb 23, 2010 at 11:22 PM, Jeff King <peff@peff.net> wrote:
> On Sat, Feb 13, 2010 at 06:46:55PM -0500, Anders Kaseorg wrote:
>
>> One can create a rename that also overwrites an existing file, for example
>> with ‘git mv -f’:
>>
>> $ git init
>> $ seq 100 200 > a; seq 300 400 > b; git add a b
>> $ git commit -m foo; git tag foo
>> $ git mv -f a b
>> $ git commit -m bar; git tag bar
>>
>> Git does not ordinarily detect this as a rename, even with -M.
>
> Right. Git looks at only a subset of the files when looking for renames.
> For straight renames, the set of possible sources is the list of deleted
> files, and the possible destinations are the new files.
>
> Finding copies ("-C") is similar, except that we also consider files
> that were modified but not deleted. And --find-copies-harder will look
> at _all_ files as sources, not just modified ones.
>
> But what you are asking for is to expand the possible destination list
> to include files that were modified but are not new. I don't think there
> is currently a way to do that explicitly.
>
> As you discovered, though, if either the source or destination file has
> changed significantly, we should break them apart using "-B":
>
>> But it can (sometimes*) be forced to detect the rename with -M -B.
>>
>> $ git diff --stat -M -B foo bar
>> a => b | 0
>> 1 files changed, 0 insertions(+), 0 deletions(-)
>
> In which case the rename engine sees the deletion and addition
> separately (they just happen to have the same path name), and therefore
> the path gets added to the source and destination lists.
>
>> However, the resulting patch incorrectly omits the deletion of the
>> overwritten file!
>>
>> $ git diff -M -B foo bar
>> diff --git a/a b/b
>> similarity index 100%
>> rename from a
>> rename to b
>>
>> Therefore, the patch can’t be applied to its own source tree.
>>
>> $ git checkout foo
>> $ git diff -M -B foo bar | git apply
>> error: b: already exists in working directory
>
> Hmm. I think this is a conflict between what the user wants to see and
> what apply wants to see. From the user's perspective, thinking about the
> diff representing a change, "b" was not really deleted. It was simply
> overwritten. But from apply's perspective, the diff is a set of
> instructions, and those instructions do not mention that "b" previously
> existed and was overwritten. So it is right to be cautious and declare a
> conflict.
>
> I'm not sure just throwing a "delete" line in there would be the best
> thing, though. The instructions for apply really need to be "if 'b' has
> this sha1, then it is safe to delete and rename a into place. Otherwise
> it is a conflict". And I'm not sure how we would represent that (I guess
> with a full diff and an "index" line).
>
>> Also, a question: is it possible to get ‘git merge’ to recognize this
>> rename?
>
> No, I don't think there is a way currently. You would need to patch git
> to set opts.break_opt in merge-recursive.c:get_renames, I think.
>
>> (* I say “sometimes” because -B only detects the rewrite if it deems the
>> renamed file to be sufficiently different than the overwritten file. If
>> you use 190 and 390 instead of 200 and 400 above, the rewrite and rename
>> will not be detected, even though the rename would be detected if it was
>> not an overwrite. This also feels like a bug.)
>
> I think you are getting into a philosophical discussion of what is a
> rename, here. If "a" and "b" are very similar, "a" is removed, and "b"
> has the same (or similar) content as "a", was it a rename from "a", or
> was it simply that "b" was changed, possibly to incorporate the
> duplicated items in "a"?
>
> So I don't think it is a bug. But that isn't to say you can't come up
> with a case where it would be nice, during a diff or a merge, to show
> things that way. I mentioned at the beginning of the message that what
> you wanted was to expand the list of destination possibilities. You
> could have a "--find-overwrites" which puts all of the modified files
> into the destination list. You would not want it on by default, though,
> I think, as it would add a lot of computation time to find a somewhat
> rare case.
>
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2010-02-23 12:53 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-13 23:46 Bugs with detection of renames that are also overwrites Anders Kaseorg
2010-02-23 12:22 ` Jeff King
2010-02-23 12:53 ` Jon Seymour [this message]
2010-02-23 12:59 ` Jeff King
2010-02-23 19:08 ` Anders Kaseorg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2cfc40321002230453q3a593fa6l43850207ebddfc99@mail.gmail.com \
--to=jon.seymour@gmail.com \
--cc=andersk@mit.edu \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).