All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Subject: [Git BUG] Please do not use "-B -M" in "diff" family for now
Date: Sat, 31 Jan 2015 11:12:34 -0800	[thread overview]
Message-ID: <xmqqegqaahnh.fsf@gitster.dls.corp.google.com> (raw)

Please avoid the combination "-B -M" when running "diff" family of
commands, as it can produce incorrect results [*1*] in corner cases.
Use of either "-B" or "-M" by itself is fine, but not both at the
same time.

This problem exists even in Git v1.7.12, and I have no reason to
suspect that it worked correctly in any earlier versions, so I do
not consider it an urgent issue to fix during the pre-release for
upcoming Git v2.3 release.

End of TL;DR.

For a simple reproduction, go to your copy of the kernel tree and do
this:

    $ git diff -B -M v2.6.13 v2.6.12 -- \
        arch/ppc64/kernel/rtas_pci.c arch/ppc64/kernel/pSeries_pci.c >:patch

    $ git reset --hard
    $ git checkout v2.6.13

    $ git apply --cached --whitespace=nowarn :patch
    error: arch/ppc64/kernel/pSeries_pci.c: already exists in index

This is not a bug in "apply", but in "diff".  The resulting patch
looks like this:

    $ git apply --whitespace=nowarn --numstat --summary :patch
    112     5       arch/ppc64/kernel/pSeries_pci.c
     rename arch/ppc64/kernel/{rtas_pci.c => pSeries_pci.c} (81%)

That is, it wants to rename rtas_pci.c to pSeries_pci.c with a bit
of editing.

However, what really happens when going from 2.6.13 to 2.6.12 is
this:

    $ git diff v2.6.13 v2.6.12 -- \
        arch/ppc64/kernel/rtas_pci.c arch/ppc64/kernel/pSeries_pci.c |
        git apply --whitespace=nowarn --numstat --summary
    478     19      arch/ppc64/kernel/pSeries_pci.c
    0       495     arch/ppc64/kernel/rtas_pci.c
     delete mode 100644 arch/ppc64/kernel/rtas_pci.c

That is:

    #1 rtas_pci.c exists in 2.6.13 but not 2.6.12.

    #2 pSeries_pci.c exists in both 2.6.12 and 2.6.13 but is majorly
       rewritten; in fact, the difference between rtas_pci.c in
       2.6.13 and pSeries_pci.c in 2.6.12 is much smaller than the
       difference between pSeries_pci.c from 2.6.13 and 2.6.12.

What seems to happen is that "diff -B" splits the above #2 into
"removal of pSeries_pci.c with contents from 2.6.13" and "creation
of pSeries_pci.c with contents from 2.6.12" and these gets further
combined with the "removal of rtas_pci.c" [*2*].  In the end result,
we incorrectly get "rename rtas_pci.c to create pSeries_pci.c with
some changes".  We shouldn't do this because pSeries_pci.c is not
created and is not a new file.

[Footnotes]

*1* "git apply" refuses to apply the output affected by the bug, so
    at least this will not lead to silent corruption.

*2* The "-B" option was introduced solely to find a possible
    rename/copy source to express this sequence:

    $ cp A B
    $ edit B slightly ;# optional
    $ edit A heavily

as if it was done like this instead:

    $ mv A B
    $ edit B slightly ;# optional
    $ create B from scratch

Without "-B", the change would be expressed as "Heavily edit A,
create B from scratch", and with "-B", we would say "Create B by
copying from A and then edit, and edit A heavily", which would
result in more readable patch (in fact, we would see in the output
of "diff -B -M v2.6.12 v2.6.13" exactly that).

             reply	other threads:[~2015-01-31 19:12 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-31 19:12 Junio C Hamano [this message]
2015-02-02  3:18 ` [RFC PATCH] diff: do not use creation-half of -B as a rename target candidate Junio C Hamano
2015-02-02  5:52   ` Stefan Beller
2015-02-02  6:47     ` Yue Lin Ho
2015-02-02 18:25   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqegqaahnh.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.