git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: git diff -B: splitting up complete renames
Date: Sat, 05 Jan 2008 11:18:20 -0800	[thread overview]
Message-ID: <7vprwg84jn.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: 200801051818.40009.jnareb@gmail.com

Jakub Narebski <jnareb@gmail.com> writes:

> ...
>   and if it detects that the file "file0" is completely rewritten,
>   it changes it to:
>
>   ------------------------------------------------
>   :100644 000000 bcd1234... 0000000... D file0
>   :000000 100644 0000000... 0123456... A file0
>   ------------------------------------------------
>
> Shouldn't the last block read for the modern git read:
>
>   ------------------------------------------------
>   :100644 000000 bcd1234... 0123456... M99 file0
>   ------------------------------------------------

Correct.  The description is based on an earlier round of the
code before 366175ef8c3b1e145f4ba846e63a1dea3ec3cacc (Rework -B
output).  "git show" of that patch has description in its
changes to Documentation/diffcore.txt.

The original motivation of -B was not about helping three way
renames it allows you when used with -M/-C.  It was about
showing textual diff more readably for quite dissimilar
filepairs.

> Third: Do "git diff --no-index" (filesystem diff) can show breaking / 
> use dissimilarity? I couldn't make it work...

It doesn't, for two reasons.

 * Look at diffcore_break().  It breaks filepairs with identical
   paths and nothing else.

   The reason for this check is that the caller can call
   diffcore_rename() _before_ diffcore_break() [*1*] and a
   filepair that was detected as renamed pair ought to be
   already found similar enough.  These days, I think you can
   replace the check with p->renamed_pair.

 * Look at diffcore_merge_broken(), which is responsible for
   re-merging a broken pair.  Because we match them by looking
   at their names (iow, we do not use pointers to link halves of
   a broken pair pointing at each other).

   This was correct before "no-index" stuff because "git diff"
   was about comparing pairs of blobs found at corresponding
   paths in two tree-like things (i.e. "a/one" and "b/one"
   corresponds to each other when you do "git diff -- one").
   The modification to introduce no-index forgot to update this
   logic.

If you wanted to fix this, you can change p->broken_pair from a
boolean to a pointer that points at the other half of the broken
pair (and record that when we break a pair in diffcore_break()),
and look at that pointer to decide which one to match up with
which other one in diffcore_merge_broken().  Together with a
change to look at p->renamed_pair instead of paths I mentioned,
I think it would work more like the regular git diff for
"no-index" case.

I consider that "no-index" frontend more or less a bolted on
half-baked hack that covers only minimally necessary cases to
serve as non-git "diff" replacement, without sharing enough with
the real "git diff" internals; I would not be surprised at all
if there are more corner cases like this that does not work.
But I do not think it is fundamentally unfixable.  The change to
add "no-index" support just needed to be more careful, and it is
not too late to make it so.


[Footnote]

*1* diffcore was designed as a generic library to allow
experimenting more combinations of transformations than what
then-current git used.  All of our existing callers ended up
calling the transformations in the same order (i.e. the order
diffcore_std() calls them), but individual transformations were
written not to assume too much about the order they would be
called.

      parent reply	other threads:[~2008-01-05 19:19 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-05 17:18 git diff -B: splitting up complete renames Jakub Narebski
2008-01-05 17:37 ` Linus Torvalds
2008-01-05 19:18 ` Junio C Hamano [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vprwg84jn.fsf@gitster.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).