All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: git diff -B: splitting up complete renames
Date: Sat, 05 Jan 2008 11:18:20 -0800	[thread overview]
Message-ID: <7vprwg84jn.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: 200801051818.40009.jnareb@gmail.com

Jakub Narebski <jnareb@gmail.com> writes:

> ...
>   and if it detects that the file "file0" is completely rewritten,
>   it changes it to:
>
>   ------------------------------------------------
>   :100644 000000 bcd1234... 0000000... D file0
>   :000000 100644 0000000... 0123456... A file0
>   ------------------------------------------------
>
> Shouldn't the last block read for the modern git read:
>
>   ------------------------------------------------
>   :100644 000000 bcd1234... 0123456... M99 file0
>   ------------------------------------------------

Correct.  The description is based on an earlier round of the
code before 366175ef8c3b1e145f4ba846e63a1dea3ec3cacc (Rework -B
output).  "git show" of that patch has description in its
changes to Documentation/diffcore.txt.

The original motivation of -B was not about helping three way
renames it allows you when used with -M/-C.  It was about
showing textual diff more readably for quite dissimilar
filepairs.

> Third: Do "git diff --no-index" (filesystem diff) can show breaking / 
> use dissimilarity? I couldn't make it work...

It doesn't, for two reasons.

 * Look at diffcore_break().  It breaks filepairs with identical
   paths and nothing else.

   The reason for this check is that the caller can call
   diffcore_rename() _before_ diffcore_break() [*1*] and a
   filepair that was detected as renamed pair ought to be
   already found similar enough.  These days, I think you can
   replace the check with p->renamed_pair.

 * Look at diffcore_merge_broken(), which is responsible for
   re-merging a broken pair.  Because we match them by looking
   at their names (iow, we do not use pointers to link halves of
   a broken pair pointing at each other).

   This was correct before "no-index" stuff because "git diff"
   was about comparing pairs of blobs found at corresponding
   paths in two tree-like things (i.e. "a/one" and "b/one"
   corresponds to each other when you do "git diff -- one").
   The modification to introduce no-index forgot to update this
   logic.

If you wanted to fix this, you can change p->broken_pair from a
boolean to a pointer that points at the other half of the broken
pair (and record that when we break a pair in diffcore_break()),
and look at that pointer to decide which one to match up with
which other one in diffcore_merge_broken().  Together with a
change to look at p->renamed_pair instead of paths I mentioned,
I think it would work more like the regular git diff for
"no-index" case.

I consider that "no-index" frontend more or less a bolted on
half-baked hack that covers only minimally necessary cases to
serve as non-git "diff" replacement, without sharing enough with
the real "git diff" internals; I would not be surprised at all
if there are more corner cases like this that does not work.
But I do not think it is fundamentally unfixable.  The change to
add "no-index" support just needed to be more careful, and it is
not too late to make it so.


[Footnote]

*1* diffcore was designed as a generic library to allow
experimenting more combinations of transformations than what
then-current git used.  All of our existing callers ended up
calling the transformations in the same order (i.e. the order
diffcore_std() calls them), but individual transformations were
written not to assume too much about the order they would be
called.

      parent reply	other threads:[~2008-01-05 19:19 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-01-05 17:18 git diff -B: splitting up complete renames Jakub Narebski
2008-01-05 17:37 ` Linus Torvalds
2008-01-05 19:18 ` Junio C Hamano [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vprwg84jn.fsf@gitster.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.