git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Yann Dirson <ydirson@altern.org>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/3] Introduce rename factorization in diffcore.
Date: Thu, 06 Nov 2008 17:10:09 -0800	[thread overview]
Message-ID: <7vy6zwgx3i.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <20081101220319.1116.50509.stgit@gandelf.nowhere.earth> (Yann Dirson's message of "Sat, 01 Nov 2008 23:03:20 +0100")

Yann Dirson <ydirson@altern.org> writes:

> Rename factorization tries to group together files moving from and to
> identical directories - the most common case being directory renames.
> We do that by first identifying groups of bulk-moved files, and then
> hiding those of the individual renames which carry no other
> information (further name change, or content changes).
> This feature is activated by the new --factorize-renames diffcore
> flag.

I have a mixed feeling about this one, primarily because I cannot
visualize how a useful output should look like.  Unless you rename one
directory to another without any content changes, you would have to say
"this directory changed to that, and among the paths underneath them, this
file have this content change in addition".

A related feature that would benefit from something like your change
without any downside/complication of output format issues is to boost
rename similarity score of a path when its neighbouring paths are moved to
the same location.  E.g. when you see:

 - three files a/{1,2,3} deleted;
 - three files b/{1,2,3} created;
 - (a/1 => b/1) and (a/2 => b/2) are similar enough;
 - (a/3 => b/3) are not similar enough.

we currently detect only two renames and leave deletion of a/3 and
creation of b/3 unpaired.  You should be able to help them paired up by
noticing that the entire a/* goes away (for that, reading the full
postimage like you do in your patch helps) and boost the similarity score
between these two.

Although I do not offhand think a good format to show the information you
are trying to capture in the textual diff output, one thing that would be
helped by the grouping of renames like you do would be process_renames()
in merge_recursive.c.  This is especially so when you have added a new
path in a directory that has been moved by the other branch you are
merging.  For this usage, there is no "textual output format" issues.  It
does not even have to be expressed by replacing individual entries from
diffq with entries that represent a whole subtree --- you could for
example keep what diffq.queue records intact, and add a separate list of
directory renames as a hint for users like process_renames() to use.

  reply	other threads:[~2008-11-07  1:11 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-01 22:03 [PATCH v3 0/3] Detection of directory renames Yann Dirson
2008-11-01 22:03 ` [PATCH 1/3] Fix error in diff_filepair::status documentation Yann Dirson
2008-11-02  6:31   ` Junio C Hamano
2008-11-01 22:03 ` [PATCH 2/3] Introduce rename factorization in diffcore Yann Dirson
2008-11-07  1:10   ` Junio C Hamano [this message]
2008-11-07 19:39     ` Yann Dirson
2008-11-07 20:19       ` Junio C Hamano
2008-11-07 20:39         ` Yann Dirson
2008-11-07 21:11           ` Junio C Hamano
2008-11-07 22:12             ` Yann Dirson
2008-11-07 23:43               ` Junio C Hamano
2008-11-08  0:29                 ` Yann Dirson
2008-11-08  0:47                   ` Junio C Hamano
2008-11-08  0:50                     ` Yann Dirson
2008-11-01 22:03 ` [PATCH 3/3] Add testcases for the --factorize-renames diffcore flag Yann Dirson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vy6zwgx3i.fsf@gitster.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=ydirson@altern.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).