From: Steven Grimm <koreth@midwinter.com>
To: 'git' <git@vger.kernel.org>
Subject: [PATCH] Ignore end-of-line style when computing similarity score for rename detection
Date: Wed, 27 Jun 2007 19:39:44 -0700
Message-ID: <46831F70.2060403@midwinter.com>

Signed-off-by: Steven Grimm <koreth@midwinter.com>
---
A couple of source files got checked into my code base with DOS-style 
end-of-line characters. I converted them to UNIX-style (the convention 
for this project) in my branch. Then later, I renamed a couple of them.

Meanwhile, back in the original branch, someone else fixed a bug in one 
of the files and checked it in, still with DOS-style line endings.

When I merged that change into my branch, git didn't detect the rename: 
every line differed (in its end-of-line character), which dropped the 
similarity score below the rename threshold.

This patch teaches git to ignore end-of-line style when looking for 
potential rename candidates. A separate question, which I expect may be 
more controversial, is what to do with conflict markers: with this 
patch, the entire file is still marked as in conflict if the end-of-line 
style changes. It's still an improvement, since we at least detect the 
rename now.
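
To illustrate the effect outside of git, here is a minimal standalone 
sketch of the chunk-hashing loop touched by this patch. It is not the 
code from diffcore-delta.c: the helper names (chunk_hashes, 
matching_chunks), the skip_cr flag, the HASHBASE value, and the 
position-by-position comparison are assumptions made for the example. 
git itself collects the hashes in a table rather than comparing them 
pairwise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed modulus for this sketch; the patch only refers to HASHBASE. */
#define HASHBASE 107927u

/*
 * Hypothetical helper: compute one hash per chunk, where a chunk ends
 * at '\n' or after 64 bytes, following the loop shown in the hunk.
 * When skip_cr is nonzero, '\r' bytes are left out of the hash but
 * still count toward the 64-byte limit, as in the patch.
 * Trailing bytes without a final '\n' (and fewer than 64 of them)
 * produce no hash, which also matches the loop being illustrated.
 */
static size_t chunk_hashes(const unsigned char *buf, size_t sz, int skip_cr,
                           unsigned int *out, size_t max_out)
{
    size_t count = 0;
    unsigned int n = 0, accum1 = 0, accum2 = 0;

    assert(buf != NULL && out != NULL);
    while (sz) {
        unsigned int c = *buf++;
        unsigned int old_1 = accum1;
        sz--;
        if (!skip_cr || c != '\r') {
            accum1 = (accum1 << 7) ^ (accum2 >> 25);
            accum2 = (accum2 << 7) ^ (old_1 >> 25);
            accum1 += c;
        }
        if (++n < 64 && c != '\n')
            continue;
        if (count < max_out)
            out[count++] = (accum1 + accum2 * 0x61) % HASHBASE;
        n = 0;
        accum1 = accum2 = 0;
    }
    return count;
}

/*
 * Hypothetical helper: count chunks whose hashes agree at the same
 * position in both inputs. This is a simplification for illustration
 * and handles at most 16 chunks per input.
 */
static size_t matching_chunks(const char *a, const char *b, int skip_cr)
{
    unsigned int ha[16], hb[16];
    size_t na = chunk_hashes((const unsigned char *)a, strlen(a), skip_cr, ha, 16);
    size_t nb = chunk_hashes((const unsigned char *)b, strlen(b), skip_cr, hb, 16);
    size_t i, same = 0;

    for (i = 0; i < na && i < nb; i++)
        if (ha[i] == hb[i])
            same++;
    return same;
}
```

With the inputs "a\nb\n" and "a\r\nb\r\n", matching_chunks() finds 0 
matching chunks when skip_cr is 0 and 2 when skip_cr is 1, which is the 
difference between a missed rename and a detected one in the scenario 
described above.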


 diffcore-delta.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/diffcore-delta.c b/diffcore-delta.c
index 7338a40..10bbf95 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -143,9 +143,12 @@ static struct spanhash_top *hash_chars(unsigned char *buf, unsigned int sz)
                unsigned int c = *buf++;
                unsigned int old_1 = accum1;
                sz--;
-               accum1 = (accum1 << 7) ^ (accum2 >> 25);
-               accum2 = (accum2 << 7) ^ (old_1 >> 25);
-               accum1 += c;
+               /* Ignore \r\n vs. \n when computing similarity. */
+               if (c != '\r') {
+                       accum1 = (accum1 << 7) ^ (accum2 >> 25);
+                       accum2 = (accum2 << 7) ^ (old_1 >> 25);
+                       accum1 += c;
+               }
                if (++n < 64 && c != '\n')
                        continue;
                hashval = (accum1 + accum2 * 0x61) % HASHBASE;
-- 
1.5.2.2.571.ge134
