From: Steven Grimm <koreth@midwinter.com>
To: git@vger.kernel.org
Subject: [PATCH] Ignore end-of-line style when computing similarity score for rename detection
Date: Wed, 27 Jun 2007 19:46:03 -0700
Message-ID: <20070628024603.GA1534@midwinter.com>
In-Reply-To: <46831F70.2060403@midwinter.com>
Signed-off-by: Steven Grimm <koreth@midwinter.com>
---
Okay, let's try this again with an MUA that won't change my tabs to
spaces -- sorry about that.

A couple of source files got checked into my code base with DOS-style
end-of-line characters. I converted them to UNIX-style (the convention
for this project) in my branch. Then later, I renamed a couple of them.
Meanwhile, back in the original branch, someone else fixed a bug in one
of the files and checked it in, still with DOS-style line endings.

When I merged that change into my branch, git didn't detect the rename:
every line differed in its end-of-line character, which dropped the
similarity score far too low.

This patch teaches git to ignore end-of-line style when looking for
potential rename candidates. A separate question, which I expect may be
more controversial, is what to do with conflict markers. With this
patch, the entire file is still marked as in conflict if the end-of-line
style changes, but it's still an improvement in that we at least detect
the rename now.
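
To make the effect of the change concrete, here is a minimal standalone
sketch of the chunk-hashing loop with the '\r' skip made optional. It is
not the code in diffcore-delta.c: the names chunk_hashes,
same_chunk_hashes and MAX_CHUNKS and the skip_cr flag are made up for
illustration, the HASHBASE value is an assumption, and a 32-bit unsigned
int is assumed. It records the per-chunk hash values in an array instead
of feeding a spanhash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASHBASE 107927u   /* assumed value, for illustration only */
#define MAX_CHUNKS 64      /* hypothetical cap on recorded chunks */

/*
 * Split buf into chunks that end at '\n' or after 64 bytes and store
 * one hash per chunk in out[]. When skip_cr is nonzero, '\r' bytes are
 * left out of the hash, mirroring the check the patch adds.
 * Returns the number of hashes stored (at most max).
 */
size_t chunk_hashes(const unsigned char *buf, size_t sz, int skip_cr,
                    unsigned int *out, size_t max)
{
    unsigned int accum1 = 0, accum2 = 0;
    size_t n = 0, count = 0;

    assert(buf != NULL && out != NULL);
    while (sz) {
        unsigned int c = *buf++;
        unsigned int old_1 = accum1;
        sz--;
        if (!skip_cr || c != '\r') {
            accum1 = (accum1 << 7) ^ (accum2 >> 25);
            accum2 = (accum2 << 7) ^ (old_1 >> 25);
            accum1 += c;
        }
        /* Skipped bytes still count toward the 64-byte chunk limit. */
        if (++n < 64 && c != '\n')
            continue;
        if (count < max)
            out[count++] = (accum1 + accum2 * 0x61) % HASHBASE;
        n = 0;
        accum1 = accum2 = 0;
    }
    /* Flush a trailing chunk that did not end in '\n'. */
    if (n > 0 && count < max)
        out[count++] = (accum1 + accum2 * 0x61) % HASHBASE;
    return count;
}

/* Return 1 if a and b yield the same sequence of chunk hashes. */
int same_chunk_hashes(const char *a, const char *b, int skip_cr)
{
    unsigned int ha[MAX_CHUNKS], hb[MAX_CHUNKS];
    size_t na = chunk_hashes((const unsigned char *)a, strlen(a),
                             skip_cr, ha, MAX_CHUNKS);
    size_t nb = chunk_hashes((const unsigned char *)b, strlen(b),
                             skip_cr, hb, MAX_CHUNKS);

    return na == nb && memcmp(ha, hb, na * sizeof(*ha)) == 0;
}
```

With skip_cr set, same_chunk_hashes("a\nb\n", "a\r\nb\r\n", 1) returns 1.
With it clear, the same call returns 0, because every chunk picks up the
extra '\r'. Since skipped bytes still count toward the 64-byte limit, a
line whose '\r' lands exactly on that limit is still split differently
in the two styles.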
 diffcore-delta.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/diffcore-delta.c b/diffcore-delta.c
index 7338a40..10bbf95 100644
--- a/diffcore-delta.c
+++ b/diffcore-delta.c
@@ -143,9 +143,12 @@ static struct spanhash_top *hash_chars(unsigned char *buf, unsigned int sz)
 		unsigned int c = *buf++;
 		unsigned int old_1 = accum1;
 		sz--;
-		accum1 = (accum1 << 7) ^ (accum2 >> 25);
-		accum2 = (accum2 << 7) ^ (old_1 >> 25);
-		accum1 += c;
+		/* Ignore \r\n vs. \n when computing similarity. */
+		if (c != '\r') {
+			accum1 = (accum1 << 7) ^ (accum2 >> 25);
+			accum2 = (accum2 << 7) ^ (old_1 >> 25);
+			accum1 += c;
+		}
 		if (++n < 64 && c != '\n')
 			continue;
 		hashval = (accum1 + accum2 * 0x61) % HASHBASE;
--
1.5.2.2.571.ge134
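
The comment in the hunk talks about \r\n vs. \n, but the check as posted
drops every '\r' byte, including a lone one in the middle of a line. A
stricter variant could skip a '\r' only when the next byte is '\n'. The
sketch below hashes a single chunk that way. It is only an illustration
under assumptions, not part of the patch: chunk_hash_strict is a
made-up name, the HASHBASE value is assumed, and a 32-bit unsigned int
is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define HASHBASE 107927u   /* assumed value, for illustration only */

/*
 * Hash one chunk of sz bytes, leaving a '\r' out of the hash only when
 * it is immediately followed by '\n'. A '\r' anywhere else is hashed
 * like any other byte.
 */
unsigned int chunk_hash_strict(const unsigned char *buf, size_t sz)
{
    unsigned int accum1 = 0, accum2 = 0;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < sz; i++) {
        unsigned int c = buf[i];
        unsigned int old_1 = accum1;

        if (c == '\r' && i + 1 < sz && buf[i + 1] == '\n')
            continue;   /* CR of a CRLF pair: ignore it */
        accum1 = (accum1 << 7) ^ (accum2 >> 25);
        accum2 = (accum2 << 7) ^ (old_1 >> 25);
        accum1 += c;
    }
    return (accum1 + accum2 * 0x61) % HASHBASE;
}
```

Here "a\r\n" and "a\n" hash to the same value, while "a\rb\n" and "ab\n"
do not. With the check as posted, the latter pair would hash the same as
well.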