From: Jeff King <peff@peff.net>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: [PATCH 8/8] diff: optionally use rename cache
Date: Sat, 4 Aug 2012 13:14:21 -0400
Message-ID: <20120804171421.GH19378@sigill.intra.peff.net>
In-Reply-To: <20120804170905.GA19267@sigill.intra.peff.net>

This speeds up estimate_similarity by caching the similarity
score for each pair of blob sha1s. The cache is off by default;
set diff.cacherenames to enable it.

Signed-off-by: Jeff King <peff@peff.net>
---
Some interesting things to time with this are:

  - "git log --raw -M" on a repo with a lot of paths or a lot of renames
    (I found on git.git, the speedup was not that impressive)

  - "git log --raw -C -C" on any repo (this speeds up a lot in git.git).

  - "git show -M" on commits with very large blobs
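
For anyone reading this patch without the rest of the series, here is a
minimal standalone sketch of the lookup-then-store pattern it adds to
estimate_similarity. Everything named toy_* is hypothetical and exists
only for illustration; it is not the rename_cache_get/rename_cache_set
API from earlier in the series, and the small in-memory table is an
assumption standing in for the real cache storage.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_LEN 20      /* SHA-1 size in bytes */
#define TABLE_SIZE 64    /* toy capacity; must be a power of two */

/* Hypothetical key: the two blob ids whose similarity was scored. */
struct toy_pair {
	unsigned char one[HASH_LEN];
	unsigned char two[HASH_LEN];
};

struct toy_entry {
	struct toy_pair key;
	uint32_t score;
	int used;
};

static struct toy_entry table[TABLE_SIZE];
static int expensive_calls; /* counts how often the slow path runs */

/* FNV-1a over the raw bytes of both ids. */
static uint32_t toy_hash(const struct toy_pair *p)
{
	const unsigned char *b = (const unsigned char *)p;
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < sizeof(*p); i++) {
		h ^= b[i];
		h *= 16777619u;
	}
	return h;
}

/* Linear probing: the slot holding the key, the empty slot where it
 * would go, or NULL if the table is full. */
static struct toy_entry *toy_find(const struct toy_pair *p)
{
	uint32_t i, start = toy_hash(p) & (TABLE_SIZE - 1);

	for (i = 0; i < TABLE_SIZE; i++) {
		struct toy_entry *e = &table[(start + i) & (TABLE_SIZE - 1)];
		if (!e->used || !memcmp(&e->key, p, sizeof(*p)))
			return e;
	}
	return NULL;
}

static int toy_cache_get(const struct toy_pair *p, uint32_t *score)
{
	struct toy_entry *e = toy_find(p);

	if (!e || !e->used)
		return 0;
	*score = e->score;
	return 1;
}

static void toy_cache_set(const struct toy_pair *p, uint32_t score)
{
	struct toy_entry *e = toy_find(p);

	if (!e)
		return; /* table full: skip caching */
	e->key = *p;
	e->score = score;
	e->used = 1;
}

/* Stand-in for the costly delta computation. */
static uint32_t toy_expensive_score(const struct toy_pair *p)
{
	expensive_calls++;
	return (uint32_t)(p->one[0] + p->two[0]);
}

/* Same shape as the patched function: consult the cache before the
 * slow path, and store the result after it. */
static uint32_t toy_similarity(const unsigned char *a,
			       const unsigned char *b, int use_cache)
{
	struct toy_pair pair;
	uint32_t score;

	memcpy(pair.one, a, HASH_LEN);
	memcpy(pair.two, b, HASH_LEN);
	if (use_cache && toy_cache_get(&pair, &score))
		return score;
	score = toy_expensive_score(&pair);
	if (use_cache)
		toy_cache_set(&pair, score);
	return score;
}
```

Calling toy_similarity twice on the same pair with use_cache set runs
toy_expensive_score only once; with use_cache clear, every call takes
the slow path, which mirrors leaving diff.cacherenames unset.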

 cache.h           |  1 +
 diff.c            |  6 ++++++
 diffcore-rename.c | 11 ++++++++++-
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 23a2f93..7ee1caf 100644
--- a/cache.h
+++ b/cache.h
@@ -1228,6 +1228,7 @@ int add_files_to_cache(const char *prefix, const char **pathspec, int flags);
 
 /* diff.c */
 extern int diff_auto_refresh_index;
+extern int diff_cache_renames;
 
 /* match-trees.c */
 void shift_tree(const unsigned char *, const unsigned char *, unsigned char *, int);
diff --git a/diff.c b/diff.c
index 95706a5..c84e043 100644
--- a/diff.c
+++ b/diff.c
@@ -34,6 +34,7 @@ static int diff_no_prefix;
 static int diff_stat_graph_width;
 static int diff_dirstat_permille_default = 30;
 static struct diff_options default_diff_options;
+int diff_cache_renames;
 
 static char diff_colors[][COLOR_MAXLEN] = {
 	GIT_COLOR_RESET,
@@ -214,6 +215,11 @@ int git_diff_basic_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "diff.cacherenames")) {
+		diff_cache_renames = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!prefixcmp(var, "submodule."))
 		return parse_submodule_config_option(var, value);
 
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 216a7a4..611e1d3 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -6,6 +6,7 @@
 #include "diffcore.h"
 #include "hash.h"
 #include "progress.h"
+#include "metadata-cache.h"
 
 /* Table of rename/copy destinations */
 
@@ -137,7 +138,8 @@ static int estimate_similarity(struct diff_filespec *src,
 	 */
 	unsigned long max_size, delta_size, base_size, src_copied, literal_added;
 	unsigned long delta_limit;
-	int score;
+	uint32_t score;
+	struct sha1pair pair;
 
 	/* We deal only with regular files.  Symlink renames are handled
 	 * only when they are exact matches --- in other words, no edits
@@ -175,6 +177,11 @@ static int estimate_similarity(struct diff_filespec *src,
 	if (max_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
 		return 0;
 
+	hashcpy(pair.one, src->sha1);
+	hashcpy(pair.two, dst->sha1);
+	if (diff_cache_renames && rename_cache_get(&pair, &score))
+		return score;
+
 	if (!src->cnt_data && diff_populate_filespec(src, 0))
 		return 0;
 	if (!dst->cnt_data && diff_populate_filespec(dst, 0))
@@ -195,6 +202,8 @@ static int estimate_similarity(struct diff_filespec *src,
 		score = 0; /* should not happen */
 	else
 		score = (int)(src_copied * MAX_SCORE / max_size);
+	if (diff_cache_renames)
+		rename_cache_set(&pair, score);
 	return score;
 }
 
-- 
1.7.12.rc1.7.g7a223a6
