All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Davide Libenzi <davidel@xmailserver.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Abhijit Menon-Sen <ams@toroid.org>,
	Pierre Habouzit <madcoder@debian.org>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: absurdly slow git-diff
Date: Fri, 07 Nov 2008 21:30:52 -0800	[thread overview]
Message-ID: <7v7i7eeqcz.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <alpine.DEB.1.10.0811071547080.8736@alien.or.mcafeemobile.com> (Davide Libenzi's message of "Fri, 7 Nov 2008 15:48:34 -0800 (PST)")

Davide Libenzi <davidel@xmailserver.org> writes:

> Yeah, similar. Mine is below. There's one less branch in the for loops.

Thanks, will apply like this, but I am not sure if you meant windowN or
just window...

-- >8 --
From: Davide Libenzi <davidel@xmailserver.org>
Date: Fri, 7 Nov 2008 21:24:33 -0800
Subject: [PATCH] xdiff: give up scanning similar lines early

In a corner case of large files whose lines do not match uniquely, the
loop to eliminate a line that matches multiple locations adjacent to a run
of lines that do not uniquely match wasted too much cycles.  Fix this by
giving up early after scanning 100 lines in both direction.
---
 xdiff/xprepare.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e87ab57..6a70cdf 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -23,10 +23,9 @@
 #include "xinclude.h"
 
 
-
 #define XDL_KPDIS_RUN 4
 #define XDL_MAX_EQLIMIT 1024
-
+#define XDL_SIMSCAN_WINDOWN 100
 
 
 typedef struct s_xdlclass {
@@ -313,6 +312,18 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
 	long r, rdis0, rpdis0, rdis1, rpdis1;
 
 	/*
+	 * Limits the window the is examined during the similar-lines
+	 * scan. The loops below stops when dis[i - r] == 1 (line that
+	 * has no match), but there are corner cases where the loop
+	 * proceed all the way to the extremities by causing huge
+	 * performance penalties in case of big files.
+	 */
+	if (i - s > XDL_SIMSCAN_WINDOWN)
+		s = i - XDL_SIMSCAN_WINDOWN;
+	if (e - i > XDL_SIMSCAN_WINDOWN)
+		e = i + XDL_SIMSCAN_WINDOWN;
+
+	/*
 	 * Scans the lines before 'i' to find a run of lines that either
 	 * have no match (dis[j] == 0) or have multiple matches (dis[j] > 1).
 	 * Note that we always call this function with dis[i] > 1, so the
-- 
1.6.0.3.674.gdf99f

  parent reply	other threads:[~2008-11-08  5:32 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-07 20:01 absurdly slow git-diff Abhijit Menon-Sen
2008-11-07 21:28 ` Mike Hommey
2008-11-07 21:37 ` Linus Torvalds
2008-11-07 23:04   ` Davide Libenzi
2008-11-07 23:18     ` Davide Libenzi
2008-11-07 23:42       ` Linus Torvalds
2008-11-07 23:48         ` Davide Libenzi
2008-11-07 23:57           ` Linus Torvalds
2008-11-08  4:57             ` Abhijit Menon-Sen
2008-11-08 21:02             ` Junio C Hamano
2008-11-08  5:30           ` Junio C Hamano [this message]
2008-11-08 16:27             ` Davide Libenzi
2008-11-08  0:14   ` Pierre Habouzit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7v7i7eeqcz.fsf@gitster.siamese.dyndns.org \
    --to=gitster@pobox.com \
    --cc=ams@toroid.org \
    --cc=davidel@xmailserver.org \
    --cc=git@vger.kernel.org \
    --cc=madcoder@debian.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.