From: Junio C Hamano <gitster@pobox.com>
To: Davide Libenzi <davidel@xmailserver.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Abhijit Menon-Sen <ams@toroid.org>,
Pierre Habouzit <madcoder@debian.org>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: absurdly slow git-diff
Date: Fri, 07 Nov 2008 21:30:52 -0800 [thread overview]
Message-ID: <7v7i7eeqcz.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <alpine.DEB.1.10.0811071547080.8736@alien.or.mcafeemobile.com> (Davide Libenzi's message of "Fri, 7 Nov 2008 15:48:34 -0800 (PST)")
Davide Libenzi <davidel@xmailserver.org> writes:
> Yeah, similar. Mine is below. There's one less branch in the for loops.
Thanks, will apply like this, but I am not sure if you meant windowN or
just window...
-- >8 --
From: Davide Libenzi <davidel@xmailserver.org>
Date: Fri, 7 Nov 2008 21:24:33 -0800
Subject: [PATCH] xdiff: give up scanning similar lines early
In a corner case of large files whose lines do not match uniquely, the
loop to eliminate a line that matches multiple locations adjacent to a run
of lines that do not uniquely match wasted too much cycles. Fix this by
giving up early after scanning 100 lines in both direction.
---
xdiff/xprepare.c | 15 +++++++++++++--
1 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c
index e87ab57..6a70cdf 100644
--- a/xdiff/xprepare.c
+++ b/xdiff/xprepare.c
@@ -23,10 +23,9 @@
#include "xinclude.h"
-
#define XDL_KPDIS_RUN 4
#define XDL_MAX_EQLIMIT 1024
-
+#define XDL_SIMSCAN_WINDOWN 100
typedef struct s_xdlclass {
@@ -313,6 +312,18 @@ static int xdl_clean_mmatch(char const *dis, long i, long s, long e) {
long r, rdis0, rpdis0, rdis1, rpdis1;
/*
+ * Limits the window the is examined during the similar-lines
+ * scan. The loops below stops when dis[i - r] == 1 (line that
+ * has no match), but there are corner cases where the loop
+ * proceed all the way to the extremities by causing huge
+ * performance penalties in case of big files.
+ */
+ if (i - s > XDL_SIMSCAN_WINDOWN)
+ s = i - XDL_SIMSCAN_WINDOWN;
+ if (e - i > XDL_SIMSCAN_WINDOWN)
+ e = i + XDL_SIMSCAN_WINDOWN;
+
+ /*
* Scans the lines before 'i' to find a run of lines that either
* have no match (dis[j] == 0) or have multiple matches (dis[j] > 1).
* Note that we always call this function with dis[i] > 1, so the
--
1.6.0.3.674.gdf99f
next prev parent reply other threads:[~2008-11-08 5:32 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-11-07 20:01 absurdly slow git-diff Abhijit Menon-Sen
2008-11-07 21:28 ` Mike Hommey
2008-11-07 21:37 ` Linus Torvalds
2008-11-07 23:04 ` Davide Libenzi
2008-11-07 23:18 ` Davide Libenzi
2008-11-07 23:42 ` Linus Torvalds
2008-11-07 23:48 ` Davide Libenzi
2008-11-07 23:57 ` Linus Torvalds
2008-11-08 4:57 ` Abhijit Menon-Sen
2008-11-08 21:02 ` Junio C Hamano
2008-11-08 5:30 ` Junio C Hamano [this message]
2008-11-08 16:27 ` Davide Libenzi
2008-11-08 0:14 ` Pierre Habouzit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7v7i7eeqcz.fsf@gitster.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=ams@toroid.org \
--cc=davidel@xmailserver.org \
--cc=git@vger.kernel.org \
--cc=madcoder@debian.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.