From: SungHyun Nam <goweol@gmail.com>
To: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
Cc: git@vger.kernel.org
Subject: Re: git diff too slow for a file
Date: Mon, 19 Apr 2010 09:43:17 +0900
Message-ID: <4BCBA725.2020307@gmail.com>
In-Reply-To: <4BC9D928.50909@lsrfire.ath.cx>

René Scharfe wrote:
> On 29.03.2010 03:42, SungHyun Nam wrote:
>> Hello,
>>
>> If I run the attached script on the bunzipped attached files, I get:
>> (To reduce the size, I removed many lines and bzipped the files.)
>>
>>      $ ./mk.sh
>>      time diff -u x3 x4 >/dev/null 2>&1
>>
>>      real    0m0.011s
>>      user    0m0.000s
>>      sys    0m0.010s
>>
>>      time git diff >/dev/null 2>&1
>>
>>      real    0m0.193s
>>      user    0m0.190s
>>      sys    0m0.000s
>>
>>      $ git version
>>      git version 1.7.0.2.273.gc2413
>>
>>      $ diff --version
>>      diff (GNU diffutils) 2.8.1
>>      ...
>>
>> Although the files are ASCII, they contain random hexadecimal
>> data, so I am not interested in the diff output at all.  The
>> real problem is that rebasing takes very long when such a file
>> has changed.  The git tree includes several such files, and if
>> they have changed, a rebase takes some minutes for every branch,
>> even though each branch only changes a few lines of a C source
>> file.  I have now been waiting an hour for all the branches to
>> finish rebasing, and the rebasing script is still running... :-(
>
> I can reproduce it; I concatenated your example files five times to get
> meaningful timings (x1 = five times x3, x2 = five times x4).
>
> The difference between GNU diff and git diff is that the latter is trying
> hard to minimize the size of the diff.  Each user of the xdiff library in
> git turns on the XDF_NEED_MINIMAL flag, which makes it very expensive
> (specifically the function xdl_split()).
>
> The following patch is not meant for inclusion, but rather to start a
> discussion.  Is XDF_NEED_MINIMAL a good default to have?
>
> The patch removes XDF_NEED_MINIMAL and replaces it with XDF_QUICK, with
> reversed meaning.  XDF_QUICK is only set if the new option --quick is
> given, so without it the old behaviour is retained.  Some numbers:
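The "reversed meaning" described in the quoted text can be illustrated
with a small sketch.  This is not the patch itself and not git's code:
the bit values and the helper names (old_wants_minimal,
new_wants_minimal, flags_for_cli) are assumptions made up for
illustration.  Only the flag names XDF_NEED_MINIMAL and XDF_QUICK and
the --quick option come from the message above.

```python
# Hypothetical bit values, chosen only for illustration; they are not
# taken from git's xdiff headers.
XDF_NEED_MINIMAL = 1 << 1  # old flag: set => search for a minimal diff
XDF_QUICK = 1 << 1         # proposed flag: set => skip the expensive search


def old_wants_minimal(flags: int) -> bool:
    """Old scheme: a minimal diff is computed only when the flag is set
    (and, per the message, every caller in git sets it)."""
    return bool(flags & XDF_NEED_MINIMAL)


def new_wants_minimal(flags: int) -> bool:
    """Proposed scheme: a minimal diff is computed unless XDF_QUICK is set."""
    return not (flags & XDF_QUICK)


def flags_for_cli(quick: bool) -> int:
    """Hypothetical helper: map the presence of --quick to xdiff flags."""
    return XDF_QUICK if quick else 0


# Without --quick the old (minimal) behaviour is retained.
assert new_wants_minimal(flags_for_cli(quick=False))
# With --quick the expensive minimal search is skipped.
assert not new_wants_minimal(flags_for_cli(quick=True))
```

With this inversion, callers that pass no flags keep the minimal
behaviour, so the default output stays the same.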

The patch is great for me.  Thanks!

I added 'time git diff --quick' to mk.sh and ran it with the
original file (about 180000 lines):

     $ ./mk.sh
     time diff -u x3 x4 >/dev/null 2>&1

     real	0m0.794s
     user	0m0.720s
     sys	0m0.010s

     time git diff >/dev/null 2>&1

     real	0m44.687s
     user	0m44.670s
     sys	0m0.020s

     time git diff --quick >/dev/null 2>&1

     real	0m1.853s
     user	0m1.840s
     sys	0m0.010s
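
To put the three "real" times side by side, here is a small sketch
that computes the ratios.  The numbers are copied from the run above;
the variable and function names are only illustrative.

```python
# Wall-clock ("real") times in seconds, copied from the run above.
timings = {
    "diff -u x3 x4": 0.794,
    "git diff": 44.687,
    "git diff --quick": 1.853,
}


def ratio(slow: str, fast: str) -> float:
    """Return how many times longer `slow` took than `fast`."""
    return timings[slow] / timings[fast]


speedup_quick = ratio("git diff", "git diff --quick")
gap_to_gnu = ratio("git diff --quick", "diff -u x3 x4")

print(f"git diff vs. git diff --quick: {speedup_quick:.1f}x")
print(f"git diff --quick vs. GNU diff: {gap_to_gnu:.1f}x")
```

This is a single run on one machine, so the ratios are only a rough
indication.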

Thanks!
namsh
