git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brian Downing <bdowning@lavos.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: Serious performance regression in diff between 1.6.6 and 1.7.0
Date: Thu, 10 Jun 2010 13:14:21 -0500	[thread overview]
Message-ID: <20100610181421.GC2635@glaurung.lavos.net> (raw)
In-Reply-To: <20100610170804.GB2635@glaurung.lavos.net>

On Thu, Jun 10, 2010 at 12:08:04PM -0500, Brian Downing wrote:
> I also ran this through callgrind to see how often the above were called:

(187,456 files)

>         Calls  Symbol
>   -----------  -------------------
>       197,958  unpack_callback
>       208,460  find_cache_pos
>    37,308,336  ce_in_traverse_path
>   156,950,469  do_compare_entry
>   156,950,469  df_name_compare

Here is an identical run (git-diff HEAD) from the Linux kernel tree
(33,307 files):

        Calls  Symbol
   -----------  -------------------
        35,332  unpack_callback
        37,357  find_cache_pos
     4,979,473  ce_in_traverse_path
     6,828,181  do_compare_entry
     6,828,181  df_name_compare

That makes it look sort of exponential (perhaps around files^1.5),
though from what I can understand of the find_cache_pos code in
unpack-trees it would depend on the exact shape of the repository.  It
does seem to linear-search over whole directory trees of the index
repeatedly, though, which would support the exponential theory.

Unfortunately I don't really understand what the code is trying to do.
Is it not the case that trees and the index are always stored sorted in
the same order?  The examples given in the commit messages that
introduced this fix would imply not, but I'm not sure how that could
come about.

-bcd

  reply	other threads:[~2010-06-10 18:14 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-10  0:10 Serious performance regression in diff between 1.6.6 and 1.7.0 Brian Downing
2010-06-10 17:08 ` Brian Downing
2010-06-10 18:14   ` Brian Downing [this message]
2010-06-10 18:42     ` Brian Downing
2010-06-11  2:59     ` [PATCH] unpack-trees: Make index lookahead less pessimal Brian Downing
2010-06-11  4:26       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100610181421.GC2635@glaurung.lavos.net \
    --to=bdowning@lavos.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).