From: mhagger@alum.mit.edu
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Martin Fick <mfick@codeaurora.org>,
git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH] Avoid sorting if references are added to ref_cache in order
Date: Thu, 24 May 2012 14:16:50 +0200 [thread overview]
Message-ID: <1337861810-9366-1-git-send-email-mhagger@alum.mit.edu> (raw)
From: Michael Haggerty <mhagger@alum.mit.edu>
The old code allowed many references to be efficiently added to a
single directory, because it just appended the references to the
containing directory unsorted without doing any searching (and
therefore without requiring any intermediate sorting). But the old
code was inefficient when a large number of subdirectories were added
to a directory, because the directory always had to be searched to see
if the new subdirectory already existed, and this search required the
directory to be sorted first. The same was repeated for every new
subdirectory, so the time scaled like O(N^2), where N is the number of
subdirectories within a single directory.
In practice, references are often added to the ref_cache in
lexicographic order, for example when reading the packed-refs file.
So build some intelligence into add_entry_to_dir() to optimize for the
case of references and/or subdirectories being added in lexicographic
order: if the existing entries were already sorted, and the new entry
comes after the last existing entry, then adjust ref_dir::sorted to
reflect the fact that the ref_dir is still sorted.
Thanks to Peff for pointing out the performance regression that
inspired this change.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
This patch can be applied to either mh/ref-api or next. It should fix
the performance regression discovered by Peff [1].
[1] http://article.gmane.org/gmane.comp.version-control.git/198163
refs.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/refs.c b/refs.c
index 09322fe..98f6425 100644
--- a/refs.c
+++ b/refs.c
@@ -208,6 +208,12 @@ static void add_entry_to_dir(struct ref_dir *dir, struct ref_entry *entry)
{
ALLOC_GROW(dir->entries, dir->nr + 1, dir->alloc);
dir->entries[dir->nr++] = entry;
+ /* optimize for the case that entries are added in order */
+ if (dir->nr == 1 ||
+ (dir->nr == dir->sorted + 1 &&
+ strcmp(dir->entries[dir->nr - 2]->name,
+ dir->entries[dir->nr - 1]->name) < 0))
+ dir->sorted = dir->nr;
}
/*
--
1.7.10
next reply other threads:[~2012-05-24 12:17 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-24 12:16 mhagger [this message]
2012-05-24 17:49 ` [PATCH] Avoid sorting if references are added to ref_cache in order Jeff King
2012-05-24 21:00 ` Michael Haggerty
2012-05-24 21:10 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1337861810-9366-1-git-send-email-mhagger@alum.mit.edu \
--to=mhagger@alum.mit.edu \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mfick@codeaurora.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).