From: David Michael Barr <davidbarr@google.com>
To: Git Mailing List <git@vger.kernel.org>
Cc: Julian Phillips <julian@quantumfyre.co.uk>,
Martin Fick <mfick@codeaurora.org>,
Junio C Hamano <gitster@pobox.com>,
David Barr <davidbarr@google.com>,
"Shawn O. Pearce" <spearce@spearce.org>
Subject: Re: [PATCH] refs.c: Fix slowness with numerous loose refs
Date: Tue, 27 Sep 2011 12:04:43 +1000
Message-ID: <CAFfmPPMx9_nRE2Zfg2g0hwzybWDPJARc6LCHbSK8y-uZWQCZqQ@mail.gmail.com>
In-Reply-To: <1317085283-33943-1-git-send-email-davidbarr@google.com>
+cc Shawn O. Pearce
I used the following to generate a test repo shaped like
a gerrit mirror with unpacked refs (10k, because life is too short for
100k tests):
mkdir test.git && cd test.git
git init
touch empty
git add empty
git commit -m 'empty'
REV=`git rev-parse HEAD`
for ((d=0;d<100;++d)); do
  for ((n=0;n<100;++n)); do
    let r=n*100+d
    mkdir -p .git/refs/changes/$d/$r
    echo $REV > .git/refs/changes/$d/$r/1
  done
done
time git branch xyz
With warm caches...
Git 1.7.6.4:
real 0m8.232s
user 0m7.842s
sys 0m0.385s
Git 1.7.6.4, with patch below:
real 0m0.394s
user 0m0.069s
sys 0m0.324s
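The size of the gap is easier to see from how often the sort runs. In the unpatched get_ref_dir() quoted below, every recursive call ends in sort_ref_list(list), so the list accumulated so far is re-sorted once per directory visited. The following is only a rough model, not code from refs.c: it counts those calls for a tree shaped like refs/changes in the test repo above. The names sort_calls, fake_sort() and the count_sorts_* helpers are hypothetical, and depth and fanout are assumed parameters standing in for the real directory walk.

```c
#include <stddef.h>

/* Hypothetical counter standing in for calls to sort_ref_list(). */
static size_t sort_calls;

static void fake_sort(void)
{
	sort_calls++;
}

/* Old shape: each directory level sorts the list on its way out. */
static void walk_and_sort(int depth, int fanout)
{
	int i;

	if (depth > 0)
		for (i = 0; i < fanout; i++)
			walk_and_sort(depth - 1, fanout);
	fake_sort();
}

/* New shape: the recursion only walks; it never sorts. */
static void walk_only(int depth, int fanout)
{
	int i;

	if (depth > 0)
		for (i = 0; i < fanout; i++)
			walk_only(depth - 1, fanout);
}

/* Number of sorts with the sort inside the recursive function. */
size_t count_sorts_old(int depth, int fanout)
{
	sort_calls = 0;
	walk_and_sort(depth, fanout);
	return sort_calls;
}

/* Number of sorts with a thin wrapper that sorts once after the walk. */
size_t count_sorts_new(int depth, int fanout)
{
	sort_calls = 0;
	walk_only(depth, fanout);
	fake_sort();
	return sort_calls;
}
```

For the 100 x 100 layout generated above, count_sorts_old(2, 100) comes to 10101 (one per directory under refs/changes, including refs/changes itself), while count_sorts_new(2, 100) is 1.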
On Tue, Sep 27, 2011 at 11:01 AM, David Barr <davidbarr@google.com> wrote:
> Martin Fick reported:
> OK, I have found what I believe is another performance
> regression for large ref counts (~100K).
>
> When I run git br on my repo which only has one branch, but
> has ~100K refs under refs/changes (a gerrit repo), it takes
> normally 3-6mins depending on whether my caches are fresh or
> not. After bisecting some older changes, I noticed that
> this ref seems to be where things start to get slow:
> v1.5.2-rc0~21^2 (refs.c: add a function to sort a ref list,
> rather then sorting on add) (Julian Phillips, Apr 17, 2007)
>
> Martin Fick observed that sort_ref_list() was called almost
> as many times as there were loose refs.
>
> Julian Phillips commented:
> Back when I made that change, I failed to notice that get_ref_dir
> was recursive for subdirectories ... sorry ...
>
> Hopefully this should speed things up. My test repo went from
> ~17m user time, to ~2.5s.
> Packing still makes things much faster of course.
>
> Martin Fick acked:
> Excellent! This works (almost, in my refs.c it is called
> sort_ref_list, not sort_refs_list). So, on the non garbage
> collected repo, git branch now takes ~.5s, and in the
> garbage collected one it takes only ~.05s!
>
> [db: summarised transcript, rewrote patch to fix callee not callers]
>
> [attn jch: patch applies to maint]
>
> Analyzed-by: Martin Fick <mfick@codeaurora.org>
> Inspired-by: Julian Phillips <julian@quantumfyre.co.uk>
> Acked-by: Martin Fick <mfick@codeaurora.org>
> Signed-off-by: David Barr <davidbarr@google.com>
> ---
> refs.c | 14 ++++++++++----
> 1 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 4c1fd47..e40a09c 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -255,8 +255,8 @@ static struct ref_list *get_packed_refs(const char *submodule)
> return refs->packed;
> }
>
> -static struct ref_list *get_ref_dir(const char *submodule, const char *base,
> - struct ref_list *list)
> +static struct ref_list *walk_ref_dir(const char *submodule, const char *base,
> + struct ref_list *list)
> {
> DIR *dir;
> const char *path;
> @@ -299,7 +299,7 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
> if (stat(refdir, &st) < 0)
> continue;
> if (S_ISDIR(st.st_mode)) {
> - list = get_ref_dir(submodule, ref, list);
> + list = walk_ref_dir(submodule, ref, list);
> continue;
> }
> if (submodule) {
> @@ -319,7 +319,13 @@ static struct ref_list *get_ref_dir(const char *submodule, const char *base,
> free(ref);
> closedir(dir);
> }
> - return sort_ref_list(list);
> + return list;
> +}
> +
> +static struct ref_list *get_ref_dir(const char *submodule, const char *base,
> + struct ref_list *list)
> +{
> + return sort_ref_list(walk_ref_dir(submodule, base, list));
> }
>
> struct warn_if_dangling_data {
> --
> 1.7.5.75.g69330
>
>
--
David Barr | Software Engineer | davidbarr@google.com | 614-3438-8348