From: Michael Haggerty <mhagger@alum.mit.edu>
To: Martin Fick <mfick@codeaurora.org>
Cc: git@vger.kernel.org, "Christian Couder" <chriscool@tuxfamily.org>,
"Thomas Rast" <trast@student.ethz.ch>,
"René Scharfe" <rene.scharfe@lsrfire.ath.cx>,
"Julian Phillips" <julian@quantumfyre.co.uk>
Subject: Re: Git is not scalable with too many refs/*
Date: Sun, 09 Oct 2011 07:43:47 +0200
Message-ID: <4E913493.7010101@alum.mit.edu>
In-Reply-To: <201110081459.52174.mfick@codeaurora.org>
On 10/08/2011 10:59 PM, Martin Fick wrote:
> [...]
> So, with this in mind, I have discovered, that the fetch
> performance degradation by invalidating the caches in
> write_ref_sha1() is actually due to the packed-refs being
> reloaded and resorted again on each ref insertion (not the
> loose refs)!!!
Good point.
> I think that all of this might explain why no matter how
> good Michael's intentions are with his patch series, his
> series isn't likely to fix this problem
I never claimed that my patch fixes all use cases, or cures cancer
either :-) One step at a time.
> unless he does not
> invalidate the packed-refs after each insertion. I tried
> preventing this invalidation in his series to prove this,
> but unfortunately, it appears that in his series it is no
> longer possible to only invalidate just the packed-refs? :(
> Michael, I hope I am completely wrong about that...
Yes, you are completely wrong. I just implemented more selective cache
invalidation on top of the patch series.
I think your suggestion is safe because only non-symbolic references can
be stored in the packed refs; therefore the modification of a loose ref
can never affect the value of a packed ref. Of course a loose ref can
*hide* the value of a packed ref, but in such cases the packed ref is
never read anyway. And the *deletion* of a loose ref can expose a
previously-hidden packed ref, but this case is handled by delete_ref(),
which explicitly invalidates the packed-ref cache.
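The argument above can be illustrated with a toy model in C. This is a sketch under stated assumptions, not the code in refs.c: struct ref_store, store_lookup(), store_set(), resolve_ref() and write_loose_ref() are hypothetical names, and a reload counter stands in for the expensive re-read and re-sort of the real files.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; none of these names exist in git's refs.c. */
#define MAX_REFS 8

struct ref_entry {
    const char *name;
    const char *value;
};

struct ref_store {
    struct ref_entry entries[MAX_REFS]; /* stands in for the on-disk data */
    size_t nr;
    int cache_valid;
    int reloads;                        /* how often the cache was rebuilt */
};

struct repo {
    struct ref_store loose;
    struct ref_store packed;
};

static void ensure_loaded(struct ref_store *s)
{
    if (!s->cache_valid) {
        s->reloads++;       /* real code would re-read and re-sort here */
        s->cache_valid = 1;
    }
}

static const char *store_lookup(struct ref_store *s, const char *name)
{
    ensure_loaded(s);
    for (size_t i = 0; i < s->nr; i++)
        if (!strcmp(s->entries[i].name, name))
            return s->entries[i].value;
    return NULL;
}

static void store_set(struct ref_store *s, const char *name, const char *value)
{
    for (size_t i = 0; i < s->nr; i++) {
        if (!strcmp(s->entries[i].name, name)) {
            s->entries[i].value = value;
            return;
        }
    }
    assert(s->nr < MAX_REFS);
    s->entries[s->nr].name = name;
    s->entries[s->nr].value = value;
    s->nr++;
}

/* A loose ref, if present, hides the packed ref of the same name. */
static const char *resolve_ref(struct repo *r, const char *name)
{
    const char *v = store_lookup(&r->loose, name);
    return v ? v : store_lookup(&r->packed, name);
}

/* Writing a loose ref makes only the loose cache stale. */
static void write_loose_ref(struct repo *r, const char *name, const char *value)
{
    store_set(&r->loose, name, value);
    r->loose.cache_valid = 0;   /* packed cache deliberately left alone */
}
```

In this model, any number of write_loose_ref() calls followed by lookups leaves the packed store's reload count unchanged, while lookups still return the loose value whenever one exists.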
While I was at it, I also:
* In delete_ref(), only invalidate the packed reference cache if the
  reference that is being deleted actually *is* among the packed
  references.

* Changed the code to stop invalidating the ref caches for submodules.
  In the code paths where the cache invalidation was being done, only
  main-module references were being changed. However, I'm not familiar
  enough with submodules to know if/when submodule references *can* be
  changed. It could be that the submodule reference caches have to be
  invalidated under some circumstances; the current code might be buggy
  in this area.
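For the delete_ref() item, the idea might be sketched in C roughly as follows. This is an illustration with hypothetical names (struct name_set, delete_ref_sketch(), the invalidation counters), not the actual patch; the counters only record when each cache would be thrown away.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; none of these names exist in git's refs.c. */
#define MAX_REFS 8

struct name_set {
    const char *names[MAX_REFS];
    size_t nr;
};

struct repo {
    struct name_set loose;
    struct name_set packed;
    int loose_invalidations;
    int packed_invalidations;
};

/* Return the index of name in s, or -1 if it is absent. */
static int set_find(const struct name_set *s, const char *name)
{
    for (size_t i = 0; i < s->nr; i++)
        if (!strcmp(s->names[i], name))
            return (int)i;
    return -1;
}

static void set_add(struct name_set *s, const char *name)
{
    assert(s->nr < MAX_REFS);
    s->names[s->nr++] = name;
}

/* Remove entry i by moving the last entry into its slot. */
static void set_remove(struct name_set *s, int i)
{
    s->names[i] = s->names[--s->nr];
}

static void delete_ref_sketch(struct repo *r, const char *name)
{
    int i = set_find(&r->packed, name);
    if (i >= 0) {
        /* The packed data really changes, so its cache is now stale. */
        set_remove(&r->packed, i);
        r->packed_invalidations++;
    }
    /* Otherwise the packed cache is still accurate and is kept. */

    i = set_find(&r->loose, name);
    if (i >= 0) {
        set_remove(&r->loose, i);
        r->loose_invalidations++;
    }
}
```

Deleting a reference that exists only as a loose ref leaves packed_invalidations at zero, which is the case the first item is meant to make cheap.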
The changes are pushed to github. They don't make any significant
difference to my "refperf" results (attached), so perhaps a new
benchmark should be added. But I'm curious to see how they affect your
timings.
Michael
--
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
[-- Attachment #2: refperf-summary.out --]
[-- Type: text/plain, Size: 5251 bytes --]
=================================== ======= ======= ======= ======= ======= ======= ======= ======= ======= =======
Test name                               [0]     [1]     [2]     [3]     [4]     [5]     [6]     [7]     [8]     [9]
=================================== ======= ======= ======= ======= ======= ======= ======= ======= ======= =======
branch-loose-cold                      3.19    3.15    3.10    3.19    3.25    0.70    0.61    0.74    0.66    0.56
branch-loose-warm                      0.19    0.19    0.20    0.19    0.19    0.00    0.00    0.00    0.00    0.00
for-each-ref-loose-cold                3.73    3.45    3.55    3.39    3.44    3.40    3.50    3.52    3.70    3.51
for-each-ref-loose-warm                0.44    0.44    0.44    0.43    0.43    0.43    0.43    0.43    0.43    0.43
checkout-loose-cold                    3.35    3.23    3.23    3.15    3.29    0.65    0.71    0.76    0.66    0.69
checkout-loose-warm                    0.19    0.19    0.20    0.18    0.19    0.01    0.01    0.01    0.01    0.00
checkout-orphan-loose                  0.19    0.19    0.19    0.18    0.19    0.00    0.00    0.00    0.00    0.00
checkout-from-detached-loose-cold      7.80    4.17    4.17    4.05    4.09    4.07    4.26    4.23    4.18    4.08
checkout-from-detached-loose-warm      1.01    1.01    1.02    1.02    1.04    1.03    1.04    1.04    1.02    1.04
branch-contains-loose-cold            35.76   35.80   36.15   36.67   35.13   36.29   36.37   36.03   36.70   36.01
branch-contains-loose-warm            33.01   33.62   33.52   33.51   32.41   33.51   33.71   32.10   33.70   31.99
pack-refs-loose                        4.19    4.20    4.25    4.21    4.20    4.21    4.20    4.19    4.24    4.21
branch-packed-cold                     0.79    0.62    0.60    0.66    0.65    0.58    0.68    0.72    0.60    0.61
branch-packed-warm                     0.02    0.02    0.02    0.02    0.02    0.02    0.02    0.02    0.02    0.02
for-each-ref-packed-cold               0.96    0.97    0.97    0.93    0.89    0.92    0.98    0.96    0.92    0.96
for-each-ref-packed-warm               0.26    0.26    0.26    0.26    0.26    0.26    0.26    0.27    0.27    0.27
checkout-packed-cold                  16.14   16.16   16.74    2.04    2.03    2.09    2.06    2.13    2.03    2.00
checkout-packed-warm                   0.17    0.17    0.18    0.19    0.18    0.17    0.27    0.18    0.19    0.18
checkout-orphan-packed                 0.02    0.01    0.02    0.02    0.02    0.02    0.02    0.02    0.02    0.02
checkout-from-detached-packed-cold    16.24   15.96   16.80    1.99    2.06    2.01    2.08    2.10    1.97    1.96
checkout-from-detached-packed-warm    15.04   14.96   15.76    0.77    0.81    0.79    0.83    0.80    0.79    0.80
branch-contains-packed-cold           36.18   36.98   36.92   35.19   34.97   35.09   33.34   33.87   34.27   34.51
branch-contains-packed-warm           35.27   35.12   36.20   33.52   32.76   33.49   33.65   32.96   33.68   32.34
clone-loose-cold                       9.09    9.22    9.15    9.10    9.19    9.03    9.09    9.25    8.96    9.03
clone-loose-warm                       5.57    5.85    5.65    5.55    5.61    5.64    5.65    5.61    5.74    5.59
fetch-nothing-loose                    1.43    1.43    1.44    1.44    1.45    1.45    1.46    1.44    1.44    1.44
pack-refs                              0.08    0.08    0.08    0.08    0.09    0.08    0.09    0.08    0.08    0.08
fetch-nothing-packed                   1.44    1.43    1.44    1.44    1.44    1.44    1.44    1.44    1.44    1.44
clone-packed-cold                      1.35    1.26    1.30    1.32    1.28    1.35    1.38    1.35    1.29    1.21
clone-packed-warm                      0.36    0.35    0.35    0.36    0.36    0.36    0.35    0.36    0.37    0.35
fetch-everything-cold                 30.29   30.01   29.79   29.04   29.84   29.25   29.30   29.26   29.76   29.30
fetch-everything-warm                 26.20   26.04   26.40   25.60   26.22   25.83   25.82   25.85   26.68   25.73
=================================== ======= ======= ======= ======= ======= ======= ======= ======= ======= =======
[0] f696543 (tag: v1.7.6) Git 1.7.6
[1] 703f05a (tag: v1.7.7) Git 1.7.7
[2] 27897d2 (origin/master) Merge remote-tracking branch 'gitster/mh/iterate-refs'
[3] 558b49c is_refname_available(): reimplement using do_for_each_ref_in_list()
[4] 1658397 Store references hierarchically
[5] 5f5a126 get_ref_dir(): add a recursive option
[6] a306af1 get_ref_dir(): read one whole directory before descending into subdirs
[7] fd53cf7 add_ref(): change to take a (struct ref_entry *) as second argument
[8] 9944c7f (origin/testing) read_packed_refs(): keep track of the directory being worked in
[9] cb75c57 (origin/ok, origin/hierarchical-refs, origin/HEAD) refs.c: call clear_cached_ref_cache() from repack_without_ref()