git.vger.kernel.org archive mirror
From: Michael Haggerty <mhagger@alum.mit.edu>
To: Martin Fick <mfick@codeaurora.org>
Cc: git@vger.kernel.org, "Christian Couder" <chriscool@tuxfamily.org>,
	"Thomas Rast" <trast@student.ethz.ch>,
	"René Scharfe" <rene.scharfe@lsrfire.ath.cx>,
	"Julian Phillips" <julian@quantumfyre.co.uk>
Subject: Re: Git is not scalable with too many refs/*
Date: Sun, 09 Oct 2011 07:43:47 +0200
Message-ID: <4E913493.7010101@alum.mit.edu>
In-Reply-To: <201110081459.52174.mfick@codeaurora.org>

[-- Attachment #1: Type: text/plain, Size: 2441 bytes --]

On 10/08/2011 10:59 PM, Martin Fick wrote:
> [...]
> So, with this in mind, I have discovered, that the fetch 
> performance degradation by invalidating the caches in 
> write_ref_sha1() is actually due to the packed-refs being 
> reloaded and resorted again on each ref insertion (not the 
> loose refs)!!!

Good point.

> I think that all of this might explain why no matter how 
> good Michael's intentions are with his patch series, his 
> series isn't likely to fix this problem

I never claimed that my patch fixes all use cases, or cures cancer
either :-)  One step at a time.

>                                         unless he does not
> invalidate the packed-refs after each insertion.  I tried 
> preventing this invalidation in his series to prove this, 
> but unfortunately, it appears that in his series it is no 
> longer possible to only invalidate just the packed-refs? :(
> Michael, I hope I am completely wrong about that...

Yes, you are completely wrong.  I just implemented more selective cache
invalidation on top of the patch series.

I think your suggestion is safe because only non-symbolic references can
be stored in the packed refs; therefore the modification of a loose ref
can never affect the value of a packed ref.  Of course a loose ref can
*hide* the value of a packed ref, but in such cases the packed ref is
never read anyway.  And the *deletion* of a loose ref can expose a
previously-hidden packed ref, but this case is handled by delete_ref(),
which explicitly invalidates the packed-ref cache.
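To make the argument above concrete, here is a toy model in Python. It is not git's refs.c; the class and attribute names (ToyRefStore, packed_loads, and so on) are made up for illustration. It assumes only what is stated above: a write creates or updates a loose ref, a loose ref hides a packed ref of the same name, and the packed data is not changed by such a write. Under those assumptions the packed cache can survive any number of loose writes.

```python
class ToyRefStore:
    """Toy model: 'disk' holds loose and packed refs, each with its own cache."""

    def __init__(self, loose, packed):
        self.disk_loose = dict(loose)
        self.disk_packed = dict(packed)
        self._loose_cache = None
        self._packed_cache = None
        self.packed_loads = 0  # how often the packed refs were (re)read

    def _loose(self):
        if self._loose_cache is None:
            self._loose_cache = dict(self.disk_loose)
        return self._loose_cache

    def _packed(self):
        if self._packed_cache is None:
            self.packed_loads += 1
            # sorted() stands in for the read-and-sort cost of the packed refs
            self._packed_cache = dict(sorted(self.disk_packed.items()))
        return self._packed_cache

    def resolve(self, name):
        # A loose ref hides a packed ref of the same name, so the packed
        # value is only consulted when no loose ref exists.
        loose = self._loose()
        if name in loose:
            return loose[name]
        return self._packed().get(name)

    def write_ref(self, name, value):
        # The write lands in the loose refs; the packed data is untouched,
        # so only the loose cache is dropped.
        self.disk_loose[name] = value
        self._loose_cache = None


store = ToyRefStore(
    loose={},
    packed={"refs/heads/a": "1111", "refs/heads/b": "2222"},
)
store.resolve("refs/heads/a")  # first lookup loads the packed refs once
for i in range(1000):
    store.write_ref(f"refs/changes/{i}", "3333")
    store.resolve("refs/heads/b")  # still served from the packed cache
store.write_ref("refs/heads/a", "4444")  # loose ref now hides the packed value
```

In this sketch the packed refs are read and sorted once, no matter how many refs are inserted; invalidating both caches in write_ref() would instead reload them on every iteration of the loop.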

While I was at it, I also:

* Changed delete_ref() to invalidate the packed reference cache only if
the reference that is being deleted actually *is* among the packed
references.

* Changed the code to stop invalidating the ref caches for submodules.
In the code paths where the cache invalidation was being done, only
main-module references were being changed.  However, I'm not familiar
enough with submodules to know if/when submodule references *can* be
changed.  It could be that the submodule reference caches have to be
invalidated under some circumstances; the current code might be buggy in
this area.
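The delete_ref() change in the first bullet can be sketched the same way. Again this is a toy Python model with hypothetical names, not the actual C code, and it ignores submodules entirely. It assumes that deleting a ref which is also packed means rewriting the packed data without that entry, and that this rewrite is the only case in which the packed cache has to be dropped.

```python
class ToyRefStore:
    """Toy model of deletion with a conditional packed-cache invalidation."""

    def __init__(self, loose, packed):
        self.disk_loose = dict(loose)
        self.disk_packed = dict(packed)
        self._packed_cache = None
        self.packed_loads = 0  # how often the packed refs were (re)read

    def _packed(self):
        if self._packed_cache is None:
            self.packed_loads += 1
            self._packed_cache = dict(self.disk_packed)
        return self._packed_cache

    def resolve(self, name):
        # Loose refs take precedence over packed refs of the same name.
        if name in self.disk_loose:
            return self.disk_loose[name]
        return self._packed().get(name)

    def delete_ref(self, name):
        if name in self._packed():
            # The ref is packed: rewrite the packed data without it, then
            # drop the cache so the stale entry cannot be served again.
            del self.disk_packed[name]
            self._packed_cache = None
        # Otherwise the packed data is unchanged and the cache stays valid.
        self.disk_loose.pop(name, None)


store = ToyRefStore(
    loose={"refs/heads/topic": "aaaa", "refs/heads/main": "bbbb"},
    packed={"refs/heads/main": "cccc", "refs/tags/v1": "dddd"},
)

store.delete_ref("refs/heads/topic")        # loose only: cache is kept
tag_value = store.resolve("refs/tags/v1")
loads_after_loose_delete = store.packed_loads

store.delete_ref("refs/heads/main")         # also packed: cache is dropped
main_value = store.resolve("refs/heads/main")
loads_after_packed_delete = store.packed_loads
```

Deleting the loose-only ref costs no reload, while deleting the ref that is also packed forces one. Without the rewrite and invalidation in the second case, the previously hidden packed value of refs/heads/main would reappear after the loose ref is removed.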

The changes are pushed to GitHub.  They don't make any significant
difference to my "refperf" results (attached), so perhaps a new
benchmark should be added.  But I'm curious to see how they affect your
timings.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

[-- Attachment #2: refperf-summary.out --]
[-- Type: text/plain, Size: 5251 bytes --]

===================================  =======  =======  =======  =======  =======  =======  =======  =======  =======  =======
Test name                                [0]      [1]      [2]      [3]      [4]      [5]      [6]      [7]      [8]      [9]
===================================  =======  =======  =======  =======  =======  =======  =======  =======  =======  =======
branch-loose-cold                       3.19     3.15     3.10     3.19     3.25     0.70     0.61     0.74     0.66     0.56
branch-loose-warm                       0.19     0.19     0.20     0.19     0.19     0.00     0.00     0.00     0.00     0.00
for-each-ref-loose-cold                 3.73     3.45     3.55     3.39     3.44     3.40     3.50     3.52     3.70     3.51
for-each-ref-loose-warm                 0.44     0.44     0.44     0.43     0.43     0.43     0.43     0.43     0.43     0.43
checkout-loose-cold                     3.35     3.23     3.23     3.15     3.29     0.65     0.71     0.76     0.66     0.69
checkout-loose-warm                     0.19     0.19     0.20     0.18     0.19     0.01     0.01     0.01     0.01     0.00
checkout-orphan-loose                   0.19     0.19     0.19     0.18     0.19     0.00     0.00     0.00     0.00     0.00
checkout-from-detached-loose-cold       7.80     4.17     4.17     4.05     4.09     4.07     4.26     4.23     4.18     4.08
checkout-from-detached-loose-warm       1.01     1.01     1.02     1.02     1.04     1.03     1.04     1.04     1.02     1.04
branch-contains-loose-cold             35.76    35.80    36.15    36.67    35.13    36.29    36.37    36.03    36.70    36.01
branch-contains-loose-warm             33.01    33.62    33.52    33.51    32.41    33.51    33.71    32.10    33.70    31.99
pack-refs-loose                         4.19     4.20     4.25     4.21     4.20     4.21     4.20     4.19     4.24     4.21
branch-packed-cold                      0.79     0.62     0.60     0.66     0.65     0.58     0.68     0.72     0.60     0.61
branch-packed-warm                      0.02     0.02     0.02     0.02     0.02     0.02     0.02     0.02     0.02     0.02
for-each-ref-packed-cold                0.96     0.97     0.97     0.93     0.89     0.92     0.98     0.96     0.92     0.96
for-each-ref-packed-warm                0.26     0.26     0.26     0.26     0.26     0.26     0.26     0.27     0.27     0.27
checkout-packed-cold                   16.14    16.16    16.74     2.04     2.03     2.09     2.06     2.13     2.03     2.00
checkout-packed-warm                    0.17     0.17     0.18     0.19     0.18     0.17     0.27     0.18     0.19     0.18
checkout-orphan-packed                  0.02     0.01     0.02     0.02     0.02     0.02     0.02     0.02     0.02     0.02
checkout-from-detached-packed-cold     16.24    15.96    16.80     1.99     2.06     2.01     2.08     2.10     1.97     1.96
checkout-from-detached-packed-warm     15.04    14.96    15.76     0.77     0.81     0.79     0.83     0.80     0.79     0.80
branch-contains-packed-cold            36.18    36.98    36.92    35.19    34.97    35.09    33.34    33.87    34.27    34.51
branch-contains-packed-warm            35.27    35.12    36.20    33.52    32.76    33.49    33.65    32.96    33.68    32.34
clone-loose-cold                        9.09     9.22     9.15     9.10     9.19     9.03     9.09     9.25     8.96     9.03
clone-loose-warm                        5.57     5.85     5.65     5.55     5.61     5.64     5.65     5.61     5.74     5.59
fetch-nothing-loose                     1.43     1.43     1.44     1.44     1.45     1.45     1.46     1.44     1.44     1.44
pack-refs                               0.08     0.08     0.08     0.08     0.09     0.08     0.09     0.08     0.08     0.08
fetch-nothing-packed                    1.44     1.43     1.44     1.44     1.44     1.44     1.44     1.44     1.44     1.44
clone-packed-cold                       1.35     1.26     1.30     1.32     1.28     1.35     1.38     1.35     1.29     1.21
clone-packed-warm                       0.36     0.35     0.35     0.36     0.36     0.36     0.35     0.36     0.37     0.35
fetch-everything-cold                  30.29    30.01    29.79    29.04    29.84    29.25    29.30    29.26    29.76    29.30
fetch-everything-warm                  26.20    26.04    26.40    25.60    26.22    25.83    25.82    25.85    26.68    25.73
===================================  =======  =======  =======  =======  =======  =======  =======  =======  =======  =======


[0] f696543 (tag: v1.7.6) Git 1.7.6
[1] 703f05a (tag: v1.7.7) Git 1.7.7
[2] 27897d2 (origin/master) Merge remote-tracking branch 'gitster/mh/iterate-refs'
[3] 558b49c is_refname_available(): reimplement using do_for_each_ref_in_list()
[4] 1658397 Store references hierarchically
[5] 5f5a126 get_ref_dir(): add a recursive option
[6] a306af1 get_ref_dir(): read one whole directory before descending into subdirs
[7] fd53cf7 add_ref(): change to take a (struct ref_entry *) as second argument
[8] 9944c7f (origin/testing) read_packed_refs(): keep track of the directory being worked in
[9] cb75c57 (origin/ok, origin/hierarchical-refs, origin/HEAD) refs.c: call clear_cached_ref_cache() from repack_without_ref()

