From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 889A3CD68F5 for ; Tue, 10 Oct 2023 07:21:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442459AbjJJHVd (ORCPT ); Tue, 10 Oct 2023 03:21:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1442431AbjJJHVa (ORCPT ); Tue, 10 Oct 2023 03:21:30 -0400 Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7553BFB for ; Tue, 10 Oct 2023 00:21:28 -0700 (PDT) Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.nyi.internal (Postfix) with ESMTP id D987E5C007F; Tue, 10 Oct 2023 03:21:27 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Tue, 10 Oct 2023 03:21:27 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm2; t=1696922487; x=1697008887; bh=Em gppE18R2eyiu4nZt3FCcj78knxL5emURYESGIdkoI=; b=KW311SDNGDauYdK+dG hRydBRGGv8zUrtkGzU1Jw6psX3E/mZ8J02Ra0c4Tn43m6OJjEpwkFqQ8eyYZf8jN 4Ehr0HHcKWKE5LLTSC9vnGnlAjNjPU+v4tLawXRXrZe3jaD26YtobBWclJ0rwW9m 7ywqN/sTWcK57lcicT3w6tekckgXapEJr/C/MAtnROWARp/PWaJcoJ1BJfC2Xcz+ zDELqw4ECbq0yw8sVQrGW3AehbTmvS5KFrivzTZ+rXVSaYa8iKVLw/jrkjJjQ7Cs 1DC+BADGjKwHVbzh4TgaTGV8bnN4sZ0s6pgGetP69wfkR4rdRkP2mwV91eDMYxjg Qxdg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm2; t=1696922487; x=1697008887; bh=EmgppE18R2eyi u4nZt3FCcj78knxL5emURYESGIdkoI=; b=UzmDN0J5+EgNnPRWj6yfv2i4E9dnI BcsZL516kSasNGByMGRfJrYDjiKk/G0TOmrG/nPQgHm/Ae+WUKjjD9EYkjMG0EEm eN4w5Xsm5RT1H3FIQ73Q7j4Qxo/OrvnhzuHcr07O80uWbzdIvKwTFOqg/XRty/4C 38s/N13I9WO1Hgovjhe6+6VPPoeTpAzvk9hTsb6Je8vz562GMaWvwG3U23BDaAga RtK6BnKSPPj0szBQh0McVb3K2qYK6p4oT6/T/q1DA9n9VxEwC264JNMNMkMiehnM cwAD3q2YbF1BJf4aR2Ioo+xHzW2lzHeZ8SPisyeTK3gsERnydS8QjEnuw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrheeggdduudekucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtjeenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepteeuvefhhfdufedvgeeiueeileegtdfhgeeftdeuveejjedtgfejhedujeeutddu necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 10 Oct 2023 03:21:26 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id c37cee2b (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 10 Oct 2023 07:21:26 +0000 (UTC) Date: Tue, 10 Oct 2023 09:21:24 +0200 From: Patrick Steinhardt To: Victoria Dye Cc: Victoria Dye via GitGitGadget , git@vger.kernel.org Subject: Re: [PATCH 0/4] Performance improvement & cleanup in loose ref iteration Message-ID: References: <28ae03f5-7091-d3f3-8a70-56aba6639640@github.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="ejX2P05oPros+0cI" Content-Disposition: inline In-Reply-To: <28ae03f5-7091-d3f3-8a70-56aba6639640@github.com> Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org --ejX2P05oPros+0cI Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Oct 09, 2023 at 02:49:14PM -0700, Victoria Dye wrote: > Patrick Steinhardt wrote: > > On Fri, Oct 06, 2023 at 06:09:25PM +0000, Victoria Dye via GitGitGadget= wrote: > >> While investigating ref iteration performance in builtins like > >> 'for-each-ref' and 'show-ref', I found two small improvement opportuni= ties. > >> > >> The first patch tweaks the logic around prefix matching in > >> 'cache_ref_iterator_advance' so that we correctly skip refs that do not > >> actually match a given prefix. The unnecessary iteration doesn't seem = to be > >> causing any bugs in the ref iteration commands that I've tested, but it > >> doesn't hurt to be more precise (and it helps with some other patches = I'm > >> working on ;) ). > >> > >> The next three patches update how 'loose_fill_ref_dir' determines the = type > >> of ref cache entry to create (directory or regular). On platforms that > >> include d_type information in 'struct dirent' (as far as I can tell, a= ll > >> except NonStop & certain versions of Cygwin), this allows us to skip c= alling > >> 'stat'. In ad-hoc testing, this improved performance of 'git for-each-= ref' > >> by about 20%. > >=20 > > I've done a small set of benchmarks with my usual test repositories, > > which is linux.git with a bunch of references added. The repository > > comes in four sizes: > >=20 > > - small: 50k references > > - medium: 500k references > > - high: 1.1m references > > - huge: 12m references > >=20 > > Unfortunately, I couldn't really reproduce the performance improvements. > > In fact, the new version runs consistently a tiny bit slower than the > > old version: > >=20 > > # Old version, which is 3a06386e31 (The fifteenth batch, 2023-10-04= ). > >=20 > > Benchmark 1: git for-each-ref (revision=3Dold,refcount=3Dsmall) > > Time (mean =C2=B1 =CF=83): 135.5 ms =C2=B1 1.2 ms [User:= 76.4 ms, System: 59.0 ms] > > Range (min =E2=80=A6 max): 134.8 ms =E2=80=A6 136.9 ms 3 runs > >=20 > > Benchmark 2: git for-each-ref (revision=3Dold,refcount=3Dmedium) > > Time (mean =C2=B1 =CF=83): 822.7 ms =C2=B1 2.2 ms [User:= 697.4 ms, System: 125.1 ms] > > Range (min =E2=80=A6 max): 821.1 ms =E2=80=A6 825.2 ms 3 runs > >=20 > > Benchmark 3: git for-each-ref (revision=3Dold,refcount=3Dhigh) > > Time (mean =C2=B1 =CF=83): 1.960 s =C2=B1 0.015 s [User:= 1.702 s, System: 0.257 s] > > Range (min =E2=80=A6 max): 1.944 s =E2=80=A6 1.973 s 3 runs > >=20 > > # New version, which is your tip. > >=20 > > Benchmark 4: git for-each-ref (revision=3Dold,refcount=3Dhuge) > > Time (mean =C2=B1 =CF=83): 16.815 s =C2=B1 0.054 s [User:= 15.091 s, System: 1.722 s] > > Range (min =E2=80=A6 max): 16.760 s =E2=80=A6 16.869 s 3 runs > >=20 > > Benchmark 5: git for-each-ref (revision=3Dnew,refcount=3Dsmall) > > Time (mean =C2=B1 =CF=83): 136.0 ms =C2=B1 0.2 ms [User:= 78.8 ms, System: 57.1 ms] > > Range (min =E2=80=A6 max): 135.8 ms =E2=80=A6 136.2 ms 3 runs > >=20 > > Benchmark 6: git for-each-ref (revision=3Dnew,refcount=3Dmedium) > > Time (mean =C2=B1 =CF=83): 830.4 ms =C2=B1 21.2 ms [User:= 691.3 ms, System: 138.7 ms] > > Range (min =E2=80=A6 max): 814.2 ms =E2=80=A6 854.5 ms 3 runs > >=20 > > Benchmark 7: git for-each-ref (revision=3Dnew,refcount=3Dhigh) > > Time (mean =C2=B1 =CF=83): 1.966 s =C2=B1 0.013 s [User:= 1.717 s, System: 0.249 s] > > Range (min =E2=80=A6 max): 1.952 s =E2=80=A6 1.978 s 3 runs > >=20 > > Benchmark 8: git for-each-ref (revision=3Dnew,refcount=3Dhuge) > > Time (mean =C2=B1 =CF=83): 16.945 s =C2=B1 0.037 s [User:= 15.182 s, System: 1.760 s] > > Range (min =E2=80=A6 max): 16.910 s =E2=80=A6 16.983 s 3 runs > >=20 > > Summary > > git for-each-ref (revision=3Dold,refcount=3Dsmall) ran > > 1.00 =C2=B1 0.01 times faster than git for-each-ref (revision= =3Dnew,refcount=3Dsmall) > > 6.07 =C2=B1 0.06 times faster than git for-each-ref (revision= =3Dold,refcount=3Dmedium) > > 6.13 =C2=B1 0.17 times faster than git for-each-ref (revision= =3Dnew,refcount=3Dmedium) > > 14.46 =C2=B1 0.17 times faster than git for-each-ref (revision= =3Dold,refcount=3Dhigh) > > 14.51 =C2=B1 0.16 times faster than git for-each-ref (revision= =3Dnew,refcount=3Dhigh) > > 124.09 =C2=B1 1.15 times faster than git for-each-ref (revision= =3Dold,refcount=3Dhuge) > > 125.05 =C2=B1 1.12 times faster than git for-each-ref (revision= =3Dnew,refcount=3Dhuge) > >=20 > > The performance regression isn't all that concerning, but it makes me > > wonder why I see things becoming slower rather than faster. My guess is > > that this is because all my test repositories are well-packed and don't > > have a lot of loose references. But I just wanted to confirm how you > > benchmarked your change and what the underlying shape of your test repo > > was. >=20 > I ran my benchmark on my (Intel) Mac with a test repository (single commi= t, > one file) containing: >=20 > - 10k refs/heads/ references > - 10k refs/tags/ references > - 10k refs/special/ references=20 >=20 > All refs in the repository are loose. My Mac has historically been somewh= at > slow and inconsistent when it comes to perf testing, though, so I re-ran = the > benchmark a bit more formally on an Ubuntu VM (3 warmup iterations follow= ed > by at least 10 iterations per test): >=20 > --- >=20 > Benchmark 1: git for-each-ref (revision=3Dold,refcount=3D3k) > Time (mean =C2=B1 =CF=83): 40.6 ms =C2=B1 3.9 ms [User: 13.2 = ms, System: 27.1 ms] > Range (min =E2=80=A6 max): 37.2 ms =E2=80=A6 59.1 ms 76 runs > =20 > Warning: Statistical outliers were detected. Consider re-running this b= enchmark on a quiet system without any interferences from other programs. I= t might help to use the '--warmup' or '--prepare' options. > =20 > Benchmark 2: git for-each-ref (revision=3Dnew,refcount=3D3k) > Time (mean =C2=B1 =CF=83): 38.7 ms =C2=B1 4.4 ms [User: 13.8 = ms, System: 24.5 ms] > Range (min =E2=80=A6 max): 35.1 ms =E2=80=A6 57.2 ms 71 runs > =20 > Warning: Statistical outliers were detected. Consider re-running this b= enchmark on a quiet system without any interferences from other programs. I= t might help to use the '--warmup' or '--prepare' options. > =20 > Benchmark 3: git for-each-ref (revision=3Dold,refcount=3D30k) > Time (mean =C2=B1 =CF=83): 419.4 ms =C2=B1 43.9 ms [User: 136.4= ms, System: 274.1 ms] > Range (min =E2=80=A6 max): 385.1 ms =E2=80=A6 528.7 ms 10 runs > =20 > Benchmark 4: git for-each-ref (revision=3Dnew,refcount=3D30k) > Time (mean =C2=B1 =CF=83): 390.4 ms =C2=B1 27.2 ms [User: 133.1= ms, System: 251.6 ms] > Range (min =E2=80=A6 max): 360.3 ms =E2=80=A6 447.6 ms 10 runs > =20 > Benchmark 5: git for-each-ref (revision=3Dold,refcount=3D300k) > Time (mean =C2=B1 =CF=83): 4.171 s =C2=B1 0.052 s [User: 1.400= s, System: 2.715 s] > Range (min =E2=80=A6 max): 4.118 s =E2=80=A6 4.283 s 10 runs > =20 > Benchmark 6: git for-each-ref (revision=3Dnew,refcount=3D300k) > Time (mean =C2=B1 =CF=83): 3.939 s =C2=B1 0.054 s [User: 1.403= s, System: 2.466 s] > Range (min =E2=80=A6 max): 3.858 s =E2=80=A6 4.026 s 10 runs > =20 > Summary > 'git for-each-ref (revision=3Dnew,refcount=3D3k)' ran > 1.05 =C2=B1 0.16 times faster than 'git for-each-ref (revision=3Dold,= refcount=3D3k)' > 10.08 =C2=B1 1.34 times faster than 'git for-each-ref (revision=3Dnew,= refcount=3D30k)' > 10.83 =C2=B1 1.67 times faster than 'git for-each-ref (revision=3Dold,= refcount=3D30k)' > 101.68 =C2=B1 11.63 times faster than 'git for-each-ref (revision=3Dnew= ,refcount=3D300k)' > 107.67 =C2=B1 12.30 times faster than 'git for-each-ref (revision=3Dold= ,refcount=3D300k)' >=20 > --- >=20 > So it's not the 20% speedup I saw on my local test repo (it's more like > 5-8%), but there does appear to be a consistent improvement. Thanks a bunch for re-doing the benchmark with a documented setup. > As for your results, the changes in this series shouldn't affect > packed ref operations, and the difference between old & new doesn't > seem to indicate a regression.=20 Yeah, I've also been surprised to see the performance regression for packed-refs files. The regression is persistent and reproducable on my machine though, so even though I tend to agree that the patches shouldn't negatively impact packed-refs performance they somehow do. It could just as well be something like different optimization choices by the compler due to the added patches, or hitting different cache lines. I dunno. Anyway, I agree with your assessment. The regression I see is less than 1% for packed-refs, while the improvements for loose refs are a lot more significant and conceptually make a lot of sense. So I didn't intend to say that we shouldn't do these optimizations because of the miniscule peformance regression with packed-refs. Or in other words: this series looks good to me. Thanks! Patrick --ejX2P05oPros+0cI Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEF9hrgiFbCdvenl/rVbJhu7ckPpQFAmUk+3QACgkQVbJhu7ck PpTo3Q/5AY336t+OBCSJDSfPjy0hVBU9BQFBL0E0y6m87oy0mkF+0K6SbdaNy0TM twRBxuGGCLjH7K1bVxHl2UZIbAOsqunyYCjQe92YhiGj6iir+YIbdKfAxP8niIRD //nnWJpmhRY8A1mlNCqll2GToJXZC/Tx7WBA+x5PgnHZVLcaYLuhf+ojpBzbbQit 01nyaUvHU5KZQ91st+GrXDn3zvJ0expei9g0yK/MH5Kb0GvTLKFxoFcALsHQ6R1h YicBtG8Ium0GlcCRIM7sNfIqjV+/vadvwneg8J7/EfvS93B2zp18YsH2HC5p2a/R butE7B0FuuGRrLVDgBhe9wUw5jOpztsGnGY6q0LNJKd34TbKgkRiF1tVcflYW1yD hjc3kC57SQR/jl142mjJbVLHyC9/pjwBCnv0/VYwM1v7U1gAeQTXJ0oV/Mw2rTZj ffc8KbdZL74+ul0131bi8y75KcGwFFcMUtHNf2IShUtPVLQO4Zk+bmNPRTt/acY3 JVLtODD9rMATlePW0FxSd8w8kU3JbQJ+JNCkwqUytmdwr0I4ezxo2TkhRRjJcEnw p9FGOFE2e2M68AvNt50EdWxr1p0gzKo4mQ6V2pf73wJEqAi7Zd7tRLd4vD1fkXjd aB141s96MTeYUCHDt5ttRI666l5qpQ/g7WSfR4yY7Gxg6c+GyYk= =mpnw -----END PGP SIGNATURE----- --ejX2P05oPros+0cI--