From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>
Cc: "Git List" <git@vger.kernel.org>,
"Andrzej Hunt" <andrzej@ahunt.org>, "Eric Wong" <e@80x24.org>,
"Junio C Hamano" <gitster@pobox.com>,
"Carlo Marcelo Arenas Belón" <carenas@gmail.com>,
"brian m. carlson" <sandals@crustytoothpaste.net>
Subject: Re: [PATCH] hash: reduce size of algo member of object ID
Date: Tue, 5 Oct 2021 19:28:53 +0200 [thread overview]
Message-ID: <269501ed-41ef-9ebf-e56b-1d817a5b9b2e@web.de> (raw)
In-Reply-To: <YVq5XCyLDr0SPBzx@coredump.intra.peff.net>
Am 04.10.21 um 10:20 schrieb Jeff King:
> On Mon, Oct 04, 2021 at 04:13:34AM -0400, Jeff King wrote:
>
>> It looks like adding the "algo" field did make a big difference for the
>> oid_array case, but changing it to a char doesn't seem to help at all:
>>
>> $ hyperfine -L v none,int,char './git.{v} cat-file --batch-all-objects --batch-check="%(objectname)"'
>> Benchmark #1: ./git.none cat-file --batch-all-objects --batch-check="%(objectname)"
>> Time (mean ± σ): 1.653 s ± 0.009 s [User: 1.607 s, System: 0.046 s]
>> Range (min … max): 1.640 s … 1.670 s 10 runs
>>
>> Benchmark #2: ./git.int cat-file --batch-all-objects --batch-check="%(objectname)"
>> Time (mean ± σ): 1.067 s ± 0.012 s [User: 1.017 s, System: 0.050 s]
>> Range (min … max): 1.053 s … 1.089 s 10 runs
>>
>> Benchmark #3: ./git.char cat-file --batch-all-objects --batch-check="%(objectname)"
>> Time (mean ± σ): 1.092 s ± 0.013 s [User: 1.046 s, System: 0.046 s]
>> Range (min … max): 1.080 s … 1.116 s 10 runs
>>
>> Summary
>> './git.int cat-file --batch-all-objects --batch-check="%(objectname)"' ran
>> 1.02 ± 0.02 times faster than './git.char cat-file --batch-all-objects --batch-check="%(objectname)"'
>> 1.55 ± 0.02 times faster than './git.none cat-file --batch-all-objects --batch-check="%(objectname)"'
>>
>> I'm actually surprised it had this much of an impact. But I guess this
>> benchmark really is mostly just memcpy-ing oids into a big array,
>> sorting it, and printing the result. If that array is 12% bigger, we'd
>> expect at least a 12% speedup. But adding in non-linear elements like
>> growing the array (though I guess that is amortized linear) and sorting
>> (which picks up an extra log(n) term) make the difference.
>>
>> It's _kind of_ silly in a sense, since usually you'd ask for other parts
>> of the object, which will make the speed difference relatively smaller.
>> But just dumping a bunch of oids is actually not an unreasonable thing
>> to do. I suspect it got a lot slower with 32-byte GIT_MAX_RAWSZ, too
>> (even when you're using 20-byte sha1), but I don't think there's an easy
>> way to get out of that.
>
> Oh wait, I'm reading it totally wrong. Adding in the extra 4 bytes
> actually made it _faster_ than not having an algo field. Now I'm
> super-confused. I could believe that it gave us some better alignment,
> but the original struct was 32 bytes. 36 seems like a strictly worse
> number.
Apple clang 13 on arm64-apple-darwin20.6.0:
cf0983213c^ (no algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
Time (mean ± σ): 1.105 s ± 0.017 s [User: 1.058 s, System: 0.045 s]
Range (min … max): 1.086 s … 1.126 s 10 runs
cf0983213c (int algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
Time (mean ± σ): 1.186 s ± 0.015 s [User: 1.134 s, System: 0.051 s]
Range (min … max): 1.162 s … 1.201 s 10 runs
cf0983213c (unsigned char algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
Time (mean ± σ): 1.409 s ± 0.013 s [User: 1.367 s, System: 0.041 s]
Range (min … max): 1.397 s … 1.429 s 10 runs
current master (int algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
Time (mean ± σ): 1.217 s ± 0.018 s [User: 1.170 s, System: 0.046 s]
Range (min … max): 1.202 s … 1.246 s 10 runs
current master with the patch (unsigned char algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
Time (mean ± σ): 1.465 s ± 0.012 s [User: 1.417 s, System: 0.046 s]
Range (min … max): 1.449 s … 1.480 s 10 runs
OK, I don't like my patch anymore. Weird, though.
René
next prev parent reply other threads:[~2021-10-05 17:29 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-03 5:51 [PATCH] hash: reduce size of algo member of object ID René Scharfe
2021-10-04 7:31 ` Eric Wong
2021-10-04 8:13 ` Jeff King
2021-10-04 8:20 ` Jeff King
2021-10-05 0:04 ` brian m. carlson
2021-10-05 17:28 ` René Scharfe [this message]
2021-10-04 8:47 ` Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=269501ed-41ef-9ebf-e56b-1d817a5b9b2e@web.de \
--to=l.s.r@web.de \
--cc=andrzej@ahunt.org \
--cc=carenas@gmail.com \
--cc=e@80x24.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).