git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>
Cc: "Git List" <git@vger.kernel.org>,
	"Andrzej Hunt" <andrzej@ahunt.org>, "Eric Wong" <e@80x24.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Carlo Marcelo Arenas Belón" <carenas@gmail.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>
Subject: Re: [PATCH] hash: reduce size of algo member of object ID
Date: Tue, 5 Oct 2021 19:28:53 +0200	[thread overview]
Message-ID: <269501ed-41ef-9ebf-e56b-1d817a5b9b2e@web.de> (raw)
In-Reply-To: <YVq5XCyLDr0SPBzx@coredump.intra.peff.net>

Am 04.10.21 um 10:20 schrieb Jeff King:
> On Mon, Oct 04, 2021 at 04:13:34AM -0400, Jeff King wrote:
>
>> It looks like adding the "algo" field did make a big difference for the
>> oid_array case, but changing it to a char doesn't seem to help at all:
>>
>>   $ hyperfine -L v none,int,char './git.{v} cat-file --batch-all-objects --batch-check="%(objectname)"'
>>   Benchmark #1: ./git.none cat-file --batch-all-objects --batch-check="%(objectname)"
>>     Time (mean ± σ):      1.653 s ±  0.009 s    [User: 1.607 s, System: 0.046 s]
>>     Range (min … max):    1.640 s …  1.670 s    10 runs
>>
>>   Benchmark #2: ./git.int cat-file --batch-all-objects --batch-check="%(objectname)"
>>     Time (mean ± σ):      1.067 s ±  0.012 s    [User: 1.017 s, System: 0.050 s]
>>     Range (min … max):    1.053 s …  1.089 s    10 runs
>>
>>   Benchmark #3: ./git.char cat-file --batch-all-objects --batch-check="%(objectname)"
>>     Time (mean ± σ):      1.092 s ±  0.013 s    [User: 1.046 s, System: 0.046 s]
>>     Range (min … max):    1.080 s …  1.116 s    10 runs
>>
>>   Summary
>>     './git.int cat-file --batch-all-objects --batch-check="%(objectname)"' ran
>>       1.02 ± 0.02 times faster than './git.char cat-file --batch-all-objects --batch-check="%(objectname)"'
>>       1.55 ± 0.02 times faster than './git.none cat-file --batch-all-objects --batch-check="%(objectname)"'
>>
>> I'm actually surprised it had this much of an impact. But I guess this
>> benchmark really is mostly just memcpy-ing oids into a big array,
>> sorting it, and printing the result. If that array is 12% bigger, we'd
>> expect at least a 12% speedup. But adding in non-linear elements like
>> growing the array (though I guess that is amortized linear) and sorting
>> (which picks up an extra log(n) term) make the difference.
>>
>> It's _kind of_ silly in a sense, since usually you'd ask for other parts
>> of the object, which will make the speed difference relatively smaller.
>> But just dumping a bunch of oids is actually not an unreasonable thing
>> to do. I suspect it got a lot slower with 32-byte GIT_MAX_RAWSZ, too
>> (even when you're using 20-byte sha1), but I don't think there's an easy
>> way to get out of that.
>
> Oh wait, I'm reading it totally wrong. Adding in the extra 4 bytes
> actually made it _faster_ than not having an algo field. Now I'm
> super-confused. I could believe that it gave us some better alignment,
> but the original struct was 32 bytes. 36 seems like a strictly worse
> number.

Apple clang 13 on arm64-apple-darwin20.6.0:

cf0983213c^ (no algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
  Time (mean ± σ):      1.105 s ±  0.017 s    [User: 1.058 s, System: 0.045 s]
  Range (min … max):    1.086 s …  1.126 s    10 runs

cf0983213c (int algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
  Time (mean ± σ):      1.186 s ±  0.015 s    [User: 1.134 s, System: 0.051 s]
  Range (min … max):    1.162 s …  1.201 s    10 runs

cf0983213c (unsigned char algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
  Time (mean ± σ):      1.409 s ±  0.013 s    [User: 1.367 s, System: 0.041 s]
  Range (min … max):    1.397 s …  1.429 s    10 runs


current master (int algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
  Time (mean ± σ):      1.217 s ±  0.018 s    [User: 1.170 s, System: 0.046 s]
  Range (min … max):    1.202 s …  1.246 s    10 runs

current master with the patch (unsigned char algo):
Benchmark #1: ./git -C ../linux cat-file --batch-all-objects --batch-check="%(objectname)"
  Time (mean ± σ):      1.465 s ±  0.012 s    [User: 1.417 s, System: 0.046 s]
  Range (min … max):    1.449 s …  1.480 s    10 runs

OK, I don't like my patch anymore.  Weird, though.

René

  parent reply	other threads:[~2021-10-05 17:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-03  5:51 [PATCH] hash: reduce size of algo member of object ID René Scharfe
2021-10-04  7:31 ` Eric Wong
2021-10-04  8:13 ` Jeff King
2021-10-04  8:20   ` Jeff King
2021-10-05  0:04     ` brian m. carlson
2021-10-05 17:28     ` René Scharfe [this message]
2021-10-04  8:47   ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=269501ed-41ef-9ebf-e56b-1d817a5b9b2e@web.de \
    --to=l.s.r@web.de \
    --cc=andrzej@ahunt.org \
    --cc=carenas@gmail.com \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).