From: Junio C Hamano <gitster@pobox.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: Jeff King <peff@peff.net>,  Git List <git@vger.kernel.org>
Subject: Re: [PATCH] cat-file: speed up default format
Date: Tue, 16 Jun 2026 15:21:24 -0700
Message-ID: <xmqqo6ha15zv.fsf@gitster.g>
In-Reply-To: <47e3cf16-217e-45d4-91e2-5a1abb4ee49e@web.de> ("René Scharfe"'s message of "Wed, 17 Jun 2026 00:14:37 +0200")

René Scharfe <l.s.r@web.de> writes:

> On 6/16/26 1:12 PM, Jeff King wrote:
>> On Mon, Jun 15, 2026 at 11:53:07PM +0200, René Scharfe wrote:
>> 
>>> We can also store literal characters in there.  An opcode plus a
>>> payload char incurs an overhead of 50%, which sounds high, but at least
>>> the default format only has two of them and it's much better than
>>> storing pointer plus size for an overhead of more than 90% in case of a
>>> single char.
>> 
>> True, and it's a size win if the literal portions tend to be small
>> (fewer than 15 bytes). You do lose out on the ability to strbuf_add()
>> them in one go, though. So lots more strbuf_grow() checks, etc. If you
>> really wanted to get fancy, you could follow the opcode with a length
>> represented as a variable-sized integer, followed by the literal bytes.
>
> Or an opcode that shovels a fixed-size string.  Depends on how much
> literal text people include in their formats.
>
>> I'm not sure that Git's formatting code needs to squeeze out quite that
>> much performance, though.
>
> Good point.  Near-native performance would be necessary to make peephole
> optimizations like the special handling of the default format
> unnecessary, which I understand exists to speed up Gitaly [1], but I
> guess most users don't have such high demands.  And there's no point in
> removing a few lines of duplicate code if the necessary machinery adds a
> lot of complexity.  Though the code discussed so far was not too crazy
> IMHO.
>
> [1] https://gitlab.com/gitlab-org/gitaly/-/blob/master/internal/git/gitpipe/catfile_info.go
>
>> OK, so we managed another 1%. But I'm skeptical that this linear opcode
>> technique is where we want to go in the long run, if we're ever going to
>> unify formatters.
>
> Agreed.
>
> René

Obviously I agree with the conclusion, but it was fun to watch cute
experiments from the sidelines.  Thanks for the entertainment ;-)
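
For reference, the "opcode, then variable-sized length, then literal
bytes" idea from the quoted discussion could look roughly like the
sketch below.  It is only an illustration under stated assumptions: the
opcode names, emit_literal() and run_program() are hypothetical and not
taken from the patch, the varint is a generic base-128 encoding and not
necessarily the one Git uses elsewhere, and a plain caller-supplied
buffer stands in for a strbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opcodes; the names are illustrative, not Git's. */
enum fmt_op {
	FMT_OP_END = 0,
	FMT_OP_LITERAL = 1, /* followed by a varint length, then the bytes */
};

/* Write value as a little-endian base-128 varint; returns bytes written. */
static size_t put_varint(unsigned char *out, size_t value)
{
	size_t n = 0;

	while (value >= 0x80) {
		out[n++] = (unsigned char)(value & 0x7f) | 0x80;
		value >>= 7;
	}
	out[n++] = (unsigned char)value;
	return n;
}

/* Decode a varint written by put_varint() and advance *pp past it. */
static size_t get_varint(const unsigned char **pp)
{
	size_t value = 0;
	unsigned shift = 0;
	unsigned char c;

	do {
		c = *(*pp)++;
		value |= (size_t)(c & 0x7f) << shift;
		shift += 7;
	} while (c & 0x80);
	return value;
}

/* Emit one literal instruction into prog; returns bytes written. */
static size_t emit_literal(unsigned char *prog, const char *s, size_t len)
{
	size_t n = 0;

	prog[n++] = FMT_OP_LITERAL;
	n += put_varint(prog + n, len);
	memcpy(prog + n, s, len);
	return n + len;
}

/* Run the program, copying literals into out; returns bytes produced. */
static size_t run_program(const unsigned char *prog, char *out, size_t cap)
{
	size_t used = 0;
	size_t len;

	while (*prog != FMT_OP_END) {
		assert(*prog == FMT_OP_LITERAL);
		prog++;
		len = get_varint(&prog);
		assert(used + len <= cap);
		memcpy(out + used, prog, len); /* one copy per literal run */
		used += len;
		prog += len;
	}
	return used;
}
```

In this sketch a one-character literal takes three bytes (opcode,
length, character) instead of two for the opcode-plus-char form, but a
longer run pays the opcode and length only once and is copied in a
single step.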

Thread overview: 10+ messages
2026-06-14 16:28 [PATCH] cat-file: speed up default format René Scharfe
2026-06-15  7:27 ` Patrick Steinhardt
2026-06-15 16:53 ` Jeff King
2026-06-15 17:06   ` Jeff King
2026-06-15 21:53     ` René Scharfe
2026-06-16 11:12       ` Jeff King
2026-06-16 22:14         ` René Scharfe
2026-06-16 22:21           ` Junio C Hamano [this message]
2026-06-15 21:53   ` René Scharfe
2026-06-16 11:15     ` Jeff King
