From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>
Cc: Git List <git@vger.kernel.org>
Subject: Re: [PATCH] cat-file: speed up default format
Date: Wed, 17 Jun 2026 00:14:37 +0200
Message-ID: <47e3cf16-217e-45d4-91e2-5a1abb4ee49e@web.de>
In-Reply-To: <20260616111237.GA687438@coredump.intra.peff.net>

On 6/16/26 1:12 PM, Jeff King wrote:
> On Mon, Jun 15, 2026 at 11:53:07PM +0200, René Scharfe wrote:
> 
>> We can also store literal characters in there.  An opcode plus a
>> payload char incurs an overhead of 50%, which sounds high, but at least
>> the default format only has two of them and it's much better than
>> storing pointer plus size for an overhead of more than 90% in case of a
>> single char.
> 
> True, and it's a size win if the literal portions tend to be small
> (fewer than 15 bytes). You do lose out on the ability to strbuf_add()
> them in one go, though. So lots more strbuf_grow() checks, etc. If you
> really wanted to get fancy, you could follow the opcode with a length
> represented as a variable-sized integer, followed by the literal bytes.

Or an opcode that shovels a fixed-size string.  Depends on how much
literal text people include in their formats.
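
To make the two encodings concrete, here is a rough sketch of a byte
program that mixes a single-char opcode with a length-prefixed literal
opcode.  The opcode names, the helper functions and the LEB128-style
varint are all made up for illustration; none of this is taken from the
patch or from Git's existing code, and the plain output buffer merely
stands in for a strbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opcodes, invented for this sketch. */
enum fmt_op {
	FMT_END,     /* stop interpreting */
	FMT_CHAR,    /* followed by one literal byte */
	FMT_LITERAL, /* followed by a varint length, then that many bytes */
};

/* Write value as a varint: 7 bits per byte, low bits first. */
static size_t put_varint(unsigned char *out, size_t value)
{
	size_t n = 0;
	while (value >= 0x80) {
		out[n++] = (unsigned char)((value & 0x7f) | 0x80);
		value >>= 7;
	}
	out[n++] = (unsigned char)value;
	return n;
}

/* Read a varint written by put_varint() and advance *pp past it. */
static size_t get_varint(const unsigned char **pp)
{
	size_t value = 0;
	unsigned shift = 0;
	unsigned char c;
	do {
		c = *(*pp)++;
		value |= (size_t)(c & 0x7f) << shift;
		shift += 7;
	} while (c & 0x80);
	return value;
}

/* Encode a literal; single chars get the two-byte form. */
static size_t emit_literal(unsigned char *prog, const char *s, size_t len)
{
	size_t n = 0;
	if (len == 1) {
		prog[n++] = FMT_CHAR;
		prog[n++] = (unsigned char)s[0];
		return n;
	}
	prog[n++] = FMT_LITERAL;
	n += put_varint(prog + n, len);
	memcpy(prog + n, s, len);
	return n + len;
}

/* Interpret prog, writing at most cap bytes to out; returns bytes written. */
static size_t run_format(const unsigned char *prog, char *out, size_t cap)
{
	size_t used = 0;
	for (;;) {
		switch (*prog++) {
		case FMT_END:
			return used;
		case FMT_CHAR:
			assert(used + 1 <= cap);
			out[used++] = (char)*prog++;
			break;
		case FMT_LITERAL: {
			size_t len = get_varint(&prog);
			assert(used + len <= cap);
			/* one bulk copy, analogous to a single strbuf_add() */
			memcpy(out + used, prog, len);
			used += len;
			prog += len;
			break;
		}
		default:
			assert(!"unknown opcode");
			return used;
		}
	}
}
```

With this layout a one-char literal costs two bytes, while a longer one
costs one opcode byte plus the varint (a single byte for lengths below
128) plus the text itself, and it is copied in one go rather than char
by char.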

> I'm not sure that Git's formatting code needs to squeeze out quite that
> much performance, though.

Good point.  Near-native performance would be necessary to make peephole
optimizations like the special handling of the default format
unnecessary, which I understand exists to speed up Gitaly [1], but I
guess most users don't have such high demands.  And there's no point in
removing a few lines of duplicate code if the necessary machinery adds a
lot of complexity.  Though the code discussed so far was not too crazy
IMHO.

[1] https://gitlab.com/gitlab-org/gitaly/-/blob/master/internal/git/gitpipe/catfile_info.go

> OK, so we managed another 1%. But I'm skeptical that this linear opcode
> technique is where we want to go in the long run, if we're ever going to
> unify formatters.

Agreed.

René


