From: Jeff King <peff@peff.net>
To: "René Scharfe" <l.s.r@web.de>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 3/4] ls-files: use strbuf_add_uint()
Date: Tue, 12 May 2026 15:01:05 -0400 [thread overview]
Message-ID: <20260512190105.GE70851@coredump.intra.peff.net> (raw)
In-Reply-To: <20260512115603.80780-4-l.s.r@web.de>
On Tue, May 12, 2026 at 01:56:02PM +0200, René Scharfe wrote:
> Speed up printing of objectsize values by using the specialized function
> strbuf_add_uint() as well as strbuf_insert() for padding instead of the
> general-purpose function strbuf_addf(). Here are the numbers I get when
> listing files in the Linux kernel repo:
>
> Benchmark 1: ./git_main -C ../linux ls-files --format='%(objectsize)'
> Time (mean ± σ): 257.3 ms ± 0.4 ms [User: 197.4 ms, System: 56.7 ms]
> Range (min … max): 256.7 ms … 258.1 ms 11 runs
>
> Benchmark 2: ./git -C ../linux ls-files --format='%(objectsize)'
> Time (mean ± σ): 253.4 ms ± 0.3 ms [User: 193.6 ms, System: 56.6 ms]
> Range (min … max): 253.0 ms … 253.8 ms 11 runs
OK, so here the improvement is less impressive than the previous commit.
And the code is...
> {
> + static const char padding[] = " ";
> + size_t min_len = padded ? strlen(padding) : 0;
> + size_t orig_len = line->len;
> + size_t len;
> +
> if (type == OBJ_BLOB) {
> unsigned long size;
> if (odb_read_object_info(repo->objects, oid, &size) < 0)
> die(_("could not get object info about '%s'"),
> oid_to_hex(oid));
> - if (padded)
> - strbuf_addf(line, "%7"PRIuMAX, (uintmax_t)size);
> - else
> - strbuf_addf(line, "%"PRIuMAX, (uintmax_t)size);
> - } else if (padded) {
> - strbuf_addf(line, "%7s", "-");
> + strbuf_add_uint(line, size);
> } else {
> strbuf_addstr(line, "-");
> }
> + len = line->len - orig_len;
> + if (len < min_len)
> + strbuf_insert(line, orig_len, padding, min_len - len);
> }
...also less nice. We are formatting into the strbuf, and then maybe
memmove()-ing the result to accommodate padding. I wonder how much that
affects the timing. It's extra shuffling, but memmove() etc is often
surprisingly fast.
I wonder how bad it would be to handle the padding ahead of time.
Obviously strbuf_add_uint() knows the size of the result right before it
calls memcpy(), and it could insert the padding then. But adding a
padding length parameter (let alone the space vs "0" decision) to that
function feels kind of gross.
In the earlier patch I raised the notion of pre-computing the output
length. If we had a helper to do that, it would be pretty easy to do:
/* noop if third parameter is negative */
strbuf_pad(line, ' ', 7 - decimal_digits(size));
strbuf_add_uint(line, size);
You could also imagine a world where we had some stateful formatting
system, and you could say:
strbuf_pad_next(line, ' ', 7);
strbuf_add_uint(line, size);
but somebody has to store that state between the calls, and I don't love
the idea of bloating strbuf with it. So probably you have some
"formatter" struct, and it operates on a strbuf. And now we have all of
the OO boilerplate hassles like initializing and tearing down our
formatter object. ;) So probably not worth it for this triviality.
Having it all in one string ("%7d") is nice and concise.
I have often wondered how hard it would be to implement our own
vsnprintf(), and whether we could do better than the libc ones. It would
be nice to be able to add shorthands for common types (instead of the
unreadable PRIuMAX mess), as well as custom ones (e.g., hex oids).
Probably also not worth the headache. ;)
-Peff
next prev parent reply other threads:[~2026-05-12 19:01 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-12 11:55 [PATCH 0/4] strbuf: add and use strbuf_add_uint() René Scharfe
2026-05-12 11:56 ` [PATCH 1/4] strbuf: add strbuf_add_uint() René Scharfe
2026-05-12 18:42 ` Jeff King
2026-05-12 19:32 ` René Scharfe
2026-05-13 16:22 ` Jeff King
2026-05-13 16:47 ` Jeff King
2026-05-13 16:49 ` Jeff King
2026-05-14 11:09 ` René Scharfe
2026-05-14 11:53 ` Junio C Hamano
2026-05-15 3:53 ` Jeff King
2026-05-13 17:46 ` René Scharfe
2026-05-12 11:56 ` [PATCH 2/4] cat-file: use strbuf_add_uint() René Scharfe
2026-05-12 18:46 ` Jeff King
2026-05-12 11:56 ` [PATCH 3/4] ls-files: " René Scharfe
2026-05-12 19:01 ` Jeff King [this message]
2026-05-12 20:44 ` René Scharfe
2026-05-13 16:46 ` Jeff King
2026-05-12 11:56 ` [PATCH 4/4] ls-tree: " René Scharfe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260512190105.GE70851@coredump.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=l.s.r@web.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox