[PATCH] cat-file: speed up default format

Git development
 help / color / mirror / Atom feed

* [PATCH] cat-file: speed up default format
@ 2026-06-14 16:28 René Scharfe
  2026-06-15  7:27 ` Patrick Steinhardt
  2026-06-15 16:53 ` Jeff King
  0 siblings, 2 replies; 6+ messages in thread
From: René Scharfe @ 2026-06-14 16:28 UTC (permalink / raw)
  To: Git List

eb54a3391b (cat-file: skip expanding default format, 2022-03-15) added
special handling for the default batch format.  In the meantime it has
fallen behind the code path for handling arbitrary formats.  Bring it up
to speed by using the new and more efficient strbuf_add_oid_hex() and
strbuf_add_uint() instead of strbuf_addf():

Benchmark 1: ./git_main cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
  Time (mean ± σ):      1.051 s ±  0.003 s    [User: 1.027 s, System: 0.023 s]
  Range (min … max):    1.049 s …  1.058 s    10 runs

Benchmark 2: ./git_main cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
  Time (mean ± σ):      1.012 s ±  0.002 s    [User: 0.988 s, System: 0.023 s]
  Range (min … max):    1.010 s …  1.018 s    10 runs

Benchmark 3: ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
  Time (mean ± σ):     979.0 ms ±   1.1 ms    [User: 954.1 ms, System: 23.2 ms]
  Range (min … max):   977.7 ms … 980.8 ms    10 runs

Summary
  ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' ran
    1.03 ± 0.00 times faster than ./git_main cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
    1.07 ± 0.00 times faster than ./git_main cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 builtin/cat-file.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 2b64f8f733..d7f7895e30 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -461,9 +461,12 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
 static void print_default_format(struct strbuf *scratch, struct expand_data *data,
 				 struct batch_options *opt)
 {
-	strbuf_addf(scratch, "%s %s %"PRIuMAX"%c", oid_to_hex(&data->oid),
-		    type_name(data->type),
-		    (uintmax_t)data->size, opt->output_delim);
+	strbuf_add_oid_hex(scratch, &data->oid);
+	strbuf_addch(scratch, ' ');
+	strbuf_addstr(scratch, type_name(data->type));
+	strbuf_addch(scratch, ' ');
+	strbuf_add_uint(scratch, data->size);
+	strbuf_addch(scratch, opt->output_delim);
 }
 
 static void report_object_status(struct batch_options *opt,
-- 
2.54.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] cat-file: speed up default format
  2026-06-14 16:28 [PATCH] cat-file: speed up default format René Scharfe
@ 2026-06-15  7:27 ` Patrick Steinhardt
  2026-06-15 16:53 ` Jeff King
  1 sibling, 0 replies; 6+ messages in thread
From: Patrick Steinhardt @ 2026-06-15  7:27 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

On Sun, Jun 14, 2026 at 06:28:34PM +0200, René Scharfe wrote:
> eb54a3391b (cat-file: skip expanding default format, 2022-03-15) added
> special handling for the default batch format.  In the meantime it has
> fallen behind the code path for handling arbitrary formats.  Bring it up
> to speed by using the new and more efficient strbuf_add_oid_hex() and
> strbuf_add_uint() instead of strbuf_addf():
> 
> Benchmark 1: ./git_main cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
>   Time (mean ± σ):      1.051 s ±  0.003 s    [User: 1.027 s, System: 0.023 s]
>   Range (min … max):    1.049 s …  1.058 s    10 runs
> 
> Benchmark 2: ./git_main cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
>   Time (mean ± σ):      1.012 s ±  0.002 s    [User: 0.988 s, System: 0.023 s]
>   Range (min … max):    1.010 s …  1.018 s    10 runs
> 
> Benchmark 3: ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
>   Time (mean ± σ):     979.0 ms ±   1.1 ms    [User: 954.1 ms, System: 23.2 ms]
>   Range (min … max):   977.7 ms … 980.8 ms    10 runs
> 
> Summary
>   ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' ran
>     1.03 ± 0.00 times faster than ./git_main cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
>     1.07 ± 0.00 times faster than ./git_main cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'

This almost makes me wonder whether it even makes sense to keep around
the handler for the default format. Is a 3% speedup worth the additional
complexity and the need to keep those sites in sync?

> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 2b64f8f733..d7f7895e30 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -461,9 +461,12 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d
>  static void print_default_format(struct strbuf *scratch, struct expand_data *data,
>  				 struct batch_options *opt)
>  {
> -	strbuf_addf(scratch, "%s %s %"PRIuMAX"%c", oid_to_hex(&data->oid),
> -		    type_name(data->type),
> -		    (uintmax_t)data->size, opt->output_delim);
> +	strbuf_add_oid_hex(scratch, &data->oid);
> +	strbuf_addch(scratch, ' ');
> +	strbuf_addstr(scratch, type_name(data->type));
> +	strbuf_addch(scratch, ' ');
> +	strbuf_add_uint(scratch, data->size);
> +	strbuf_addch(scratch, opt->output_delim);
>  }

The change itself looks obviously good to me though, thanks!

Patrick

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] cat-file: speed up default format
  2026-06-14 16:28 [PATCH] cat-file: speed up default format René Scharfe
  2026-06-15  7:27 ` Patrick Steinhardt
@ 2026-06-15 16:53 ` Jeff King
  2026-06-15 17:06   ` Jeff King
  2026-06-15 21:53   ` René Scharfe
  1 sibling, 2 replies; 6+ messages in thread
From: Jeff King @ 2026-06-15 16:53 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

On Sun, Jun 14, 2026 at 06:28:34PM +0200, René Scharfe wrote:

> eb54a3391b (cat-file: skip expanding default format, 2022-03-15) added
> special handling for the default batch format.  In the meantime it has
> fallen behind the code path for handling arbitrary formats.  Bring it up
> to speed by using the new and more efficient strbuf_add_oid_hex() and
> strbuf_add_uint() instead of strbuf_addf():
> 
> Benchmark 1: ./git_main cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
>   Time (mean ± σ):      1.051 s ±  0.003 s    [User: 1.027 s, System: 0.023 s]
>   Range (min … max):    1.049 s …  1.058 s    10 runs
> 
> Benchmark 2: ./git_main cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
>   Time (mean ± σ):      1.012 s ±  0.002 s    [User: 0.988 s, System: 0.023 s]
>   Range (min … max):    1.010 s …  1.018 s    10 runs
> 
> Benchmark 3: ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
>   Time (mean ± σ):     979.0 ms ±   1.1 ms    [User: 954.1 ms, System: 23.2 ms]
>   Range (min … max):   977.7 ms … 980.8 ms    10 runs

Interesting that it was actually slower than a custom format.  Using the
default format saves the cost of strbuf_expand(), but it was paying the
price of strbuf_addf(), which the custom path no longer used. So the
cost of strbuf_addf() is more than strbuf_expand(), which is not all
that surprising.

Your patch seems obviously right, and everything below is idle
speculation / nerd-sniping.

I have long wondered if we could do better with a separate initial parse
step, which would let us walk the parse tree for each object. In theory
that tree is more compact.

I think it would be a huge improvement for ref-filter, whose parser is
complicated and slow (though its biggest sin is that it allocates a
separate string for each atom before assembling the final output). But
could it help even cat-file, which is using a pretty tight loop over
strbuf_expand()? I sketched out a rough draft below.

It uses per-atom callback functions which is nice and clean, though we
might be able to do even better with a big ugly switch() statement.

The timings I got are below (git.old is master with your patch here
applied, and git.new is my patch on top). It looks like it does make a
custom format ~3% faster. But it's still a shade slower than the default
format. Not sure if it's the extra function calls, or if the static
print_default_format() function gives the compiler more opportunities
for optimization.

  Benchmark 1: ./git.old cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
    Time (mean ± σ):     580.2 ms ±   5.0 ms    [User: 558.7 ms, System: 21.5 ms]
    Range (min … max):   569.9 ms … 585.6 ms    10 runs
  
  Benchmark 2: ./git.new cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
    Time (mean ± σ):     580.4 ms ±   5.1 ms    [User: 562.7 ms, System: 17.8 ms]
    Range (min … max):   571.8 ms … 587.0 ms    10 runs
  
  Benchmark 3: ./git.old cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
    Time (mean ± σ):     618.6 ms ±   5.0 ms    [User: 598.9 ms, System: 19.7 ms]
    Range (min … max):   613.6 ms … 628.3 ms    10 runs
  
  Benchmark 4: ./git.new cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
    Time (mean ± σ):     600.2 ms ±   4.2 ms    [User: 581.2 ms, System: 19.0 ms]
    Range (min … max):   595.2 ms … 608.8 ms    10 runs
  
  Summary
    ./git.old cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' ran
      1.00 ± 0.01 times faster than ./git.new cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
      1.03 ± 0.01 times faster than ./git.new cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
      1.07 ± 0.01 times faster than ./git.old cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'

Patch below, only lightly tested.

---
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index d7f7895e30..9cc7ec7a6f 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -36,6 +36,8 @@ enum batch_mode {
 	BATCH_MODE_QUEUE_AND_DISPATCH,
 };
 
+struct format_item;
+
 struct batch_options {
 	struct list_objects_filter_options objects_filter;
 	int enabled;
@@ -48,6 +50,7 @@ struct batch_options {
 	char input_delim;
 	char output_delim;
 	const char *format;
+	struct format_item *parsed_format;
 };
 
 static const char *force_path;
@@ -294,12 +297,6 @@ struct expand_data {
 	const char *rest;
 	struct object_id delta_base_oid;
 
-	/*
-	 * If mark_query is true, we do not expand anything, but rather
-	 * just mark the object_info with items we wish to query.
-	 */
-	int mark_query;
-
 	/*
 	 * Whether to split the input on whitespace before feeding it to
 	 * get_sha1; this is decided during the mark_query phase based on
@@ -323,65 +320,152 @@ struct expand_data {
 };
 #define EXPAND_DATA_INIT  { .mode = S_IFINVALID }
 
+struct format_item {
+	void (*add)(struct format_item *item, struct strbuf *sb, struct expand_data *data);
+	union {
+		struct {
+			const char *p;
+			size_t len;
+		} literal;
+	} u;
+	/*
+	 * We could make a true tree here with child/next pointers, which would
+	 * be necessary if we had recursive formats, like %(if). But for our
+	 * simple formats for now it is enough to have a linear set of items,
+	 * so we'll just allocate an array and terminate it with a NULL entry.
+	 */
+};
+
+static void objectname_add(struct format_item *item UNUSED,
+			   struct strbuf *sb, struct expand_data *data)
+{
+	strbuf_add_oid_hex(sb, &data->oid);
+}
+
+static void objecttype_add(struct format_item *item UNUSED,
+			   struct strbuf *sb, struct expand_data *data)
+{
+	strbuf_addstr(sb, type_name(data->type));
+}
+
+static void objectsize_add(struct format_item *item UNUSED,
+			   struct strbuf *sb, struct expand_data *data)
+{
+	strbuf_add_uint(sb, data->size);
+}
+
+static void objectsize_disk_add(struct format_item *item UNUSED,
+				struct strbuf *sb, struct expand_data *data)
+{
+	strbuf_add_uint(sb, data->disk_size);
+}
+
+static void rest_add(struct format_item *item UNUSED,
+		     struct strbuf *sb, struct expand_data *data)
+{
+	strbuf_addstr(sb, data->rest);
+}
+
+static void deltabase_add(struct format_item *item UNUSED,
+			  struct strbuf *sb, struct expand_data *data)
+{
+	strbuf_add_oid_hex(sb, &data->delta_base_oid);
+}
+
+static void objectmode_add(struct format_item *item UNUSED,
+			   struct strbuf *sb, struct expand_data *data)
+{
+	if (data->mode != S_IFINVALID)
+		strbuf_addf(sb, "%06o", data->mode);
+}
+
+static void literal_add(struct format_item *item,
+			struct strbuf *sb, struct expand_data *data UNUSED)
+{
+	strbuf_add(sb, item->u.literal.p, item->u.literal.len);
+}
+
 static int is_atom(const char *atom, const char *s, int slen)
 {
 	int alen = strlen(atom);
 	return alen == slen && !memcmp(atom, s, alen);
 }
 
-static int expand_atom(struct strbuf *sb, const char *atom, int len,
-		       struct expand_data *data)
+static int parse_atom(struct format_item *fmt, const char *atom, int len,
+		      struct expand_data *data)
 {
 	if (is_atom("objectname", atom, len)) {
-		if (!data->mark_query)
-			strbuf_add_oid_hex(sb, &data->oid);
+		fmt->add = objectname_add;
 	} else if (is_atom("objecttype", atom, len)) {
-		if (data->mark_query)
-			data->info.typep = &data->type;
-		else
-			strbuf_addstr(sb, type_name(data->type));
+		data->info.typep = &data->type;
+		fmt->add = objecttype_add;
 	} else if (is_atom("objectsize", atom, len)) {
-		if (data->mark_query)
-			data->info.sizep = &data->size;
-		else
-			strbuf_add_uint(sb, data->size);
+		data->info.sizep = &data->size;
+		fmt->add = objectsize_add;
 	} else if (is_atom("objectsize:disk", atom, len)) {
-		if (data->mark_query)
-			data->info.disk_sizep = &data->disk_size;
-		else
-			strbuf_add_uint(sb, data->disk_size);
+		data->info.disk_sizep = &data->disk_size;
+		fmt->add = objectsize_disk_add;
 	} else if (is_atom("rest", atom, len)) {
-		if (data->mark_query)
-			data->split_on_whitespace = 1;
-		else if (data->rest)
-			strbuf_addstr(sb, data->rest);
+		data->split_on_whitespace = 1;
+		fmt->add = rest_add;
 	} else if (is_atom("deltabase", atom, len)) {
-		if (data->mark_query)
-			data->info.delta_base_oid = &data->delta_base_oid;
-		else
-			strbuf_add_oid_hex(sb, &data->delta_base_oid);
+		data->info.delta_base_oid = &data->delta_base_oid;
+		fmt->add = deltabase_add;
 	} else if (is_atom("objectmode", atom, len)) {
-		if (!data->mark_query && !(S_IFINVALID == data->mode))
-			strbuf_addf(sb, "%06o", data->mode);
+		fmt->add = objectmode_add;
 	} else
 		return 0;
 	return 1;
 }
 
-static void expand_format(struct strbuf *sb, const char *start,
-			  struct expand_data *data)
+static struct format_item *parse_format(const char *start,
+					struct expand_data *data)
 {
-	while (strbuf_expand_step(sb, &start)) {
+	struct format_item *ret = NULL;
+	size_t nr = 0, alloc = 0;
+
+	while (1) {
+		const char *percent = strchrnul(start, '%');
 		const char *end;
 
-		if (skip_prefix(start, "%", &start) || *start != '(')
-			strbuf_addch(sb, '%');
-		else if ((end = strchr(start + 1, ')')) &&
-			 expand_atom(sb, start + 1, end - start - 1, data))
+		if (percent != start) {
+			ALLOC_GROW(ret, nr + 1, alloc);
+			ret[nr].add = literal_add;
+			ret[nr].u.literal.p = start;
+			ret[nr].u.literal.len = percent - start;
+			nr++;
+		}
+
+		if (!*percent)
+			break;
+
+		start = percent + 1;
+
+		ALLOC_GROW(ret, nr + 1, alloc);
+		if (skip_prefix(start, "%", &start) || *start != '(') {
+			ret[nr].add = literal_add;
+			ret[nr].u.literal.p = "%";
+			ret[nr].u.literal.len = 1;
+		} else if ((end = strchr(start + 1, ')')) &&
+			   parse_atom(&ret[nr], start + 1, end - start - 1, data)) {
 			start = end + 1;
-		else
+		} else {
 			strbuf_expand_bad_format(start, "cat-file");
+		}
+		nr++;
 	}
+
+	ALLOC_GROW(ret, nr + 1, alloc);
+	ret[nr].add = NULL;
+
+	return ret;
+}
+
+static void expand_format(struct strbuf *sb, struct format_item *fmt,
+			  struct expand_data *data)
+{
+	for (; fmt->add; fmt++)
+		fmt->add(fmt, sb, data);
 }
 
 static void batch_write(struct batch_options *opt, const void *data, int len)
@@ -568,7 +652,7 @@ static void batch_object_write(const char *obj_name,
 	if (!opt->format) {
 		print_default_format(scratch, data, opt);
 	} else {
-		expand_format(scratch, opt->format, data);
+		expand_format(scratch, opt->parsed_format, data);
 		strbuf_addch(scratch, opt->output_delim);
 	}
 
@@ -936,17 +1020,9 @@ static int batch_objects(struct batch_options *opt)
 	int save_warning;
 	int retval = 0;
 
-	/*
-	 * Expand once with our special mark_query flag, which will prime the
-	 * object_info to be handed to odb_read_object_info_extended for each
-	 * object.
-	 */
-	data.mark_query = 1;
-	expand_format(&output,
-		      opt->format ? opt->format : DEFAULT_FORMAT,
-		      &data);
-	data.mark_query = 0;
-	strbuf_release(&output);
+	opt->parsed_format = parse_format(opt->format ?
+					  opt->format : DEFAULT_FORMAT,
+					  &data);
 	if (opt->transform_mode)
 		data.split_on_whitespace = 1;
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] cat-file: speed up default format
  2026-06-15 16:53 ` Jeff King
@ 2026-06-15 17:06   ` Jeff King
  2026-06-15 21:53     ` René Scharfe
  2026-06-15 21:53   ` René Scharfe
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff King @ 2026-06-15 17:06 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List

On Mon, Jun 15, 2026 at 12:53:26PM -0400, Jeff King wrote:

> It uses per-atom callback functions which is nice and clean, though we
> might be able to do even better with a big ugly switch() statement.

Being the curious sort, I swapped it out for a big switch statement.
Patch below, but it does not seem to be any faster.

So the bottom line is I think you could gain a little bit of performance
by pre-parsing (versus strbuf_expand() on each object). Around 3% for
something that actually looks at the objects, though more like 15% if
for just dumping the objectnames.

IMHO that is probably not worth it for a custom parsing system just for
cat-file.  But if we were to finally unify ref-filter and cat-file (and
even --pretty=format) then it would probably worth doing this kind of
pre-parsing.

---
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 9cc7ec7a6f..da6ecc61f9 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -321,7 +321,17 @@ struct expand_data {
 #define EXPAND_DATA_INIT  { .mode = S_IFINVALID }
 
 struct format_item {
-	void (*add)(struct format_item *item, struct strbuf *sb, struct expand_data *data);
+	enum {
+		FORMAT_TYPE_END = 0,
+		FORMAT_TYPE_LITERAL,
+		FORMAT_TYPE_OBJECTNAME,
+		FORMAT_TYPE_OBJECTTYPE,
+		FORMAT_TYPE_OBJECTSIZE,
+		FORMAT_TYPE_OBJECTSIZE_DISK,
+		FORMAT_TYPE_REST,
+		FORMAT_TYPE_DELTABASE,
+		FORMAT_TYPE_OBJECTMODE,
+	} type;
 	union {
 		struct {
 			const char *p;
@@ -336,55 +346,6 @@ struct format_item {
 	 */
 };
 
-static void objectname_add(struct format_item *item UNUSED,
-			   struct strbuf *sb, struct expand_data *data)
-{
-	strbuf_add_oid_hex(sb, &data->oid);
-}
-
-static void objecttype_add(struct format_item *item UNUSED,
-			   struct strbuf *sb, struct expand_data *data)
-{
-	strbuf_addstr(sb, type_name(data->type));
-}
-
-static void objectsize_add(struct format_item *item UNUSED,
-			   struct strbuf *sb, struct expand_data *data)
-{
-	strbuf_add_uint(sb, data->size);
-}
-
-static void objectsize_disk_add(struct format_item *item UNUSED,
-				struct strbuf *sb, struct expand_data *data)
-{
-	strbuf_add_uint(sb, data->disk_size);
-}
-
-static void rest_add(struct format_item *item UNUSED,
-		     struct strbuf *sb, struct expand_data *data)
-{
-	strbuf_addstr(sb, data->rest);
-}
-
-static void deltabase_add(struct format_item *item UNUSED,
-			  struct strbuf *sb, struct expand_data *data)
-{
-	strbuf_add_oid_hex(sb, &data->delta_base_oid);
-}
-
-static void objectmode_add(struct format_item *item UNUSED,
-			   struct strbuf *sb, struct expand_data *data)
-{
-	if (data->mode != S_IFINVALID)
-		strbuf_addf(sb, "%06o", data->mode);
-}
-
-static void literal_add(struct format_item *item,
-			struct strbuf *sb, struct expand_data *data UNUSED)
-{
-	strbuf_add(sb, item->u.literal.p, item->u.literal.len);
-}
-
 static int is_atom(const char *atom, const char *s, int slen)
 {
 	int alen = strlen(atom);
@@ -395,24 +356,24 @@ static int parse_atom(struct format_item *fmt, const char *atom, int len,
 		      struct expand_data *data)
 {
 	if (is_atom("objectname", atom, len)) {
-		fmt->add = objectname_add;
+		fmt->type = FORMAT_TYPE_OBJECTNAME;
 	} else if (is_atom("objecttype", atom, len)) {
 		data->info.typep = &data->type;
-		fmt->add = objecttype_add;
+		fmt->type = FORMAT_TYPE_OBJECTTYPE;
 	} else if (is_atom("objectsize", atom, len)) {
 		data->info.sizep = &data->size;
-		fmt->add = objectsize_add;
+		fmt->type = FORMAT_TYPE_OBJECTSIZE;
 	} else if (is_atom("objectsize:disk", atom, len)) {
 		data->info.disk_sizep = &data->disk_size;
-		fmt->add = objectsize_disk_add;
+		fmt->type = FORMAT_TYPE_OBJECTSIZE_DISK;
 	} else if (is_atom("rest", atom, len)) {
 		data->split_on_whitespace = 1;
-		fmt->add = rest_add;
+		fmt->type = FORMAT_TYPE_REST;
 	} else if (is_atom("deltabase", atom, len)) {
 		data->info.delta_base_oid = &data->delta_base_oid;
-		fmt->add = deltabase_add;
+		fmt->type = FORMAT_TYPE_DELTABASE;
 	} else if (is_atom("objectmode", atom, len)) {
-		fmt->add = objectmode_add;
+		fmt->type = FORMAT_TYPE_OBJECTMODE;
 	} else
 		return 0;
 	return 1;
@@ -430,7 +391,7 @@ static struct format_item *parse_format(const char *start,
 
 		if (percent != start) {
 			ALLOC_GROW(ret, nr + 1, alloc);
-			ret[nr].add = literal_add;
+			ret[nr].type = FORMAT_TYPE_LITERAL;
 			ret[nr].u.literal.p = start;
 			ret[nr].u.literal.len = percent - start;
 			nr++;
@@ -443,7 +404,7 @@ static struct format_item *parse_format(const char *start,
 
 		ALLOC_GROW(ret, nr + 1, alloc);
 		if (skip_prefix(start, "%", &start) || *start != '(') {
-			ret[nr].add = literal_add;
+			ret[nr].type = FORMAT_TYPE_LITERAL;
 			ret[nr].u.literal.p = "%";
 			ret[nr].u.literal.len = 1;
 		} else if ((end = strchr(start + 1, ')')) &&
@@ -456,16 +417,44 @@ static struct format_item *parse_format(const char *start,
 	}
 
 	ALLOC_GROW(ret, nr + 1, alloc);
-	ret[nr].add = NULL;
+	ret[nr].type = FORMAT_TYPE_END;
 
 	return ret;
 }
 
 static void expand_format(struct strbuf *sb, struct format_item *fmt,
 			  struct expand_data *data)
 {
-	for (; fmt->add; fmt++)
-		fmt->add(fmt, sb, data);
+	for (; fmt->type; fmt++)
+		switch (fmt->type) {
+		case FORMAT_TYPE_END:
+			BUG("we should have already left the loop!");
+			break;
+		case FORMAT_TYPE_OBJECTNAME:
+			strbuf_add_oid_hex(sb, &data->oid);
+			break;
+		case FORMAT_TYPE_OBJECTTYPE:
+			strbuf_addstr(sb, type_name(data->type));
+			break;
+		case FORMAT_TYPE_OBJECTSIZE:
+			strbuf_add_uint(sb, data->size);
+			break;
+		case FORMAT_TYPE_OBJECTSIZE_DISK:
+			strbuf_add_uint(sb, data->disk_size);
+			break;
+		case FORMAT_TYPE_REST:
+			strbuf_addstr(sb, data->rest);
+			break;
+		case FORMAT_TYPE_DELTABASE:
+			strbuf_add_oid_hex(sb, &data->delta_base_oid);
+			break;
+		case FORMAT_TYPE_OBJECTMODE:
+			if (data->mode != S_IFINVALID)
+				strbuf_addf(sb, "%06o", data->mode);
+			break;
+		case FORMAT_TYPE_LITERAL:
+			strbuf_add(sb, fmt->u.literal.p, fmt->u.literal.len);
+		}
 }
 
 static void batch_write(struct batch_options *opt, const void *data, int len)

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] cat-file: speed up default format
  2026-06-15 17:06   ` Jeff King
@ 2026-06-15 21:53     ` René Scharfe
  0 siblings, 0 replies; 6+ messages in thread
From: René Scharfe @ 2026-06-15 21:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On 6/15/26 7:06 PM, Jeff King wrote:
> On Mon, Jun 15, 2026 at 12:53:26PM -0400, Jeff King wrote:
> 
>> It uses per-atom callback functions which is nice and clean, though we
>> might be able to do even better with a big ugly switch() statement.
> 
> Being the curious sort, I swapped it out for a big switch statement.
> Patch below, but it does not seem to be any faster.
> 
> So the bottom line is I think you could gain a little bit of performance
> by pre-parsing (versus strbuf_expand() on each object). Around 3% for
> something that actually looks at the objects, though more like 15% if
> for just dumping the objectnames.
> 
> IMHO that is probably not worth it for a custom parsing system just for
> cat-file.  But if we were to finally unify ref-filter and cat-file (and
> even --pretty=format) then it would probably worth doing this kind of
> pre-parsing.
It could be worth it for cat-file alone if we find the right balance, as
it already does do a separate parsing step, but that is awkward with its
mark_query checks all over the place and remembers only object property
requirements and no other format string details.

Making the opcodes small should be beneficial.  We need only a handful
of them, so a byte each should suffice.  We can use a strbuf for that.

We can also store literal characters in there.  An opcode plus with a
payload char incurs an overhead of 50%, which sounds high, but at least
the default format only has two of them and it's much better than
storing pointer plus size for an overhead of more than 90% in case of a
single char.

That gets us closer to native speed, at least on an Apple M1:

Benchmark 1: ./git_fp cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
  Time (mean ± σ):     992.7 ms ±   3.2 ms    [User: 967.5 ms, System: 23.8 ms]
  Range (min … max):   990.1 ms … 1000.7 ms    10 runs

Benchmark 2: ./git_switch cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
  Time (mean ± σ):     991.8 ms ±   1.6 ms    [User: 967.0 ms, System: 23.3 ms]
  Range (min … max):   989.3 ms … 994.4 ms    10 runs

Benchmark 3: ./git cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
  Time (mean ± σ):     985.8 ms ±   2.9 ms    [User: 960.5 ms, System: 23.6 ms]
  Range (min … max):   982.9 ms … 993.0 ms    10 runs

Benchmark 4: ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
  Time (mean ± σ):     982.1 ms ±   3.2 ms    [User: 956.7 ms, System: 23.6 ms]
  Range (min … max):   979.2 ms … 989.2 ms    10 runs

Summary
  ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' ran
    1.00 ± 0.00 times faster than ./git cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
    1.01 ± 0.00 times faster than ./git_switch cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
    1.01 ± 0.00 times faster than ./git_fp cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'


A Ryzen laptop gives me noisy numbers that seem to suggest your
switch-based code already won, but the more compact representation is at
least not worse:

Benchmark 1: ./git_fp cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
  Time (mean ± σ):     397.5 ms ±   8.0 ms    [User: 326.9 ms, System: 39.4 ms]
  Range (min … max):   388.1 ms … 410.0 ms    10 runs

Benchmark 2: ./git_switch cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
  Time (mean ± σ):     388.2 ms ±   4.2 ms    [User: 318.2 ms, System: 39.2 ms]
  Range (min … max):   382.8 ms … 395.7 ms    10 runs

Benchmark 3: ./git cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
  Time (mean ± σ):     385.5 ms ±   5.7 ms    [User: 311.2 ms, System: 43.2 ms]
  Range (min … max):   377.0 ms … 392.9 ms    10 runs

Benchmark 4: ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'
  Time (mean ± σ):     397.5 ms ±   8.4 ms    [User: 321.9 ms, System: 45.2 ms]
  Range (min … max):   382.1 ms … 406.5 ms    10 runs

Summary
  ./git cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)' ran
    1.01 ± 0.02 times faster than ./git_switch cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
    1.03 ± 0.03 times faster than ./git_fp cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)'
    1.03 ± 0.03 times faster than ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)'

René


diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 0d1998784c..5667a13e93 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -36,8 +36,6 @@ enum batch_mode {
 	BATCH_MODE_QUEUE_AND_DISPATCH,
 };
 
-struct format_item;
-
 struct batch_options {
 	struct list_objects_filter_options objects_filter;
 	int enabled;
@@ -50,7 +48,7 @@ struct batch_options {
 	char input_delim;
 	char output_delim;
 	const char *format;
-	struct format_item *parsed_format;
+	struct strbuf parsed_format;
 };
 
 static const char *force_path;
@@ -320,30 +318,16 @@ struct expand_data {
 };
 #define EXPAND_DATA_INIT  { .mode = S_IFINVALID }
 
-struct format_item {
-	enum {
-		FORMAT_TYPE_END = 0,
-		FORMAT_TYPE_LITERAL,
-		FORMAT_TYPE_OBJECTNAME,
-		FORMAT_TYPE_OBJECTTYPE,
-		FORMAT_TYPE_OBJECTSIZE,
-		FORMAT_TYPE_OBJECTSIZE_DISK,
-		FORMAT_TYPE_REST,
-		FORMAT_TYPE_DELTABASE,
-		FORMAT_TYPE_OBJECTMODE,
-	} type;
-	union {
-		struct {
-			const char *p;
-			size_t len;
-		} literal;
-	} u;
-	/*
-	 * We could make a true tree here with child/next pointers, which would
-	 * be necessary if we had recursive formats, like %(if). But for our
-	 * simple formats for now it is enough to have a linear set of items,
-	 * so we'll just allocate an array and terminate it with a NULL entry.
-	 */
+
+enum item_type {
+	FORMAT_TYPE_LITERAL,
+	FORMAT_TYPE_OBJECTNAME,
+	FORMAT_TYPE_OBJECTTYPE,
+	FORMAT_TYPE_OBJECTSIZE,
+	FORMAT_TYPE_OBJECTSIZE_DISK,
+	FORMAT_TYPE_REST,
+	FORMAT_TYPE_DELTABASE,
+	FORMAT_TYPE_OBJECTMODE,
 };
 
 static int is_atom(const char *atom, const char *s, int slen)
@@ -352,84 +336,66 @@ static int is_atom(const char *atom, const char *s, int slen)
 	return alen == slen && !memcmp(atom, s, alen);
 }
 
-static int parse_atom(struct format_item *fmt, const char *atom, int len,
+static int parse_atom(struct strbuf *parsed_format, const char *atom, int len,
 		      struct expand_data *data)
 {
 	if (is_atom("objectname", atom, len)) {
-		fmt->type = FORMAT_TYPE_OBJECTNAME;
+		strbuf_addch(parsed_format, FORMAT_TYPE_OBJECTNAME);
 	} else if (is_atom("objecttype", atom, len)) {
 		data->info.typep = &data->type;
-		fmt->type = FORMAT_TYPE_OBJECTTYPE;
+		strbuf_addch(parsed_format, FORMAT_TYPE_OBJECTTYPE);
 	} else if (is_atom("objectsize", atom, len)) {
 		data->info.sizep = &data->size;
-		fmt->type = FORMAT_TYPE_OBJECTSIZE;
+		strbuf_addch(parsed_format, FORMAT_TYPE_OBJECTSIZE);
 	} else if (is_atom("objectsize:disk", atom, len)) {
 		data->info.disk_sizep = &data->disk_size;
-		fmt->type = FORMAT_TYPE_OBJECTSIZE_DISK;
+		strbuf_addch(parsed_format, FORMAT_TYPE_OBJECTSIZE_DISK);
 	} else if (is_atom("rest", atom, len)) {
 		data->split_on_whitespace = 1;
-		fmt->type = FORMAT_TYPE_REST;
+		strbuf_addch(parsed_format, FORMAT_TYPE_REST);
 	} else if (is_atom("deltabase", atom, len)) {
 		data->info.delta_base_oid = &data->delta_base_oid;
-		fmt->type = FORMAT_TYPE_DELTABASE;
+		strbuf_addch(parsed_format, FORMAT_TYPE_DELTABASE);
 	} else if (is_atom("objectmode", atom, len)) {
-		fmt->type = FORMAT_TYPE_OBJECTMODE;
+		strbuf_addch(parsed_format, FORMAT_TYPE_OBJECTMODE);
 	} else
 		return 0;
 	return 1;
 }
 
-static struct format_item *parse_format(const char *start,
-					struct expand_data *data)
+static void parse_format(struct strbuf *parsed_format,
+			 const char *start, struct expand_data *data)
 {
-	struct format_item *ret = NULL;
-	size_t nr = 0, alloc = 0;
-
 	while (1) {
-		const char *percent = strchrnul(start, '%');
 		const char *end;
 
-		if (percent != start) {
-			ALLOC_GROW(ret, nr + 1, alloc);
-			ret[nr].type = FORMAT_TYPE_LITERAL;
-			ret[nr].u.literal.p = start;
-			ret[nr].u.literal.len = percent - start;
-			nr++;
+		while (*start && *start != '%') {
+			strbuf_addch(parsed_format, FORMAT_TYPE_LITERAL);
+			strbuf_addch(parsed_format, *start++);
 		}
 
-		if (!*percent)
+		if (!*start)
 			break;
 
-		start = percent + 1;
+		start++;
 
-		ALLOC_GROW(ret, nr + 1, alloc);
 		if (skip_prefix(start, "%", &start) || *start != '(') {
-			ret[nr].type = FORMAT_TYPE_LITERAL;
-			ret[nr].u.literal.p = "%";
-			ret[nr].u.literal.len = 1;
+			strbuf_addch(parsed_format, FORMAT_TYPE_LITERAL);
+			strbuf_addch(parsed_format, '%');
 		} else if ((end = strchr(start + 1, ')')) &&
-			   parse_atom(&ret[nr], start + 1, end - start - 1, data)) {
+			   parse_atom(parsed_format, start + 1, end - start - 1, data)) {
 			start = end + 1;
 		} else {
 			strbuf_expand_bad_format(start, "cat-file");
 		}
-		nr++;
 	}
-
-	ALLOC_GROW(ret, nr + 1, alloc);
-	ret[nr].type = FORMAT_TYPE_END;
-
-	return ret;
 }
 
-static void expand_format(struct strbuf *sb, struct format_item *fmt,
+static void expand_format(struct strbuf *sb, struct strbuf *parsed_format,
 			  struct expand_data *data)
 {
-	for (; fmt->type; fmt++)
-		switch (fmt->type) {
-		case FORMAT_TYPE_END:
-			BUG("we should have already left the loop!");
-			break;
+	for (size_t i = 0; i < parsed_format->len; i++)
+		switch (parsed_format->buf[i]) {
 		case FORMAT_TYPE_OBJECTNAME:
 			strbuf_add_oid_hex(sb, &data->oid);
 			break;
@@ -453,7 +419,7 @@ static void expand_format(struct strbuf *sb, struct format_item *fmt,
 				strbuf_addf(sb, "%06o", data->mode);
 			break;
 		case FORMAT_TYPE_LITERAL:
-			strbuf_add(sb, fmt->u.literal.p, fmt->u.literal.len);
+			strbuf_addch(sb, parsed_format->buf[++i]);
 		}
 }
 
@@ -641,7 +607,7 @@ static void batch_object_write(const char *obj_name,
 	if (!opt->format) {
 		print_default_format(scratch, data, opt);
 	} else {
-		expand_format(scratch, opt->parsed_format, data);
+		expand_format(scratch, &opt->parsed_format, data);
 		strbuf_addch(scratch, opt->output_delim);
 	}
 
@@ -1010,9 +976,8 @@ static int batch_objects(struct batch_options *opt)
 	int save_warning;
 	int retval = 0;
 
-	opt->parsed_format = parse_format(opt->format ?
-					  opt->format : DEFAULT_FORMAT,
-					  &data);
+	parse_format(&opt->parsed_format,
+		     opt->format ? opt->format : DEFAULT_FORMAT, &data);
 	if (opt->transform_mode)
 		data.split_on_whitespace = 1;
 
@@ -1152,6 +1117,7 @@ int cmd_cat_file(int argc,
 	const char *exp_type = NULL, *obj_name = NULL;
 	struct batch_options batch = {
 		.objects_filter = LIST_OBJECTS_FILTER_INIT,
+		.parsed_format = STRBUF_INIT,
 	};
 	int unknown_type = 0;
 	int input_nul_terminated = 0;


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] cat-file: speed up default format
  2026-06-15 16:53 ` Jeff King
  2026-06-15 17:06   ` Jeff King
@ 2026-06-15 21:53   ` René Scharfe
  1 sibling, 0 replies; 6+ messages in thread
From: René Scharfe @ 2026-06-15 21:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On 6/15/26 6:53 PM, Jeff King wrote:
> 
> +static void rest_add(struct format_item *item UNUSED,
> +		     struct strbuf *sb, struct expand_data *data)
> +{
> +	strbuf_addstr(sb, data->rest);
> +}

>  	} else if (is_atom("rest", atom, len)) {
> -		if (data->mark_query)
> -			data->split_on_whitespace = 1;
> -		else if (data->rest)

This removes support for rest being NULL, breaking t1006.381.

> -			strbuf_addstr(sb, data->rest);
> +		data->split_on_whitespace = 1;
> +		fmt->add = rest_add;
René


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-06-15 21:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-14 16:28 [PATCH] cat-file: speed up default format René Scharfe
2026-06-15  7:27 ` Patrick Steinhardt
2026-06-15 16:53 ` Jeff King
2026-06-15 17:06   ` Jeff King
2026-06-15 21:53     ` René Scharfe
2026-06-15 21:53   ` René Scharfe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox