From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from cloud.peff.net (cloud.peff.net [217.216.95.84]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A4FE740FD86 for ; Mon, 15 Jun 2026 16:53:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.216.95.84 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781542418; cv=none; b=sHy3ne84Ts7RUFMLhLnDur3j2kSkVuG5vHIU3xSTI9FeoAlpKYwsWlJBFUbXV4FKbriNcBnKz36yjrzLMuJ76uKt8dbvTMIqSsps7dZN1EngaryUMBD+xMk2STWpmgDkeRmIjM2kk9Y+JvF1Wduh0ozmB+UCE8QMOYJEvrTMRjk= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781542418; c=relaxed/simple; bh=nufY0y73GDdLn6YV3u/l7mWQl3IzD7gAu1AWMgdIkzY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=EJO5KUDQHPHSar/zlkb6T6nASIC28pYaMQHw088B3rWjCN/kh63b+glZsGGlJqSzJwBlT0vyq6mpKWvQfKyrkU5Nh/GtaBaxHwuQjrR2AsWXCXtErY2cNAvX79mmy8rgKccPLIlPxaysZX13cadkLVd4l9MUY+rC+NNAorR0DBU= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=JztOCdsb; arc=none smtp.client-ip=217.216.95.84 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="JztOCdsb" Received: (qmail 144180 invoked by uid 106); 15 Jun 2026 16:53:27 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:content-transfer-encoding:in-reply-to; s=20240930; bh=nufY0y73GDdLn6YV3u/l7mWQl3IzD7gAu1AWMgdIkzY=; b=JztOCdsbLq/EO3uroOj7brB7koA/GxJ+9iUBhcmk/uHpC8QuiG9x52EZ6UN07BxW2eksEBPpn6PJCqq8xFXRKUgcZtZ+uUMMr9NGTUMh3sP4stWC4jMbgxSyVZjrwa/OUBkzrz/SZtV1tM6/b1EPjc8h8NVrjvqa1QKkDGIFlMT3q3SGU0LoG1mpzGRpfJ/ERWCQqhSGYB7SK8kdx61zyGdZoxdjseM45rIoFbXyiLr+Rl8Ki5RIG87pfukvxZZg16Q+1seyKuS4iI+HOq8y2dWb649KlOjTtIe03uv1zK993RGixhi7ZmihWj+17tbUU0I4vHOld6D7iD6Sf4f63Q== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Mon, 15 Jun 2026 16:53:27 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 394227 invoked by uid 111); 15 Jun 2026 16:53:28 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Mon, 15 Jun 2026 12:53:28 -0400 Authentication-Results: peff.net; auth=none Date: Mon, 15 Jun 2026 12:53:26 -0400 From: Jeff King To: =?utf-8?B?UmVuw6k=?= Scharfe Cc: Git List Subject: Re: [PATCH] cat-file: speed up default format Message-ID: <20260615165326.GA91269@coredump.intra.peff.net> References: <5a7ed929-6fe0-496c-83bd-65dee57c2241@web.de> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <5a7ed929-6fe0-496c-83bd-65dee57c2241@web.de> On Sun, Jun 14, 2026 at 06:28:34PM +0200, René Scharfe wrote: > eb54a3391b (cat-file: skip expanding default format, 2022-03-15) added > special handling for the default batch format. In the meantime it has > fallen behind the code path for handling arbitrary formats. Bring it up > to speed by using the new and more efficient strbuf_add_oid_hex() and > strbuf_add_uint() instead of strbuf_addf(): > > Benchmark 1: ./git_main cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' > Time (mean ± σ): 1.051 s ± 0.003 s [User: 1.027 s, System: 0.023 s] > Range (min … max): 1.049 s … 1.058 s 10 runs > > Benchmark 2: ./git_main cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)' > Time (mean ± σ): 1.012 s ± 0.002 s [User: 0.988 s, System: 0.023 s] > Range (min … max): 1.010 s … 1.018 s 10 runs > > Benchmark 3: ./git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' > Time (mean ± σ): 979.0 ms ± 1.1 ms [User: 954.1 ms, System: 23.2 ms] > Range (min … max): 977.7 ms … 980.8 ms 10 runs Interesting that it was actually slower than a custom format. Using the default format saves the cost of strbuf_expand(), but it was paying the price of strbuf_addf(), which the custom path no longer used. So the cost of strbuf_addf() is more than strbuf_expand(), which is not all that surprising. Your patch seems obviously right, and everything below is idle speculation / nerd-sniping. I have long wondered if we could do better with a separate initial parse step, which would let us walk the parse tree for each object. In theory that tree is more compact. I think it would be a huge improvement for ref-filter, whose parser is complicated and slow (though its biggest sin is that it allocates a separate string for each atom before assembling the final output). But could it help even cat-file, which is using a pretty tight loop over strbuf_expand()? I sketched out a rough draft below. It uses per-atom callback functions which is nice and clean, though we might be able to do even better with a big ugly switch() statement. The timings I got are below (git.old is master with your patch here applied, and git.new is my patch on top). It looks like it does make a custom format ~3% faster. But it's still a shade slower than the default format. Not sure if it's the extra function calls, or if the static print_default_format() function gives the compiler more opportunities for optimization. Benchmark 1: ./git.old cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' Time (mean ± σ): 580.2 ms ± 5.0 ms [User: 558.7 ms, System: 21.5 ms] Range (min … max): 569.9 ms … 585.6 ms 10 runs Benchmark 2: ./git.new cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' Time (mean ± σ): 580.4 ms ± 5.1 ms [User: 562.7 ms, System: 17.8 ms] Range (min … max): 571.8 ms … 587.0 ms 10 runs Benchmark 3: ./git.old cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)' Time (mean ± σ): 618.6 ms ± 5.0 ms [User: 598.9 ms, System: 19.7 ms] Range (min … max): 613.6 ms … 628.3 ms 10 runs Benchmark 4: ./git.new cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)' Time (mean ± σ): 600.2 ms ± 4.2 ms [User: 581.2 ms, System: 19.0 ms] Range (min … max): 595.2 ms … 608.8 ms 10 runs Summary ./git.old cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' ran 1.00 ± 0.01 times faster than ./git.new cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype) %(objectsize)' 1.03 ± 0.01 times faster than ./git.new cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)' 1.07 ± 0.01 times faster than ./git.old cat-file --batch-all-objects --batch-check='%(objectname)-%(objecttype)-%(objectsize)' Patch below, only lightly tested. --- diff --git a/builtin/cat-file.c b/builtin/cat-file.c index d7f7895e30..9cc7ec7a6f 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -36,6 +36,8 @@ enum batch_mode { BATCH_MODE_QUEUE_AND_DISPATCH, }; +struct format_item; + struct batch_options { struct list_objects_filter_options objects_filter; int enabled; @@ -48,6 +50,7 @@ struct batch_options { char input_delim; char output_delim; const char *format; + struct format_item *parsed_format; }; static const char *force_path; @@ -294,12 +297,6 @@ struct expand_data { const char *rest; struct object_id delta_base_oid; - /* - * If mark_query is true, we do not expand anything, but rather - * just mark the object_info with items we wish to query. - */ - int mark_query; - /* * Whether to split the input on whitespace before feeding it to * get_sha1; this is decided during the mark_query phase based on @@ -323,65 +320,152 @@ struct expand_data { }; #define EXPAND_DATA_INIT { .mode = S_IFINVALID } +struct format_item { + void (*add)(struct format_item *item, struct strbuf *sb, struct expand_data *data); + union { + struct { + const char *p; + size_t len; + } literal; + } u; + /* + * We could make a true tree here with child/next pointers, which would + * be necessary if we had recursive formats, like %(if). But for our + * simple formats for now it is enough to have a linear set of items, + * so we'll just allocate an array and terminate it with a NULL entry. + */ +}; + +static void objectname_add(struct format_item *item UNUSED, + struct strbuf *sb, struct expand_data *data) +{ + strbuf_add_oid_hex(sb, &data->oid); +} + +static void objecttype_add(struct format_item *item UNUSED, + struct strbuf *sb, struct expand_data *data) +{ + strbuf_addstr(sb, type_name(data->type)); +} + +static void objectsize_add(struct format_item *item UNUSED, + struct strbuf *sb, struct expand_data *data) +{ + strbuf_add_uint(sb, data->size); +} + +static void objectsize_disk_add(struct format_item *item UNUSED, + struct strbuf *sb, struct expand_data *data) +{ + strbuf_add_uint(sb, data->disk_size); +} + +static void rest_add(struct format_item *item UNUSED, + struct strbuf *sb, struct expand_data *data) +{ + strbuf_addstr(sb, data->rest); +} + +static void deltabase_add(struct format_item *item UNUSED, + struct strbuf *sb, struct expand_data *data) +{ + strbuf_add_oid_hex(sb, &data->delta_base_oid); +} + +static void objectmode_add(struct format_item *item UNUSED, + struct strbuf *sb, struct expand_data *data) +{ + if (data->mode != S_IFINVALID) + strbuf_addf(sb, "%06o", data->mode); +} + +static void literal_add(struct format_item *item, + struct strbuf *sb, struct expand_data *data UNUSED) +{ + strbuf_add(sb, item->u.literal.p, item->u.literal.len); +} + static int is_atom(const char *atom, const char *s, int slen) { int alen = strlen(atom); return alen == slen && !memcmp(atom, s, alen); } -static int expand_atom(struct strbuf *sb, const char *atom, int len, - struct expand_data *data) +static int parse_atom(struct format_item *fmt, const char *atom, int len, + struct expand_data *data) { if (is_atom("objectname", atom, len)) { - if (!data->mark_query) - strbuf_add_oid_hex(sb, &data->oid); + fmt->add = objectname_add; } else if (is_atom("objecttype", atom, len)) { - if (data->mark_query) - data->info.typep = &data->type; - else - strbuf_addstr(sb, type_name(data->type)); + data->info.typep = &data->type; + fmt->add = objecttype_add; } else if (is_atom("objectsize", atom, len)) { - if (data->mark_query) - data->info.sizep = &data->size; - else - strbuf_add_uint(sb, data->size); + data->info.sizep = &data->size; + fmt->add = objectsize_add; } else if (is_atom("objectsize:disk", atom, len)) { - if (data->mark_query) - data->info.disk_sizep = &data->disk_size; - else - strbuf_add_uint(sb, data->disk_size); + data->info.disk_sizep = &data->disk_size; + fmt->add = objectsize_disk_add; } else if (is_atom("rest", atom, len)) { - if (data->mark_query) - data->split_on_whitespace = 1; - else if (data->rest) - strbuf_addstr(sb, data->rest); + data->split_on_whitespace = 1; + fmt->add = rest_add; } else if (is_atom("deltabase", atom, len)) { - if (data->mark_query) - data->info.delta_base_oid = &data->delta_base_oid; - else - strbuf_add_oid_hex(sb, &data->delta_base_oid); + data->info.delta_base_oid = &data->delta_base_oid; + fmt->add = deltabase_add; } else if (is_atom("objectmode", atom, len)) { - if (!data->mark_query && !(S_IFINVALID == data->mode)) - strbuf_addf(sb, "%06o", data->mode); + fmt->add = objectmode_add; } else return 0; return 1; } -static void expand_format(struct strbuf *sb, const char *start, - struct expand_data *data) +static struct format_item *parse_format(const char *start, + struct expand_data *data) { - while (strbuf_expand_step(sb, &start)) { + struct format_item *ret = NULL; + size_t nr = 0, alloc = 0; + + while (1) { + const char *percent = strchrnul(start, '%'); const char *end; - if (skip_prefix(start, "%", &start) || *start != '(') - strbuf_addch(sb, '%'); - else if ((end = strchr(start + 1, ')')) && - expand_atom(sb, start + 1, end - start - 1, data)) + if (percent != start) { + ALLOC_GROW(ret, nr + 1, alloc); + ret[nr].add = literal_add; + ret[nr].u.literal.p = start; + ret[nr].u.literal.len = percent - start; + nr++; + } + + if (!*percent) + break; + + start = percent + 1; + + ALLOC_GROW(ret, nr + 1, alloc); + if (skip_prefix(start, "%", &start) || *start != '(') { + ret[nr].add = literal_add; + ret[nr].u.literal.p = "%"; + ret[nr].u.literal.len = 1; + } else if ((end = strchr(start + 1, ')')) && + parse_atom(&ret[nr], start + 1, end - start - 1, data)) { start = end + 1; - else + } else { strbuf_expand_bad_format(start, "cat-file"); + } + nr++; } + + ALLOC_GROW(ret, nr + 1, alloc); + ret[nr].add = NULL; + + return ret; +} + +static void expand_format(struct strbuf *sb, struct format_item *fmt, + struct expand_data *data) +{ + for (; fmt->add; fmt++) + fmt->add(fmt, sb, data); } static void batch_write(struct batch_options *opt, const void *data, int len) @@ -568,7 +652,7 @@ static void batch_object_write(const char *obj_name, if (!opt->format) { print_default_format(scratch, data, opt); } else { - expand_format(scratch, opt->format, data); + expand_format(scratch, opt->parsed_format, data); strbuf_addch(scratch, opt->output_delim); } @@ -936,17 +1020,9 @@ static int batch_objects(struct batch_options *opt) int save_warning; int retval = 0; - /* - * Expand once with our special mark_query flag, which will prime the - * object_info to be handed to odb_read_object_info_extended for each - * object. - */ - data.mark_query = 1; - expand_format(&output, - opt->format ? opt->format : DEFAULT_FORMAT, - &data); - data.mark_query = 0; - strbuf_release(&output); + opt->parsed_format = parse_format(opt->format ? + opt->format : DEFAULT_FORMAT, + &data); if (opt->transform_mode) data.split_on_whitespace = 1;