git.vger.kernel.org archive mirror
From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Cc: Paul Mackerras <paulus@samba.org>,
	Git Mailing List <git@vger.kernel.org>,
	Pierre Habouzit <madcoder@debian.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: [PATCH 3/3] --format=pretty: avoid calculating expensive expansions twice
Date: Sat, 10 Nov 2007 12:18:26 +0100
Message-ID: <47359382.1010600@lsrfire.ath.cx>
In-Reply-To: <20071110004635.GA14992@sigill.intra.peff.net>

As Jeff King remarked, format strings with duplicate placeholders can
be slow to expand, because each instance is calculated anew.

This patch makes use of the fact that format_commit_message() and its
helper functions only ever add stuff to the end of the strbuf.  For
certain expensive placeholders, store the offset and length of their
expansion with the strbuf at the first occurrence.  Later the
expansion result can simply be copied from there -- no malloc() or
strdup() required.

The placeholders in question are the abbreviated commit, tree and
parent hashes, as the search for a unique abbreviated hash is quite
costly.  Here are the times for next (best of three runs):

$ time git log --pretty=format:%h >/dev/null

real    0m0.611s
user    0m0.404s
sys     0m0.204s

$ time git log --pretty=format:%h%h%h%h >/dev/null

real    0m1.206s
user    0m0.744s
sys     0m0.452s

And here are those with this patch (and the previous two); the speedup
in the single-placeholder case is just noise:

$ time git log --pretty=format:%h >/dev/null

real    0m0.608s
user    0m0.416s
sys     0m0.192s

$ time git log --pretty=format:%h%h%h%h >/dev/null

real    0m0.639s
user    0m0.488s
sys     0m0.140s

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
 pretty.c |   32 ++++++++++++++++++++++++++++++++
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/pretty.c b/pretty.c
index 0c2f83b..ab142b8 100644
--- a/pretty.c
+++ b/pretty.c
@@ -369,8 +369,30 @@ struct format_commit_context {
 	struct chunk committer;
 	struct chunk encoding;
 	size_t body_off;
+
+	/* The following ones are relative to the result struct strbuf. */
+	struct chunk abbrev_commit_hash;
+	struct chunk abbrev_tree_hash;
+	struct chunk abbrev_parent_hashes;
 };
 
+static int add_again(struct strbuf *sb, struct chunk *chunk)
+{
+	if (chunk->len) {
+		strbuf_adddup(sb, chunk->off, chunk->len);
+		return 1;
+	}
+
+	/*
+	 * We haven't seen this chunk before.  Our caller is surely
+	 * going to add it the hard way now.  Remember the most likely
+	 * start of the to-be-added chunk: the current end of the
+	 * struct strbuf.
+	 */
+	chunk->off = sb->len;
+	return 0;
+}
+
 static void parse_commit_header(struct format_commit_context *context)
 {
 	const char *msg = context->commit->buffer;
@@ -447,15 +469,21 @@ static void format_commit_item(struct strbuf *sb, const char *placeholder,
 		strbuf_addstr(sb, sha1_to_hex(commit->object.sha1));
 		return;
 	case 'h':		/* abbreviated commit hash */
+		if (add_again(sb, &c->abbrev_commit_hash))
+			return;
 		strbuf_addstr(sb, find_unique_abbrev(commit->object.sha1,
 		                                     DEFAULT_ABBREV));
+		c->abbrev_commit_hash.len = sb->len - c->abbrev_commit_hash.off;
 		return;
 	case 'T':		/* tree hash */
 		strbuf_addstr(sb, sha1_to_hex(commit->tree->object.sha1));
 		return;
 	case 't':		/* abbreviated tree hash */
+		if (add_again(sb, &c->abbrev_tree_hash))
+			return;
 		strbuf_addstr(sb, find_unique_abbrev(commit->tree->object.sha1,
 		                                     DEFAULT_ABBREV));
+		c->abbrev_tree_hash.len = sb->len - c->abbrev_tree_hash.off;
 		return;
 	case 'P':		/* parent hashes */
 		for (p = commit->parents; p; p = p->next) {
@@ -465,12 +493,16 @@ static void format_commit_item(struct strbuf *sb, const char *placeholder,
 		}
 		return;
 	case 'p':		/* abbreviated parent hashes */
+		if (add_again(sb, &c->abbrev_parent_hashes))
+			return;
 		for (p = commit->parents; p; p = p->next) {
 			if (p != commit->parents)
 				strbuf_addch(sb, ' ');
 			strbuf_addstr(sb, find_unique_abbrev(
 					p->item->object.sha1, DEFAULT_ABBREV));
 		}
+		c->abbrev_parent_hashes.len = sb->len -
+		                              c->abbrev_parent_hashes.off;
 		return;
 	case 'm':		/* left/right/bottom */
 		strbuf_addch(sb, (commit->object.flags & BOUNDARY)
-- 
1.5.3.5.1651.g30bf
