git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>
Cc: Git List <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] use strbuf_add_unique_abbrev() for adding short hashes, part 3
Date: Mon, 10 Oct 2016 22:46:21 +0200	[thread overview]
Message-ID: <af55f6d7-e1b1-272b-4fbe-a6eb2422b3be@web.de> (raw)
In-Reply-To: <20161010000035.mfcf55wqfcbcnarh@sigill.intra.peff.net>

Am 10.10.2016 um 02:00 schrieb Jeff King:
> On Sat, Oct 08, 2016 at 05:38:47PM +0200, René Scharfe wrote:
> 
>> Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
>> instead of taking detours through find_unique_abbrev() and its static
>> buffer.  This is shorter in most cases and a bit more efficient.
>>
>> The changes here are not easily handled by a semantic patch because
>> they involve removing temporary variables and deconstructing format
>> strings for strbuf_addf().
> 
> Yeah, the thing to look for here is whether the sha1 variable holds the
> same value at both times.
> 
> These all look OK to me. Mild rambling below.
> 
>> diff --git a/merge-recursive.c b/merge-recursive.c
>> index 5200d5c..9041c2f 100644
>> --- a/merge-recursive.c
>> +++ b/merge-recursive.c
>> @@ -202,9 +202,9 @@ static void output_commit_title(struct merge_options *o, struct commit *commit)
>>  		strbuf_addf(&o->obuf, "virtual %s\n",
>>  			merge_remote_util(commit)->name);
>>  	else {
>> -		strbuf_addf(&o->obuf, "%s ",
>> -			find_unique_abbrev(commit->object.oid.hash,
>> -				DEFAULT_ABBREV));
>> +		strbuf_add_unique_abbrev(&o->obuf, commit->object.oid.hash,
>> +					 DEFAULT_ABBREV);
>> +		strbuf_addch(&o->obuf, ' ');
> 
> I've often wondered whether a big strbuf_addf() is more efficient than
> several strbuf_addstrs. It amortizes the size-checks, certainly, though
> those are probably not very big. It shouldn't matter much for amortizing
> the cost of malloc, as it's very unlikely to have a case where:
> 
>   strbuf_addf("%s%s", foo, bar);
> 
> would require one malloc, but:
> 
>   strbuf_addstr(foo);
>   strbuf_addstr(bar);
> 
> would require two (one of the strings would have to be around the same
> size as the ALLOC_GROW() doubling).
> 
> So it probably doesn't matter much in practice (not that most of these
> cases are very performance sensitive anyway). Mostly just something I've
> pondered while tweaking strbuf invocations.

Good question.  ALLOC_GROW() doesn't double exactly, but indeed the
number of reallocations depends on the size of the added pieces.  I
always thought of strbuf_addf() as an expensive function for
convenience, but never timed it.

Numbers vary a bit, but here's what I get for the crude test program
at the end:

$ time t/helper/test-strbuf strbuf_addf 123 123456789012345678901234567890
123123456789012345678901234567890

real    0m0.168s
user    0m0.164s
sys     0m0.000s
$ time t/helper/test-strbuf strbuf_addstr 123 123456789012345678901234567890
123123456789012345678901234567890

real    0m0.141s
user    0m0.140s
sys     0m0.000s

Just a data-point, but it confirms my bias, so I stop here. :)

> Just thinking aloud, I've also wondered if strbuf_addoid() would be
> handy.  We already have oid_to_hex_r(). In fact, you could write it
> already as:
> 
>   strbuf_add_unique_abbrev(sb, oidp->hash, 0);
> 
> but that is probably not helping maintainability. ;)

Yes, a function for adding full hashes without using a static
variable is useful for library functions that may end up being
called often or in parallel.  I'd call it strbuf_add_hash,
though.



diff --git a/Makefile b/Makefile
index 1aad150..ad5e98c 100644
--- a/Makefile
+++ b/Makefile
@@ -628,6 +628,7 @@ TEST_PROGRAMS_NEED_X += test-scrap-cache-tree
 TEST_PROGRAMS_NEED_X += test-sha1
 TEST_PROGRAMS_NEED_X += test-sha1-array
 TEST_PROGRAMS_NEED_X += test-sigchain
+TEST_PROGRAMS_NEED_X += test-strbuf
 TEST_PROGRAMS_NEED_X += test-string-list
 TEST_PROGRAMS_NEED_X += test-submodule-config
 TEST_PROGRAMS_NEED_X += test-subprocess
diff --git a/t/helper/test-strbuf.c b/t/helper/test-strbuf.c
new file mode 100644
index 0000000..177f3e2
--- /dev/null
+++ b/t/helper/test-strbuf.c
@@ -0,0 +1,25 @@
+#include "cache.h"
+
+int cmd_main(int argc, const char **argv)
+{
+	struct strbuf sb = STRBUF_INIT;
+	int i = 1000000;
+
+	if (argc == 4) {
+		if (!strcmp(argv[1], "strbuf_addf")) {
+			while (i--) {
+				strbuf_release(&sb);
+				strbuf_addf(&sb, "%s%s", argv[2], argv[3]);
+			}
+		}
+		if (!strcmp(argv[1], "strbuf_addstr")) {
+			while (i--) {
+				strbuf_release(&sb);
+				strbuf_addstr(&sb, argv[2]);
+				strbuf_addstr(&sb, argv[3]);
+			}
+		}
+		puts(sb.buf);
+	}
+	return 0;
+}


  reply	other threads:[~2016-10-10 20:46 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-08 15:38 [PATCH] use strbuf_add_unique_abbrev() for adding short hashes, part 3 René Scharfe
2016-10-10  0:00 ` Jeff King
2016-10-10 20:46   ` René Scharfe [this message]
2016-10-10 20:52     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=af55f6d7-e1b1-272b-4fbe-a6eb2422b3be@web.de \
    --to=l.s.r@web.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).