* [RFC/PATCH] micro-optimize get_sha1_hex()
@ 2006-09-09 21:55 Junio C Hamano
  2006-09-09 22:33 ` Jeff Garzik
  2006-09-10  0:06 ` Linus Torvalds
  0 siblings, 2 replies; 4+ messages in thread
From: Junio C Hamano @ 2006-09-09 21:55 UTC
  To: Linus Torvalds; +Cc: git

I was profiling 'git-rev-list v2.6.12..', because I suspected
insert_by_date() might be expensive (the function inserts into a
singly-linked ordered list, so the data structure would have to
become array-based to allow optimization).  But profiling showed
it was not the bottleneck, probably because the kernel history is
not that bushy and we do not have too many "active" heads during
traversal.
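
For readers unfamiliar with the pattern, here is a minimal sketch of
date-ordered insertion into a singly-linked list.  The names (struct
node, insert_sorted) are hypothetical and this is not git's actual
insert_by_date(); it only illustrates why each insertion has to walk
the list, which is the cost suspected above.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a commit list entry; not git's real struct. */
struct node {
	unsigned long date;
	struct node *next;
};

/*
 * Insert a new entry while keeping the list sorted newest-first.
 * Every call walks from the head until it finds the right slot, so
 * one insertion is O(n) in the current list length.
 * Returns the new node, or NULL if allocation fails.
 */
static struct node *insert_sorted(struct node **list, unsigned long date)
{
	struct node **pp = list;
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	while (*pp && (*pp)->date > date)
		pp = &(*pp)->next;
	n->date = date;
	n->next = *pp;
	*pp = n;
	return n;
}
```

With only a handful of entries live at any time, that walk stays
short, which would be consistent with the function not showing up in
the profile.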

But I noticed something else.

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 21.52      0.17     0.17  2800320     0.00     0.00  hexval
 15.19      0.29     0.12    70008     0.00     0.00  get_sha1_hex
 15.19      0.41     0.12    70008     0.00     0.00  lookup_object
 15.19      0.53     0.12    68495     0.00     0.00  find_pack_entry_one
 11.39      0.62     0.09   198667     0.00     0.00  insert_obj_hash
  7.60      0.68     0.06    34675     0.00     0.00  unpack_object_header_gently
  3.80      0.71     0.03    33819     0.00     0.02  parse_commit_buffer
  1.27      0.72     0.01   103822     0.00     0.00  commit_list_insert
  1.27      0.73     0.01    67640     0.00     0.00  prepare_packed_git
  1.27      0.74     0.01    67611     0.00     0.00  created_object
  1.27      0.75     0.01    33820     0.00     0.00  use_packed_git
  1.27      0.76     0.01    33819     0.00     0.00  lookup_tree
  1.27      0.77     0.01        1    10.00    10.00  prepare_packed_git_one
  1.27      0.78     0.01                             parse_tree_indirect
  1.27      0.79     0.01                             verify_filename
  ...

The attached brings get_sha1_hex() down from 15.19% to 5.41%,
but I feel we should be able to do better.

Is this barking up the wrong tree?  Or did I pick a good target
but the shooter wasn't skilled enough?


diff --git a/sha1_file.c b/sha1_file.c
index 428d791..00aa364 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -26,26 +26,30 @@ const unsigned char null_sha1[20];
 
 static unsigned int sha1_file_open_flag = O_NOATIME;
 
-static unsigned hexval(char c)
-{
-	if (c >= '0' && c <= '9')
-		return c - '0';
-	if (c >= 'a' && c <= 'f')
-		return c - 'a' + 10;
-	if (c >= 'A' && c <= 'F')
-		return c - 'A' + 10;
-	return ~0;
-}
+static const unsigned char hexval[] = {
+	  0,   1,   2,   3,    4,   5,   6,   7, /* 30-37 */
+	  8,   9, 255, 255,  255, 255, 255, 255, /* 38-3F */
+	255,  10,  11,  12,   13,  14,  15, 255, /* 40-47 */
+	255, 255, 255, 255,  255, 255, 255, 255, /* 48-4F */
+	255, 255, 255, 255,  255, 255, 255, 255, /* 50-57 */
+	255, 255, 255, 255,  255, 255, 255, 255, /* 58-5F */
+	255,  10,  11,  12,   13,  14,  15, 255, /* 60-67 */
+};
 
 int get_sha1_hex(const char *hex, unsigned char *sha1)
 {
 	int i;
 	for (i = 0; i < 20; i++) {
-		unsigned int val = (hexval(hex[0]) << 4) | hexval(hex[1]);
-		if (val & ~0xff)
+		unsigned int v, w, val;
+		v = *hex++;
+		if ((v < '0') || ('f' < v) ||
+		    ((v = hexval[v-'0']) == 255))
+			return -1;
+		w = *hex++;
+		if ((w < '0') || ('f' < w) ||
+		    ((w = hexval[w-'0']) == 255))
 			return -1;
-		*sha1++ = val;
-		hex += 2;
+		*sha1++ = (v << 4) | w;
 	}
 	return 0;
 }
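
To try the table-driven decoding outside of git, the following is a
minimal standalone sketch.  It reuses the table from the patch, but
the names hexval_tab, hex_digit and decode_sha1_hex are hypothetical,
and the range check is factored into a small helper for readability
rather than written inline as in the patch, so it is not the code as
posted.

```c
/* Values for ASCII 0x30 ('0') through 0x67 ('g'); 255 marks a non-hex byte. */
static const unsigned char hexval_tab[] = {
	  0,   1,   2,   3,    4,   5,   6,   7, /* 30-37 */
	  8,   9, 255, 255,  255, 255, 255, 255, /* 38-3F */
	255,  10,  11,  12,   13,  14,  15, 255, /* 40-47 */
	255, 255, 255, 255,  255, 255, 255, 255, /* 48-4F */
	255, 255, 255, 255,  255, 255, 255, 255, /* 50-57 */
	255, 255, 255, 255,  255, 255, 255, 255, /* 58-5F */
	255,  10,  11,  12,   13,  14,  15, 255, /* 60-67 */
};

/* Decode one hex digit; returns 255 if c is not in [0-9A-Fa-f]. */
static unsigned int hex_digit(unsigned char c)
{
	if (c < '0' || 'f' < c)
		return 255;	/* outside the range the table covers */
	return hexval_tab[c - '0'];
}

/* Decode 40 hex characters into 20 bytes; 0 on success, -1 on bad input. */
static int decode_sha1_hex(const char *hex, unsigned char *sha1)
{
	int i;

	for (i = 0; i < 20; i++) {
		unsigned int v, w;

		v = hex_digit((unsigned char)*hex++);
		if (v == 255)
			return -1;
		w = hex_digit((unsigned char)*hex++);
		if (w == 255)
			return -1;
		*sha1++ = (unsigned char)((v << 4) | w);
	}
	return 0;
}
```

Because a NUL byte is below '0', a string shorter than 40 characters
is rejected at its terminator instead of being read past the end.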


Thread overview: 4+ messages
2006-09-09 21:55 [RFC/PATCH] micro-optimize get_sha1_hex() Junio C Hamano
2006-09-09 22:33 ` Jeff Garzik
2006-09-10  0:06 ` Linus Torvalds
2006-09-10  0:53   ` Linus Torvalds
