From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Cc: "brian m. carlson" <sandals@crustytoothpaste.net>,
Derrick Stolee <stolee@gmail.com>,
Jacob Keller <jacob.keller@gmail.com>
Subject: [PATCH 2/9] introduce hasheq() and oideq()
Date: Sat, 25 Aug 2018 04:05:44 -0400 [thread overview]
Message-ID: <20180825080543.GB737@sigill.intra.peff.net> (raw)
In-Reply-To: <20180825080031.GA32139@sigill.intra.peff.net>
The main comparison functions we provide for comparing
object ids are hashcmp() and oidcmp(). These are more
flexible than a strict equality check, since they also
express ordering. That makes them them useful for sorting
and binary searching. However, it also makes them
potentially slower than a strict equality check. Consider
this C code, which is traditionally what our hashcmp has
looked like:
#include <string.h>
int hashcmp(const unsigned char *a, const unsigned char *b)
{
return memcmp(a, b, 20);
}
Compiling with "gcc -O2 -S -fverbose-asm", the generated
assembly shows that we actually call memcmp(). But if we
change this to a strict equality check:
return !memcmp(a, b, 20);
we get a faster inline version:
movq (%rdi), %rax # MEM[(void *)a_4(D)], MEM[(void *)a_4(D)]
movq 8(%rdi), %rdx # MEM[(void *)a_4(D)], tmp101
xorq (%rsi), %rax # MEM[(void *)b_5(D)], tmp94
xorq 8(%rsi), %rdx # MEM[(void *)b_5(D)], tmp93
orq %rax, %rdx # tmp94, tmp93
jne .L2 #,
movl 16(%rsi), %eax # MEM[(void *)b_5(D)], tmp104
cmpl %eax, 16(%rdi) # tmp104, MEM[(void *)a_4(D)]
je .L5 #,
Obviously our hashcmp() doesn't include the "!". But because
it's an inline function, optimizing compilers are able to
see "!hashcmp(a,b)" in calling code and take advantage of
this case. So there has been no value thus far in
introducing a more restricted interface for doing strict
equality checks.
But as Git learns about more values for the_hash_algo, our
hashcmp() will grow more complicated and may even delegate
at runtime to functions optimized specifically for that hash
size. That breaks the inline connection we have, and the
compiler will have to assume that the caller really cares
about the sign and magnitude of the memcmp() result, even
though the vast majority don't.
We can solve that by introducing a hasheq() function (and
matching oideq() wrapper), which callers can use to make
clear that they only care about equality. For now, the
implementation will literally be "!hashcmp()", but it frees
us up later to introduce code optimized specifically for the
equality check.
Signed-off-by: Jeff King <peff@peff.net>
---
cache.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/cache.h b/cache.h
index 4d014541ab..f6d227fac7 100644
--- a/cache.h
+++ b/cache.h
@@ -1041,6 +1041,16 @@ static inline int oidcmp(const struct object_id *oid1, const struct object_id *o
return hashcmp(oid1->hash, oid2->hash);
}
+static inline int hasheq(const unsigned char *sha1, const unsigned char *sha2)
+{
+ return !hashcmp(sha1, sha2);
+}
+
+static inline int oideq(const struct object_id *oid1, const struct object_id *oid2)
+{
+ return hasheq(oid1->hash, oid2->hash);
+}
+
static inline int is_null_sha1(const unsigned char *sha1)
{
return !hashcmp(sha1, null_sha1);
--
2.19.0.rc0.412.g7005db4e88
next prev parent reply other threads:[~2018-08-25 8:05 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-25 8:00 [PATCH 0/9] introducing oideq() Jeff King
2018-08-25 8:03 ` [PATCH 1/9] coccinelle: use <...> for function exclusion Jeff King
2018-08-25 8:05 ` Jeff King [this message]
2018-08-25 10:58 ` [PATCH 2/9] introduce hasheq() and oideq() Andrei Rybak
2018-08-25 8:07 ` [PATCH 3/9] convert "oidcmp() == 0" to oideq() Jeff King
2018-08-25 8:36 ` Jeff King
2018-08-27 12:31 ` Derrick Stolee
2018-08-27 12:33 ` Derrick Stolee
2018-08-25 8:08 ` [PATCH 4/9] convert "hashcmp() == 0" to hasheq() Jeff King
2018-08-25 8:09 ` [PATCH 5/9] convert "oidcmp() != 0" to "!oideq()" Jeff King
2018-08-25 8:10 ` [PATCH 6/9] convert "hashcmp() != 0" to "!hasheq()" Jeff King
2018-08-25 8:14 ` [PATCH 7/9] convert hashmap comparison functions to oideq() Jeff King
2018-08-25 8:15 ` [PATCH 8/9] read-cache: use oideq() in ce_compare functions Jeff King
2018-08-25 8:17 ` [PATCH 9/9] show_dirstat: simplify same-content check Jeff King
2018-08-25 8:20 ` Eric Sunshine
2018-08-25 8:23 ` Jeff King
2018-08-26 20:56 ` [PATCH 0/9] introducing oideq() brian m. carlson
2018-08-27 12:41 ` Derrick Stolee
2018-08-28 21:21 ` Jeff King
2018-08-28 21:22 ` [PATCH v2 1/9] coccinelle: use <...> for function exclusion Jeff King
2018-08-28 21:22 ` [PATCH v2 2/9] introduce hasheq() and oideq() Jeff King
2018-08-28 21:22 ` [PATCH v2 3/9] convert "oidcmp() == 0" to oideq() Jeff King
2018-08-28 21:22 ` [PATCH v2 4/9] convert "hashcmp() == 0" to hasheq() Jeff King
2018-08-28 21:22 ` [PATCH v2 5/9] convert "oidcmp() != 0" to "!oideq()" Jeff King
2018-08-28 21:22 ` [PATCH v2 6/9] convert "hashcmp() != 0" to "!hasheq()" Jeff King
2018-08-28 21:22 ` [PATCH v2 7/9] convert hashmap comparison functions to oideq() Jeff King
2018-08-28 21:22 ` [PATCH v2 8/9] read-cache: use oideq() in ce_compare functions Jeff King
2018-08-28 21:23 ` [PATCH v2 9/9] show_dirstat: simplify same-content check Jeff King
2018-08-28 21:36 ` [PATCH 0/9] introducing oideq() Derrick Stolee
2018-08-29 0:08 ` brian m. carlson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180825080543.GB737@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=jacob.keller@gmail.com \
--cc=sandals@crustytoothpaste.net \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).