From: David Turner <dturner@twopensource.com>
To: Michael Haggerty <mhagger@alum.mit.edu>,
Junio C Hamano <gitster@pobox.com>,
git@vger.kernel.org
Cc: David Turner <dturner@twitter.com>
Subject: [PATCH v4 1/2] refs.c: optimize check_refname_component()
Date: Sun, 1 Jun 2014 01:17:44 -0400
Message-ID: <1401599865-14117-1-git-send-email-dturner@twitter.com>

In a repository with many refs, check_refname_component() can be a major
contributor to the runtime of some git commands. One such command is

    git rev-parse HEAD

Replace the per-character bad_ref_char() tests with a single lookup in
a table indexed by character value.

Timings for one particular repo, with about 60k refs, almost all
packed, are:

    Old: 35 ms
    New: 29 ms

Many other commands that read refs are also sped up.

Signed-off-by: David Turner <dturner@twitter.com>
---
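
Not part of the commit: the table-driven scan can be sketched as a
standalone file. This is a minimal sketch under assumptions, not the
code in refs.c. The names disposition, init_disposition and
scan_component and the enum constants are hypothetical, the table is
filled at run time instead of being written out, and the later checks
(leading ".", trailing ".lock") are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; these are not the identifiers in refs.c. */
enum { CH_OK = 0, CH_END = 1, CH_DOT = 2, CH_BRACE = 3, CH_BAD = 9 };

/* One entry per possible byte value; bytes >= 0x80 stay CH_OK (zero). */
static unsigned char disposition[256];

static void init_disposition(void)
{
	const char *bad = " ~^:?*[\\";
	int c;

	for (c = 1; c < 0x20; c++)
		disposition[c] = CH_BAD;	/* ASCII control characters */
	disposition[0x7f] = CH_BAD;		/* DEL */
	for (; *bad; bad++)
		disposition[(unsigned char)*bad] = CH_BAD;
	disposition['\0'] = CH_END;
	disposition['/'] = CH_END;
	disposition['.'] = CH_DOT;
	disposition['{'] = CH_BRACE;
}

/*
 * Return the length of the first component of name, or -1 if it
 * contains a bad character, ".." or "@{".
 */
static int scan_component(const char *name)
{
	const char *cp;
	char last = '\0';

	assert(name != NULL);
	for (cp = name; ; cp++) {
		unsigned char ch = (unsigned char)*cp;

		switch (disposition[ch]) {
		case CH_END:
			return (int)(cp - name);
		case CH_DOT:
			if (last == '.')
				return -1;
			break;
		case CH_BRACE:
			if (last == '@')
				return -1;
			break;
		case CH_BAD:
			return -1;
		}
		last = (char)ch;
	}
}
```

After one call to init_disposition(), scan_component("heads/master")
returns 5 and scan_component("a..b") returns -1.
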
refs.c | 68 ++++++++++++++++++++++++++++++++++++++++--------------------------
1 file changed, 41 insertions(+), 27 deletions(-)
diff --git a/refs.c b/refs.c
index 28d5eca..62e2301 100644
--- a/refs.c
+++ b/refs.c
@@ -5,9 +5,32 @@
#include "dir.h"
#include "string-list.h"
+/* How to handle various characters in refnames:
+ * 0: An acceptable character for refs
+ * 1: End-of-component
+ * 2: ., look for a preceding . to reject .. in refs
+ * 3: {, look for a preceding @ to reject @{ in refs
+ * 9: A bad character, reject ref
+ *
+ * See below for the list of illegal characters, from which
+ * this table is derived.
+ */
+static unsigned char refname_disposition[256] = {
+ 1, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+ 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
+ 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 2, 1,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0, 9,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 0, 9, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 9, 9
+};
+
/*
- * Make sure "ref" is something reasonable to have under ".git/refs/";
- * We do not like it if:
+ * Try to read one refname component from the front of refname.
+ * Return the length of the component found, or -1 if the component is
+ * not legal. It is legal if it is something reasonable to have under
+ * ".git/refs/". We do not like it if:
*
* - any path component of it begins with ".", or
* - it has double dots "..", or
@@ -15,24 +38,7 @@
* - it ends with a "/".
* - it ends with ".lock"
* - it contains a "\" (backslash)
- */
-/* Return true iff ch is not allowed in reference names. */
-static inline int bad_ref_char(int ch)
-{
- if (((unsigned) ch) <= ' ' || ch == 0x7f ||
- ch == '~' || ch == '^' || ch == ':' || ch == '\\')
- return 1;
- /* 2.13 Pattern Matching Notation */
- if (ch == '*' || ch == '?' || ch == '[') /* Unsupported */
- return 1;
- return 0;
-}
-
-/*
- * Try to read one refname component from the front of refname. Return
- * the length of the component found, or -1 if the component is not
- * legal.
*/
static int check_refname_component(const char *refname, int flags)
{
@@ -40,17 +46,25 @@ static int check_refname_component(const char *refname, int flags)
 	char last = '\0';
 
 	for (cp = refname; ; cp++) {
-		char ch = *cp;
-		if (ch == '\0' || ch == '/')
+		unsigned char ch = (unsigned char) *cp;
+		char disp = refname_disposition[ch];
+		switch (disp) {
+		case 1:
+			goto out;
+		case 2:
+			if (last == '.')
+				return -1; /* Refname contains "..". */
 			break;
-		if (bad_ref_char(ch))
-			return -1; /* Illegal character in refname. */
-		if (last == '.' && ch == '.')
-			return -1; /* Refname contains "..". */
-		if (last == '@' && ch == '{')
-			return -1; /* Refname contains "@{". */
+		case 3:
+			if (last == '@')
+				return -1; /* Refname contains "@{". */
+			break;
+		case 9:
+			return -1; /* Illegal character in refname. */
+		}
 		last = ch;
 	}
+out:
 	if (cp == refname)
 		return 0; /* Component has zero length. */
 	if (refname[0] == '.') {
--
2.0.0.rc1.18.gf763c0f
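
The comment in the patch says the table is derived from the list of
illegal characters. One way to check that claim is to regenerate the
table from the removed bad_ref_char() predicate. The sketch below
assumes that approach; derive_disposition and print_table are
hypothetical helpers that do not appear in the patch.

```c
#include <stdio.h>

/* Copy of the predicate that the patch removes from refs.c. */
static int bad_ref_char(int ch)
{
	if (((unsigned) ch) <= ' ' || ch == 0x7f ||
	    ch == '~' || ch == '^' || ch == ':' || ch == '\\')
		return 1;
	/* 2.13 Pattern Matching Notation */
	if (ch == '*' || ch == '?' || ch == '[') /* Unsupported */
		return 1;
	return 0;
}

/*
 * Hypothetical helper: disposition code for one byte value (0..255).
 * The end-of-component test comes first, as in the old loop, so that
 * NUL maps to 1 rather than 9.
 */
static int derive_disposition(int ch)
{
	if (ch == '\0' || ch == '/')
		return 1;
	if (bad_ref_char(ch))
		return 9;
	if (ch == '.')
		return 2;
	if (ch == '{')
		return 3;
	return 0;
}

/* Hypothetical helper: print all 256 codes, 16 per row. */
void print_table(FILE *out)
{
	int ch;

	for (ch = 0; ch < 256; ch++)
		fprintf(out, "%d,%s", derive_disposition(ch),
			(ch % 16 == 15) ? "\n" : " ");
}
```

Calling print_table(stdout) writes sixteen rows. The first eight can
be compared with the initializer in the patch; the last eight are all
zero, because the old predicate never rejected bytes above 0x7f.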