From: David Turner <dturner@twopensource.com>
To: "Ondřej Bílka" <neleai@seznam.cz>
Cc: git@vger.kernel.org, gitster@pobox.com, mhagger@alum.mit.edu,
David Turner <dturner@twitter.com>
Subject: Re: [PATCH v7 1/1] refs.c: SSE4.2 optimizations for check_refname_component
Date: Sun, 15 Jun 2014 01:53:40 -0400
Message-ID: <1402811620.5629.77.camel@stross>
In-Reply-To: <20140614152209.GA14125@domone.podge>
On Sat, 2014-06-14 at 17:22 +0200, Ondřej Bílka wrote:
> On Thu, Jun 05, 2014 at 07:56:15PM -0400, David Turner wrote:
> > Optimize check_refname_component using SSE4.2, where available.
> >
> > git rev-parse HEAD is a good test-case for this, since it does almost
> > nothing except parse refs. For one particular repo with about 60k
> > refs, almost all packed, the timings are:
> >
> > Look up table: 29 ms
> > SSE4.2: 25 ms
> >
> > This is about a 15% improvement.
> >
> > The configure.ac changes include code from the GNU C Library written
> > by Joseph S. Myers <joseph at codesourcery dot com>.
> >
> > Only supports GCC and Clang at present, because C interfaces to the
> > cpuid instruction are not well-standardized.
> >
> Still, SSE4.2 is not that useful; in most cases SSE2 is faster. Here I
> think the difference will not be that big when correctly implemented.
> That will avoid runtime checks.
Surprisingly to me, this is true! At least, on my machine. Sadly, the
only way to make it avoid a runtime check is to exclude 32-bit machines
(or to make the option non-default, which I would prefer not to do).
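
To illustrate the trade-off, here is a minimal sketch (not code from the patch). It assumes GCC or Clang on x86; the helper names have_sse42() and sse2_is_baseline() are hypothetical. SSE2 is part of the x86-64 baseline, so a 64-bit build can use it unconditionally, while SSE4.2 has to be probed at run time.

```c
/* Hypothetical helper: 1 if the running CPU reports SSE4.2, else 0.
 * Relies on the GCC/Clang x86 builtins; other compilers and
 * architectures fall through to "not available". */
static int have_sse42(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init();
	return !!__builtin_cpu_supports("sse4.2");
#else
	return 0;
#endif
}

/* Hypothetical helper: 1 if SSE2 can be assumed at compile time.
 * True on x86-64; 32-bit x86 would still need a runtime check. */
static int sse2_is_baseline(void)
{
#if defined(__x86_64__) || defined(_M_X64)
	return 1;
#else
	return 0;
#endif
}
```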
> For parallelisation you need to take an extra step and parallelize the
> whole check rather than going component-by-component.
Good idea.
> For detecting sequences a faster way is to construct bitmasks with SSE2
> so you can combine them. It avoids the need for special casing on
> 16-byte boundaries.
That does seem to be faster.
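
A rough sketch of the bitmask idea, as I understand it (this is not the implementation from either message). It looks only for the ".." sequence, and the name has_double_dot() is hypothetical. Each 16-byte block is reduced to a bitmask of '.' positions; shifting that mask by one and carrying the top bit into the next block finds adjacent dots without special-casing the block boundary. Without SSE2 the scalar loop handles everything.

```c
#include <stddef.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Hypothetical helper: 1 if buf[0..len) contains "..", else 0. */
static int has_double_dot(const char *buf, size_t len)
{
	size_t i = 0;
	unsigned carry = 0; /* 1 if the previous byte was '.' */

#ifdef __SSE2__
	const __m128i dots = _mm_set1_epi8('.');

	for (; i + 16 <= len; i += 16) {
		__m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
		/* bit k of mask is set if byte k of this block is '.' */
		unsigned mask = (unsigned)_mm_movemask_epi8(
			_mm_cmpeq_epi8(chunk, dots));
		/* bit k of prev is set if the byte before byte k is '.';
		 * bit 0 comes from the last byte of the previous block */
		unsigned prev = (mask << 1) | carry;

		if (mask & prev)
			return 1;
		carry = (mask >> 15) & 1;
	}
#endif
	/* Scalar tail (or the whole buffer when SSE2 is unavailable). */
	for (; i < len; i++) {
		unsigned is_dot = (buf[i] == '.');

		if (is_dot && carry)
			return 1;
		carry = is_dot;
	}
	return 0;
}
```

Masks for other two-byte sequences (for example "@{") could be built the same way and OR-ed together before the single test per block.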
> Below is an untested implementation where you could add a bad character
> check with SSE4.2 which would speed it up. Are refs mostly
> alphanumeric? If so we could speed this up with a parallelized alnum
> check, handling other characters in a slower path.
Twitter's are almost entirely in [-._/a-zA-Z0-9] -- there are only a
handful of exceptions. So, a method that has some bycatch outside of
this range is just as fast as the SSE4.2 bad character check (but
somewhat more code).
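
A scalar sketch of that fast-path/slow-path split, to make the "bycatch" concrete (the function names are hypothetical, and this is not the patch's code). Bytes in [-._/a-zA-Z0-9] pass immediately; everything else, including perfectly legal characters such as '+', falls through to the slower forbidden-character test. Only the single-byte rules from git-check-ref-format(1) are covered here, not sequences such as ".." or "@{".

```c
#include <stddef.h>

/* Fast filter: 1 if c is in [-._/a-zA-Z0-9]. */
static int is_common_ref_char(unsigned char c)
{
	return (c >= '-' && c <= '9') || /* '-', '.', '/', '0'..'9' are contiguous in ASCII */
	       (c >= 'a' && c <= 'z') ||
	       (c >= 'A' && c <= 'Z') ||
	       c == '_';
}

/* Slow path: 1 if c may never appear in a refname
 * (control characters, space, DEL, and ~ ^ : ? * [ \). */
static int is_forbidden_ref_char(unsigned char c)
{
	return c <= 0x20 || c == 0x7f || c == '~' || c == '^' ||
	       c == ':' || c == '?' || c == '*' || c == '[' || c == '\\';
}

/* Hypothetical helper: 1 if s[0..len) has no forbidden byte, else 0. */
static int ref_bytes_ok(const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)s[i];

		if (is_common_ref_char(c))
			continue; /* the overwhelmingly common case */
		if (is_forbidden_ref_char(c))
			return 0;
		/* bycatch: unusual but legal byte, e.g. '+' or '@' */
	}
	return 1;
}
```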
Thread overview: 12+ messages
2014-06-05 23:56 [PATCH v7 0/1] refs.c: SSE4.2 optimizations for check_refname_component David Turner
2014-06-05 23:56 ` [PATCH v7 1/1] " David Turner
2014-06-14 15:22 ` Ondřej Bílka
2014-06-15 5:53 ` David Turner [this message]
2014-06-09 22:16 ` [PATCH v7 0/1] " Junio C Hamano
2014-06-09 22:39 ` David Turner
2014-06-09 23:05 ` Junio C Hamano
2014-06-10 6:04 ` Johannes Sixt
2014-06-10 6:55 ` Junio C Hamano
2014-06-13 1:18 ` David Turner
2014-06-13 4:11 ` Torsten Bögershausen
2014-06-14 10:24 ` Philip Oakley