From: Junio C Hamano <gitster@pobox.com>
To: David Turner <dturner@twopensource.com>
Cc: git@vger.kernel.org, mhagger@alum.mit.edu, neleai@seznam.cz,
David Turner <dturner@twitter.com>
Subject: Re: [PATCH v8] refs.c: SSE2 optimizations for check_refname_component
Date: Mon, 16 Jun 2014 17:06:31 -0700 [thread overview]
Message-ID: <xmqqha3ko0vs.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <1402946566-14923-1-git-send-email-dturner@twitter.com> (David Turner's message of "Mon, 16 Jun 2014 15:22:46 -0400")
David Turner <dturner@twopensource.com> writes:
> Optimize check_refname_component using SSE2 on x86_64.
>
> git rev-parse HEAD is a good test-case for this, since it does almost
> nothing except parse refs. For one particular repo with about 60k
> refs, almost all packed, the timings are:
>
> Look up table: 29 ms
> SSE2: 23 ms
>
> This cuts about 20% off of the runtime.
>
> The configure.ac changes include code from the GNU C Library written
> by Joseph S. Myers <joseph at codesourcery dot com>.
>
> Ondřej Bílka <neleai@seznam.cz> suggested an SSE2 approach to the
One e-mail address is obfuscated while the other not; intended?
> substring searches, which netted a speed boost over the SSE4.2 code I
> had initially written.
>
> Signed-off-by: David Turner <dturner@twitter.com>
> ---
> diff --git a/git-compat-util.h b/git-compat-util.h
> index f6d3a46..291d46b 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -668,6 +668,16 @@ void git_qsort(void *base, size_t nmemb, size_t size,
> #endif
> #endif
>
> +#if defined(__GNUC__) && defined(__x86_64__)
> +#include <emmintrin.h>
> +/* This is the system memory page size; it's used so that we can read
Style (there are other instances of the same kind).
/*
* This is the ...
> + * outside the bounds of an allocation without segfaulting.
> + */
> +static int check_refname_component_trailer(const char *cp, const char *refname, int flags)
> +{
> + if (cp == refname)
> + return 0; /* Component has zero length. */
> + if (refname[0] == '.') {
> + if (!(flags & REFNAME_DOT_COMPONENT))
> + return -1; /* Component starts with '.'. */
> + /*
> + * Even if leading dots are allowed, don't allow "."
> + * as a component (".." is prevented by a rule above).
> + */
> + if (refname[1] == '\0')
> + return -1; /* Component equals ".". */
> + }
> + if (cp - refname >= 5 && !memcmp(cp - 5, ".lock", 5))
> + return -1; /* Refname ends with ".lock". */
This is merely a moved code that retained the same comment, but it
is more like "the current refname component ends with .lock", I
suspect. In other words, we do not allow "refs/heads/foo.lock/bar".
Am I reading the patch correctly?
> +#if defined(__GNUC__) && defined(__x86_64__)
> +#define SSE_VECTOR_BYTES 16
> +
> +/* Vectorized version of check_refname_format. */
> +int check_refname_format(const char *refname, int flags)
> +{
> + const char *cp = refname;
> +
> + const __m128i dot = _mm_set1_epi8 ('.');
Style (there are other instances of the same kind). No SP between
function/macro name and opening parenthesis.
> + if (refname[0] == '.') {
> + if (refname[1] == '/' || refname[1] == '\0')
> + return -1;
> + if (!(flags & REFNAME_DOT_COMPONENT))
> + return -1;
> + }
> + while(1) {
> + __m128i tmp, tmp1, result;
> + uint64_t mask;
> +
> + if ((uintptr_t) cp % PAGE_SIZE > PAGE_SIZE - SSE_VECTOR_BYTES - 1)
OK, so we make sure we do not overrun by reading too much near the
end of the page, as the next page might be unmapped.
I am showing my ignorance but does cp (i.e. refname) upon entry to
this function need to be aligned in some way?
Thanks.
next prev parent reply other threads:[~2014-06-17 0:06 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-16 19:22 [PATCH v8] refs.c: SSE2 optimizations for check_refname_component David Turner
2014-06-17 0:06 ` Junio C Hamano [this message]
2014-06-17 0:51 ` David Turner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqha3ko0vs.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox.com \
--cc=dturner@twitter.com \
--cc=dturner@twopensource.com \
--cc=git@vger.kernel.org \
--cc=mhagger@alum.mit.edu \
--cc=neleai@seznam.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.