All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: David Turner <dturner@twopensource.com>
Cc: git@vger.kernel.org, mhagger@alum.mit.edu,
	David Turner <dturner@twitter.com>
Subject: Re: [PATCH] check_refname_component: Optimize
Date: Wed, 28 May 2014 14:24:07 -0700	[thread overview]
Message-ID: <xmqq38ftd2bs.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <1401307055-11603-1-git-send-email-dturner@twitter.com> (David Turner's message of "Wed, 28 May 2014 15:57:35 -0400")

David Turner <dturner@twopensource.com> writes:

> In a repository with tens of thousands of refs, the command
> ~/git/git-diff-index --cached --quiet --ignore-submodules [revision]
> is a bit slow.  check_refname_component is a major contributor to the
> runtime.  Provide two optimized versions of this function: one for
> machines with SSE4.2, and one for machines without.  The speedup for
> this command on one particular repo (with about 60k refs) is about 12%
> for the SSE version and 8% for the non-SSE version.
>
> Signed-off-by: David Turner <dturner@twitter.com>

Just a few quick impressions (I do not have time today to look at
new patches).

 - The title seems a bit strange.
   Perhaps "refs.c: optimize check_refname_component()" or something?

 - "~/git/git-diff-index" looks doubly strange in that the issue you
   are addressing would not depend on your compiled Git being
   installed in your $HOME/git at all.  For that matter, from the
   command line and the patch, I am not sure if this is specific to
   "git diff-index", or the same issue exists for anything that
   takes revs and pathspecs from the command line.





> ---
>  Makefile           |   6 +++
>  configure.ac       |   6 +++
>  refs.c             | 143 +++++++++++++++++++++++++++++++++++++++++++++++++++--
>  t/t5511-refspec.sh |  13 +++++
>  4 files changed, 163 insertions(+), 5 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index a53f3a8..123e2fc 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1326,6 +1326,11 @@ else
>  		COMPAT_OBJS += compat/win32mmap.o
>  	endif
>  endif
> +ifdef NO_SSE
> +	BASIC_CFLAGS += -DNO_SSE
> +else
> +	BASIC_CFLAGS += -msse4
> +endif
>  ifdef OBJECT_CREATION_USES_RENAMES
>  	COMPAT_CFLAGS += -DOBJECT_CREATION_MODE=1
>  endif
> @@ -2199,6 +2204,7 @@ GIT-BUILD-OPTIONS: FORCE
>  	@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@
>  	@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@
>  	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@
> +	@echo NO_SSE=\''$(subst ','\'',$(subst ','\'',$(NO_SSE)))'\' >>$@
>  ifdef TEST_OUTPUT_DIRECTORY
>  	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@
>  endif
> diff --git a/configure.ac b/configure.ac
> index b711254..71a9429 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -382,6 +382,11 @@ AS_HELP_STRING([],[Tcl/Tk interpreter will be found in a system.]),
>  GIT_PARSE_WITH(tcltk))
>  #
>  
> +# Declare the with-sse/without-sse options.
> +AC_ARG_WITH(sse,
> +AS_HELP_STRING([--with-sse],[use SSE instructions (default is YES)]),
> +GIT_PARSE_WITH(sse))
> +
>  
>  ## Checks for programs.
>  AC_MSG_NOTICE([CHECKS for programs])
> @@ -449,6 +454,7 @@ else
>    fi
>  fi
>  GIT_CONF_SUBST([TCLTK_PATH])
> +GIT_CONF_SUBST([NO_SSE])
>  AC_CHECK_PROGS(ASCIIDOC, [asciidoc])
>  if test -n "$ASCIIDOC"; then
>  	AC_MSG_CHECKING([for asciidoc version])
> diff --git a/refs.c b/refs.c
> index 28d5eca..8ca124c 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -5,6 +5,8 @@
>  #include "dir.h"
>  #include "string-list.h"
>  
> +#include <nmmintrin.h>
> +

We would prefer not to add inclusion of any system header files in
random *.c files, as there often are system dependencies (order of
inclusion, definition of feature macros, etc.) we would rather want
to encapsulate in one place, that is git-compat-util.h.

>  /*
>   * Make sure "ref" is something reasonable to have under ".git/refs/";
>   * We do not like it if:
> @@ -29,30 +31,160 @@ static inline int bad_ref_char(int ch)
>  	return 0;
>  }
>  
> +char refname_disposition[] = {
> +       1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> +       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> ...
> +       9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9
> +};
> +

Does this array need to be extern?

Is this table designed to take both positive and negative values?
If the answer is "I expect we add only positive", then please make
it explicitly "unsigned char".

What do these magic numbers in the array mean?

How were the values derived?  What are you doing in this commit to
help others who later need to debug, fix and enhance this table?

  reply	other threads:[~2014-05-28 21:24 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-28 19:57 [PATCH] check_refname_component: Optimize David Turner
2014-05-28 21:24 ` Junio C Hamano [this message]
2014-05-29 12:19 ` brian m. carlson
  -- strict thread matches above, loose matches on Subject: below --
2014-05-28 21:04 [PATCH v2 0/1] " David Turner
2014-05-28 21:04 ` [PATCH] " David Turner
2014-05-28 21:44   ` Michael Haggerty
2014-05-28 23:49     ` David Turner
2014-05-29 13:41       ` Duy Nguyen
2014-05-29 16:36         ` Junio C Hamano
2014-05-29 23:24           ` Duy Nguyen
2014-05-29 23:41             ` Jeff King
2014-05-29 23:43               ` Duy Nguyen
2014-05-30  0:07                 ` Jeff King
2014-05-30  2:03                   ` Duy Nguyen
2014-05-30  9:47                   ` Michael Haggerty
2014-05-30 17:29                     ` Jeff King
2014-05-31 10:47                       ` Michael Haggerty
2014-05-31 11:21                   ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq38ftd2bs.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=dturner@twitter.com \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.