git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH/RFC 0/6] Speed up git-grep by using PCRE v2 as a backend
@ 2017-05-11 17:51 Ævar Arnfjörð Bjarmason
  2017-05-11 17:51 ` [PATCH/RFC 1/6] Makefile & compat/pcre2: add ability to build an embedded PCRE Ævar Arnfjörð Bjarmason
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-11 17:51 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton, Michał Kiedrowicz,
	J Smith, Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Brandon Williams,
	Ævar Arnfjörð Bjarmason

Thought I'd send this to the list too. This is first of my WIP
post-PCRE v2 inclusion series's.

In addition to the huge speed improvements for grep -P noted in the
culmination of that series[1], this speeds up all other types of grep
invocations (fixed string & POSIX basic/extended) by using an
experimental PCRE API to translate those patterns to PCRE syntax.

Fixed string grep is sped up by ~15-50%, and any greps containing
regexes by 40-70%, with 50% seeming to be the average for most normal
patterns.

It isn't ready for the reasons noted in the last patch in the series,
and currently brings most of PCRE into git in compat/pcre2 since it
uses an experimental API.

The 5/6 patch is pretty much ready though and works on stock PCRE, it
fixes TODO tests for patterns that contain a \0, and enables regex
metacharacters in such patterns (right now they're all implicitly
fixed strings), see the discussion in that patch for some of the
caveats.

That patch will most likely be dropped by the list, it can be
retrieved from https://github.com/avar/git
avar/grep-and-pcre-and-more, or the whole series viewed at
https://github.com/git/git/compare/master...avar:avar/grep-and-pcre-and-more.

1. <20170511170142.15934-8-avarab@gmail.com>
   (https://public-inbox.org/git/20170511170142.15934-8-avarab@gmail.com/)

Ævar Arnfjörð Bjarmason (6):
  Makefile & compat/pcre2: add ability to build an embedded PCRE
  Makefile & compat/pcre2: add dependency on pcre2_convert.c
  compat/pcre2: import pcre2 from svn trunk
  test-lib: add LIBPCRE1 & LIBPCRE2 prerequisites
  grep: support regex patterns containing \0 via PCRE v2
  grep: use PCRE v2 under the hood for -G & -E for amazing speedup

 Makefile                                           |    53 +
 compat/pcre2/get-pcre2.sh                          |    68 +
 compat/pcre2/src/pcre2.h                           |   832 ++
 compat/pcre2/src/pcre2_auto_possess.c              |  1291 ++
 compat/pcre2/src/pcre2_chartables.c                |     1 +
 compat/pcre2/src/pcre2_chartables.c.dist           |   198 +
 compat/pcre2/src/pcre2_compile.c                   |  9626 +++++++++++++++
 compat/pcre2/src/pcre2_config.c                    |   222 +
 compat/pcre2/src/pcre2_context.c                   |   450 +
 compat/pcre2/src/pcre2_convert.c                   |   724 ++
 compat/pcre2/src/pcre2_error.c                     |   327 +
 compat/pcre2/src/pcre2_find_bracket.c              |   218 +
 compat/pcre2/src/pcre2_internal.h                  |  1967 +++
 compat/pcre2/src/pcre2_intmodedep.h                |   884 ++
 compat/pcre2/src/pcre2_jit_compile.c               | 12307 +++++++++++++++++++
 compat/pcre2/src/pcre2_jit_match.c                 |   189 +
 compat/pcre2/src/pcre2_jit_misc.c                  |   227 +
 compat/pcre2/src/pcre2_maketables.c                |   157 +
 compat/pcre2/src/pcre2_match.c                     |  6826 ++++++++++
 compat/pcre2/src/pcre2_match_data.c                |   147 +
 compat/pcre2/src/pcre2_newline.c                   |   243 +
 compat/pcre2/src/pcre2_ord2utf.c                   |   120 +
 compat/pcre2/src/pcre2_string_utils.c              |   201 +
 compat/pcre2/src/pcre2_study.c                     |  1624 +++
 compat/pcre2/src/pcre2_tables.c                    |   765 ++
 compat/pcre2/src/pcre2_ucd.c                       |  3761 ++++++
 compat/pcre2/src/pcre2_ucp.h                       |   268 +
 compat/pcre2/src/pcre2_valid_utf.c                 |   398 +
 compat/pcre2/src/pcre2_xclass.c                    |   271 +
 compat/pcre2/src/sljit/sljitConfig.h               |   145 +
 compat/pcre2/src/sljit/sljitConfigInternal.h       |   725 ++
 compat/pcre2/src/sljit/sljitExecAllocator.c        |   312 +
 compat/pcre2/src/sljit/sljitLir.c                  |  2224 ++++
 compat/pcre2/src/sljit/sljitLir.h                  |  1392 +++
 compat/pcre2/src/sljit/sljitNativeARM_32.c         |  2326 ++++
 compat/pcre2/src/sljit/sljitNativeARM_64.c         |  2104 ++++
 compat/pcre2/src/sljit/sljitNativeARM_T2_32.c      |  1987 +++
 compat/pcre2/src/sljit/sljitNativeMIPS_32.c        |   437 +
 compat/pcre2/src/sljit/sljitNativeMIPS_64.c        |   539 +
 compat/pcre2/src/sljit/sljitNativeMIPS_common.c    |  2110 ++++
 compat/pcre2/src/sljit/sljitNativePPC_32.c         |   276 +
 compat/pcre2/src/sljit/sljitNativePPC_64.c         |   447 +
 compat/pcre2/src/sljit/sljitNativePPC_common.c     |  2421 ++++
 compat/pcre2/src/sljit/sljitNativeSPARC_32.c       |   165 +
 compat/pcre2/src/sljit/sljitNativeSPARC_common.c   |  1471 +++
 compat/pcre2/src/sljit/sljitNativeTILEGX-encoder.c | 10159 +++++++++++++++
 compat/pcre2/src/sljit/sljitNativeTILEGX_64.c      |  2555 ++++
 compat/pcre2/src/sljit/sljitNativeX86_32.c         |   602 +
 compat/pcre2/src/sljit/sljitNativeX86_64.c         |   742 ++
 compat/pcre2/src/sljit/sljitNativeX86_common.c     |  2921 +++++
 compat/pcre2/src/sljit/sljitProtExecAllocator.c    |   421 +
 compat/pcre2/src/sljit/sljitUtils.c                |   334 +
 grep.c                                             |    73 +-
 grep.h                                             |     5 +
 t/README                                           |    18 +
 t/t7008-grep-binary.sh                             |    87 +-
 t/test-lib.sh                                      |     3 +
 57 files changed, 81335 insertions(+), 31 deletions(-)
 create mode 100755 compat/pcre2/get-pcre2.sh
 create mode 100644 compat/pcre2/src/pcre2.h
 create mode 100644 compat/pcre2/src/pcre2_auto_possess.c
 create mode 120000 compat/pcre2/src/pcre2_chartables.c
 create mode 100644 compat/pcre2/src/pcre2_chartables.c.dist
 create mode 100644 compat/pcre2/src/pcre2_compile.c
 create mode 100644 compat/pcre2/src/pcre2_config.c
 create mode 100644 compat/pcre2/src/pcre2_context.c
 create mode 100644 compat/pcre2/src/pcre2_convert.c
 create mode 100644 compat/pcre2/src/pcre2_error.c
 create mode 100644 compat/pcre2/src/pcre2_find_bracket.c
 create mode 100644 compat/pcre2/src/pcre2_internal.h
 create mode 100644 compat/pcre2/src/pcre2_intmodedep.h
 create mode 100644 compat/pcre2/src/pcre2_jit_compile.c
 create mode 100644 compat/pcre2/src/pcre2_jit_match.c
 create mode 100644 compat/pcre2/src/pcre2_jit_misc.c
 create mode 100644 compat/pcre2/src/pcre2_maketables.c
 create mode 100644 compat/pcre2/src/pcre2_match.c
 create mode 100644 compat/pcre2/src/pcre2_match_data.c
 create mode 100644 compat/pcre2/src/pcre2_newline.c
 create mode 100644 compat/pcre2/src/pcre2_ord2utf.c
 create mode 100644 compat/pcre2/src/pcre2_string_utils.c
 create mode 100644 compat/pcre2/src/pcre2_study.c
 create mode 100644 compat/pcre2/src/pcre2_tables.c
 create mode 100644 compat/pcre2/src/pcre2_ucd.c
 create mode 100644 compat/pcre2/src/pcre2_ucp.h
 create mode 100644 compat/pcre2/src/pcre2_valid_utf.c
 create mode 100644 compat/pcre2/src/pcre2_xclass.c
 create mode 100644 compat/pcre2/src/sljit/sljitConfig.h
 create mode 100644 compat/pcre2/src/sljit/sljitConfigInternal.h
 create mode 100644 compat/pcre2/src/sljit/sljitExecAllocator.c
 create mode 100644 compat/pcre2/src/sljit/sljitLir.c
 create mode 100644 compat/pcre2/src/sljit/sljitLir.h
 create mode 100644 compat/pcre2/src/sljit/sljitNativeARM_32.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeARM_64.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeARM_T2_32.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeMIPS_32.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeMIPS_64.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeMIPS_common.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativePPC_32.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativePPC_64.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativePPC_common.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeSPARC_32.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeSPARC_common.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeTILEGX-encoder.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeTILEGX_64.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeX86_32.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeX86_64.c
 create mode 100644 compat/pcre2/src/sljit/sljitNativeX86_common.c
 create mode 100644 compat/pcre2/src/sljit/sljitProtExecAllocator.c
 create mode 100644 compat/pcre2/src/sljit/sljitUtils.c

-- 
2.11.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-05-11 17:52 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-11 17:51 [PATCH/RFC 0/6] Speed up git-grep by using PCRE v2 as a backend Ævar Arnfjörð Bjarmason
2017-05-11 17:51 ` [PATCH/RFC 1/6] Makefile & compat/pcre2: add ability to build an embedded PCRE Ævar Arnfjörð Bjarmason
2017-05-11 17:51 ` [PATCH/RFC 2/6] Makefile & compat/pcre2: add dependency on pcre2_convert.c Ævar Arnfjörð Bjarmason
2017-05-11 17:51 ` [PATCH/RFC 4/6] test-lib: add LIBPCRE1 & LIBPCRE2 prerequisites Ævar Arnfjörð Bjarmason
2017-05-11 17:51 ` [PATCH/RFC 5/6] grep: support regex patterns containing \0 via PCRE v2 Ævar Arnfjörð Bjarmason
2017-05-11 17:51 ` [PATCH/RFC 6/6] grep: use PCRE v2 under the hood for -G & -E for amazing speedup Ævar Arnfjörð Bjarmason

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).