* [PATCH 0/5] Speed up string search routines
From: Fredrik Kuivinen @ 2010-02-13 14:20 UTC
To: git; +Cc: Junio C Hamano
This series speeds up git grep and pickaxe by using the string search
routines from GNU grep.
---
Fredrik Kuivinen (5):
      Use kwset in grep
      Use kwset in pickaxe
      Adapt the kwset code to Git
      Add string search routines from GNU grep
      Add obstack.[ch] from EGLIBC 2.10

 Makefile           |    2
 diffcore-pickaxe.c |   34 ++
 grep.c             |   61 +++-
 grep.h             |    2
 kwset.c            |  775 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 kwset.h            |   62 ++++
 obstack.c          |  441 ++++++++++++++++++++++++++++++
 obstack.h          |  509 ++++++++++++++++++++++++++++++++++
 8 files changed, 1855 insertions(+), 31 deletions(-)
 create mode 100644 kwset.c
 create mode 100644 kwset.h
 create mode 100644 obstack.c
 create mode 100644 obstack.h
* Re: [PATCH 0/5] Speed up string search routines
From: Junio C Hamano @ 2010-02-13 18:52 UTC
To: Fredrik Kuivinen; +Cc: git
Fredrik Kuivinen <frekui@gmail.com> writes:
> This series speeds up git grep and pickaxe by using the string search
> routines from GNU grep.
Thanks.
It needs to be a bit more friendly to readers of "git log" and
ReleaseNotes by hinting why use of kwset is beneficial (e.g. "use kwset
instead of memmem to find fixed string more efficiently") in the commit
titles.
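To illustrate what "more efficiently" can mean for a fixed-string search, here is a minimal sketch of a Boyer-Moore-Horspool search in C. It is a simplified stand-in and not the kwset code from this series; the names bmh_search and count_occurrences are hypothetical. The idea is that a precomputed skip table lets the search jump ahead by up to the needle length after a mismatch instead of advancing one byte at a time.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Illustrative only: Boyer-Moore-Horspool, NOT the kwset implementation.
 * Returns a pointer to the first occurrence of needle in hay, or NULL.
 */
static const char *bmh_search(const char *hay, size_t hay_len,
                              const char *needle, size_t needle_len)
{
    size_t skip[UCHAR_MAX + 1];
    size_t i, pos;

    assert(needle_len > 0);
    if (hay_len < needle_len)
        return NULL;

    /* A byte that does not occur in the needle allows a full-length jump. */
    for (i = 0; i <= UCHAR_MAX; i++)
        skip[i] = needle_len;
    /* Bytes in the needle (all but its last) allow a shorter, safe jump. */
    for (i = 0; i + 1 < needle_len; i++)
        skip[(unsigned char)needle[i]] = needle_len - 1 - i;

    pos = 0;
    while (pos <= hay_len - needle_len) {
        /* Compare the current window right to left. */
        i = needle_len;
        while (i > 0 && hay[pos + i - 1] == needle[i - 1])
            i--;
        if (i == 0)
            return hay + pos;
        /* Shift by the skip value of the window's last byte. */
        pos += skip[(unsigned char)hay[pos + needle_len - 1]];
    }
    return NULL;
}

/* Count non-overlapping occurrences of needle in hay. */
static unsigned count_occurrences(const char *hay, size_t hay_len,
                                  const char *needle, size_t needle_len)
{
    const char *end = hay + hay_len;
    const char *p;
    unsigned n = 0;

    while ((p = bmh_search(hay, (size_t)(end - hay), needle, needle_len)) != NULL) {
        n++;
        hay = p + needle_len;
    }
    return n;
}
```

For example, count_occurrences("a b  c", 6, " ", 1) counts three single spaces. Note that in this sketch a one-byte needle never skips more than one byte, so the table only pays off for longer needles.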
The preference for using the GPLv2 version was already mentioned by a few
people.
Shouldn't obstack.[ch] be in compat/ so that people on platforms where
they are natively available do not have to compile our own copies?
It is somewhat curious that you gave numbers for only the negative case in
the pickaxe test and only the positive case in the grep test. Does this
conversion have some interesting performance characteristics such as
penalizing the positive-match case to speed up the negative-match case or vice
versa (the earlier "grep lookahead" work had that effect, even though the
downside was really small)?
* Re: [PATCH 0/5] Speed up string search routines
From: Fredrik Kuivinen @ 2010-02-14 16:47 UTC
To: Junio C Hamano; +Cc: git
On Sat, Feb 13, 2010 at 19:52, Junio C Hamano <gitster@pobox.com> wrote:
> Fredrik Kuivinen <frekui@gmail.com> writes:
>
> It needs to be a bit more friendly to readers of "git log" and
> ReleaseNotes by hinting why use of kwset is beneficial (e.g. "use kwset
> instead of memmem to find fixed string more efficiently") in the commit
> titles.
Will fix in the next iteration.
> Shouldn't obstack.[ch] be in compat/ so that people on platforms where
> they are natively available do not have to compile our own copies?
There is code in obstack.c to check if we are using gnu libc or not.
If gnu libc is used, then ELIDE_CODE is defined and no code from
obstack.c is compiled.
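The guard described above can be sketched roughly as follows. This is a simplified illustration, not the exact text of obstack.c: the real condition may involve further checks (such as an interface version comparison), and bundled_obstack_compiled is a hypothetical helper that only reports which branch was taken.

```c
#include <stdio.h> /* any libc header; on GNU libc this defines __GNU_LIBRARY__ */

/* Simplified guard (assumption): the real file's condition may be stricter. */
#if defined(__GNU_LIBRARY__) && __GNU_LIBRARY__ > 1
# define ELIDE_CODE
#endif

#ifndef ELIDE_CODE
/* The bundled implementation would be compiled here. */
static int bundled_obstack_compiled(void)
{
    return 1;
}
#else
/* The C library already provides obstack; the bundled code is left out. */
static int bundled_obstack_compiled(void)
{
    return 0;
}
#endif
```

With a guard like this the same source file can always be listed in the build, and the preprocessor decides whether it contributes any code.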
> It is somewhat curious that you gave numbers for only the negative case in
> the pickaxe test and only the positive case in the grep test. Does this
> conversion have some interesting performance characteristics such as
> penalizing the positive-match case to speed up the negative-match case or vice
> versa (the earlier "grep lookahead" work had that effect, even though the
> downside was really small)?
I did some more benchmarking. In the extreme case when we are looking
for ' ' (i.e., a single space) with pickaxe the new code is actually
slightly slower than the old one.
before:
$ time git log -S' ' > /dev/null
real 0m32.908s
user 0m32.258s
sys 0m0.652s
after:
$ time ./git-log -S' ' > /dev/null
real 0m34.072s
user 0m33.418s
sys 0m0.656s
However, with longer strings the new code wins; it is already faster when
we are searching for two spaces.
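To put "slightly slower" in proportion for the single-space case, the relative change can be computed from the wall-clock ("real") times quoted above. The helper name percent_change is hypothetical and this is only a worked-arithmetic sketch.

```c
#include <assert.h>

/* Relative change of `after` versus `before`, in percent (positive = slower). */
static double percent_change(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

With the numbers above, percent_change(32.908, 34.072) is (34.072 - 32.908) / 32.908 * 100, which comes to roughly 3.5 percent.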
grep gets a significant performance increase for all strings I have
tried; it doesn't matter whether there are no matches or a lot of matches.
Thanks for the comments.
- Fredrik
* Re: [PATCH 0/5] Speed up string search routines
From: Junio C Hamano @ 2010-02-14 20:16 UTC
To: Fredrik Kuivinen; +Cc: git
Fredrik Kuivinen <frekui@gmail.com> writes:
> There is code in obstack.c to check if we are using gnu libc or not.
> If gnu libc is used, then ELIDE_CODE is defined and no code from
> obstack.c is compiled.
Thanks; that explains why there is no need to change Makefile to customize
where obstack is taken from.
But I am not a big fan of keeping borrowed code mixed together in the same
directory with our own code. For something as large as xdiff, giving it
its own directory made sense, but for only two files, I thought compat/
would be a good place, hence my suggestion. kwset would probably want to
go together with them somewhere.