git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] perf: show that wildmatch() regressed for pathological cases in v2.0
@ 2017-04-24 21:12 Ævar Arnfjörð Bjarmason
  2017-04-24 21:12 ` [PATCH 1/2] perf: add function to setup a fresh test repo Ævar Arnfjörð Bjarmason
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-04-24 21:12 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy,
	Ævar Arnfjörð Bjarmason

Russ Cox just published an article about how various glob()
implementations suffer from pathological performance when fed certain
pathological patterns like "a*a*a*a*b" given a file like "aaaaaaa...":
https://research.swtch.com/glob

I was curious to see if this impacted git. It turns out it does. This
used to be a per-platform issue with git, since globbing was
implemented via fnmatch() by default before v1.8.4, and support for
using the OS fnmatch() was removed entirely in v2.0.0.

This performance test shows the regression:

    $ GIT_PERF_REPEAT_COUNT=1 GIT_PERF_MAKE_OPTS="[...]NO_WILDMATCH=YesPlease[...]" ./run v1.9.5 v2.0.0 v2.12.0 p0100-globbing.sh
    [...]
    Test                                                       v1.9.5            v2.0.0                    v2.12.0
    ------------------------------------------------------------------------------------------------------------------------------
    [...]
    0100.7: fileglob((a*)^nb) against file (a^100).t; n = 1    0.01(0.00+0.00)   0.00(0.00+0.00) -100.0%   0.01(0.00+0.00) +0.0%
    0100.8: fileglob((a*)^nb) against file (a^100).t; n = 2    0.01(0.00+0.00)   0.00(0.00+0.00) -100.0%   0.01(0.00+0.00) +0.0%
    0100.9: fileglob((a*)^nb) against file (a^100).t; n = 3    0.00(0.00+0.00)   0.00(0.00+0.00) =         0.01(0.00+0.00) +inf
    0100.10: fileglob((a*)^nb) against file (a^100).t; n = 4   0.00(0.00+0.00)   0.01(0.01+0.00) +inf      0.02(0.02+0.00) +inf
    0100.11: fileglob((a*)^nb) against file (a^100).t; n = 5   0.00(0.00+0.00)   0.20(0.19+0.00) +inf      0.24(0.23+0.00) +inf
    0100.12: fileglob((a*)^nb) against file (a^100).t; n = 6   0.00(0.00+0.00)   3.03(3.00+0.00) +inf      3.08(3.05+0.00) +inf

And here's a one-liner to do the same:

    $ time (rm -rf test; git init -q test && (cd test && touch $(perl -e 'print "a" x 100').t && git add a* && git commit -q -m"test" && git ls-files 'a*a*a*a*a*a*a*b'))

Add or remove "a*"'s to adjust the runtime. With 6 that executes in 3
seconds on my system, 40 seconds with 7 etc.

I don't think this is something we need to worry much about, if you
have a file like this and feed Git insane patterns you probably
deserve what you get.

The real concern is if we have behavior like this and ever e.g. expose
globbing over the network, e.g. in some future upload-pack protocol.

There are probably some web-based programs built on top of git that
are vulnerable to DoS attacks as a result of this, e.g. if they take
user-supplied globs and feed them to ls-files.

Ævar Arnfjörð Bjarmason (2):
  perf: add function to setup a fresh test repo
  perf: add test showing exponential growth in path globbing

 t/perf/README            |  1 +
 t/perf/p0100-globbing.sh | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 t/perf/perf-lib.sh       | 17 +++++++++++++----
 3 files changed, 62 insertions(+), 4 deletions(-)
 create mode 100755 t/perf/p0100-globbing.sh

-- 
2.11.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-04-24 22:45 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-04-24 21:12 [PATCH 0/2] perf: show that wildmatch() regressed for pathological cases in v2.0 Ævar Arnfjörð Bjarmason
2017-04-24 21:12 ` [PATCH 1/2] perf: add function to setup a fresh test repo Ævar Arnfjörð Bjarmason
2017-04-24 21:12 ` [PATCH 2/2] perf: add test showing exponential growth in path globbing Ævar Arnfjörð Bjarmason
2017-04-24 21:34 ` [PATCH 0/2] perf: show that wildmatch() regressed for pathological cases in v2.0 Brandon Williams
2017-04-24 22:45   ` Ævar Arnfjörð Bjarmason

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).