From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH 0/2] perf: show that wildmatch() regressed for pathological cases in v2.0
Date: Mon, 24 Apr 2017 21:12:47 +0000 [thread overview]
Message-ID: <20170424211249.28553-1-avarab@gmail.com> (raw)
Russ Cox just published an article about how various glob()
implementations suffer from pathological performance when fed certain
pathological patterns like "a*a*a*a*b" given a file like "aaaaaaa...":
https://research.swtch.com/glob
I was curious to see if this impacted git. It turns out it does. This
used to be a per-platform issue with git, since globbing was
implemented via fnmatch() by default before v1.8.4, and support for
using the OS fnmatch() was removed entirely in v2.0.0.
This performance test shows the regression:
$ GIT_PERF_REPEAT_COUNT=1 GIT_PERF_MAKE_OPTS="[...]NO_WILDMATCH=YesPlease[...]" ./run v1.9.5 v2.0.0 v2.12.0 p0100-globbing.sh
[...]
Test v1.9.5 v2.0.0 v2.12.0
------------------------------------------------------------------------------------------------------------------------------
[...]
0100.7: fileglob((a*)^nb) against file (a^100).t; n = 1 0.01(0.00+0.00) 0.00(0.00+0.00) -100.0% 0.01(0.00+0.00) +0.0%
0100.8: fileglob((a*)^nb) against file (a^100).t; n = 2 0.01(0.00+0.00) 0.00(0.00+0.00) -100.0% 0.01(0.00+0.00) +0.0%
0100.9: fileglob((a*)^nb) against file (a^100).t; n = 3 0.00(0.00+0.00) 0.00(0.00+0.00) = 0.01(0.00+0.00) +inf
0100.10: fileglob((a*)^nb) against file (a^100).t; n = 4 0.00(0.00+0.00) 0.01(0.01+0.00) +inf 0.02(0.02+0.00) +inf
0100.11: fileglob((a*)^nb) against file (a^100).t; n = 5 0.00(0.00+0.00) 0.20(0.19+0.00) +inf 0.24(0.23+0.00) +inf
0100.12: fileglob((a*)^nb) against file (a^100).t; n = 6 0.00(0.00+0.00) 3.03(3.00+0.00) +inf 3.08(3.05+0.00) +inf
And here's a one-liner to do the same:
$ time (rm -rf test; git init -q test && (cd test && touch $(perl -e 'print "a" x 100').t && git add a* && git commit -q -m"test" && git ls-files 'a*a*a*a*a*a*a*b'))
Add or remove "a*"'s to adjust the runtime. With 6 that executes in 3
seconds on my system, 40 seconds with 7 etc.
I don't think this is something we need to worry much about, if you
have a file like this and feed Git insane patterns you probably
deserve what you get.
The real concern is if we have behavior like this and ever e.g. expose
globbing over the network, e.g. in some future upload-pack protocol.
There are probably some web-based programs built on top of git that
are vulnerable to DoS attacks as a result of this, e.g. if they take
user-supplied globs and feed them to ls-files.
Ævar Arnfjörð Bjarmason (2):
perf: add function to setup a fresh test repo
perf: add test showing exponential growth in path globbing
t/perf/README | 1 +
t/perf/p0100-globbing.sh | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
t/perf/perf-lib.sh | 17 +++++++++++++----
3 files changed, 62 insertions(+), 4 deletions(-)
create mode 100755 t/perf/p0100-globbing.sh
--
2.11.0
next reply other threads:[~2017-04-24 21:13 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-24 21:12 Ævar Arnfjörð Bjarmason [this message]
2017-04-24 21:12 ` [PATCH 1/2] perf: add function to setup a fresh test repo Ævar Arnfjörð Bjarmason
2017-04-24 21:12 ` [PATCH 2/2] perf: add test showing exponential growth in path globbing Ævar Arnfjörð Bjarmason
2017-04-24 21:34 ` [PATCH 0/2] perf: show that wildmatch() regressed for pathological cases in v2.0 Brandon Williams
2017-04-24 22:45 ` Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170424211249.28553-1-avarab@gmail.com \
--to=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).