From: Junio C Hamano <gitster@pobox.com>
To: Benjamin Hiller <benhiller@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: git grep performance regression on macOS
Date: Fri, 29 Sep 2023 22:45:58 -0700
Message-ID: <xmqqa5t4gqnd.fsf@gitster.g>
In-Reply-To: <CAPWWTaDE5559vA1qa0zhBid_ep9ht+PxPSDS5YC7Dk0NN8sp9A@mail.gmail.com> (Benjamin Hiller's message of "Fri, 29 Sep 2023 16:56:19 -0700")

Benjamin Hiller <benhiller@gmail.com> writes:

> git grep seems to have gotten much slower as of git 2.39 on macOS for
> complex extended regexes.
> I confirmed that the performance regression was first introduced in
> 2.39. Additionally, I saw that reverting the change to Makefile from
> https://github.com/git/git/commit/1819ad327b7a1f19540a819813b70a0e8a7f798f
> fixed the performance regression and the git grep command went back to
> taking <1 second. That seems to indicate that switching from Git's
> regex library to the native macOS regex library caused this
> performance regression, but I haven't investigated beyond that to see
> why the native macOS regex library is so much slower.

Yeah, that does sound like a plausible explanation.

The regexp code in compat/ is meant as a fallback implementation for
platforms whose regexp library lacks certain features we take
advantage of, but it has a limitation that it is not Unicode aware.

In the olden days, the regexp library on macOS lacked the REG_STARTEND
feature, which forced us to use NO_REGEX (hence the fallback
implementation we ship that is not Unicode aware). The commit you
cite makes us use the macOS native regexp library, as somebody on
the platform got annoyed enough by the lack of Unicode awareness of
the fallback implementation, and also noticed that REG_STARTEND is
supported by the macOS native regexp library these days.

The change in 2.39 was unfortunately about correctness. It would
have been nicer if the macOS native implementation were faster, but
using the fallback implementation would favor "performance" (which
produces incorrect results "faster" when run with multi-byte
strings) over correctness, so a straight revert of the commit is
unlikely to be a good idea.