git.vger.kernel.org archive mirror
* regex compilation error with --color-words
@ 2023-03-29 22:55 Eric Sunshine
  2023-03-30  7:55 ` Diomidis Spinellis
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Sunshine @ 2023-03-29 22:55 UTC
  To: Git List, Diomidis Spinellis
  Cc: René Scharfe, Junio C Hamano, demerphq, Mario Grgic

I'm encountering a failure on macOS High Sierra 10.13.6 when using
--color-words:

    % git show --color-words HEAD
    fatal: invalid regular expression:
    [[:alpha:]_'][[:alnum:]_']*|0[xb]?[0-9a-fA-F_]*|[0-9a-fA-F_]+(\.[0-9a-fA-F_]+)?([eE][-+]?[0-9_]+)?|=>|-[rwxoRWXOezsfdlpSugkbctTBMAC>]|~~|::|&&=|\|\|=|//=|\*\*=|&&|\|\||//|\+\+|--|\*\*|\.\.\.?|[-+*/%.^&<>=!|]=|=~|!~|<<|<>|<=>|>>|[^[:space:]]|[<C0>-<FF>][<80>-<BF>]+

This failure occurs when viewing the commit I sent to Peff today[1],
though it doesn't occur with every commit. The problem bisects to:

    Author: Diomidis Spinellis <dds@aueb.gr>
    Date:   Fri Aug 26 11:58:15 2022 +0300

    grep: fix multibyte regex handling under macOS

    The commit 29de20504e (Makefile: fix default regex settings on
    Darwin, 2013-05-11) fixed t0070-fundamental.sh under Darwin (macOS) by
    adopting Git's regex library.  However, this library is compiled with
    NO_MBSUPPORT, which causes git-grep to work incorrectly on multibyte
    (e.g. UTF-8) files.  Current macOS versions pass t0070-fundamental.sh
    with the native macOS regex library, which also supports multibyte
    characters.

    Adjust the Makefile to use the native regex library, and call
    setlocale(3) to set CTYPE according to the user's preference.
    The setlocale call is required on all platforms, but in platforms
    supporting gettext(3), setlocale was called as a side-effect of
    initializing gettext.  Therefore, move the CTYPE setlocale call from
    gettext.c to common-main.c and the corresponding locale.h include
    into git-compat-util.h.

    Thanks to the global initialization of CTYPE setlocale, the test-tool
    regex command now works correctly with supported multibyte regexes, and
    is used to set the MB_REGEX test prerequisite by assessing a platform's
    support for them.

    Signed-off-by: Diomidis Spinellis <dds@aueb.gr>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>

I see that this same commit is also the subject of another bug report
currently being discussed[2], so I've Cc:'d the participants of that
thread, as well.

Any pointers aimed at getting this resolved would be appreciated.

[1]: https://lore.kernel.org/git/CAPig+cQiOGrDSUc34jHEBp87Rx-dnXNcPcF76bu0SJoOzD+1hw@mail.gmail.com/
[2]: https://lore.kernel.org/git/MW4PR20MB5517583CBEEF34B1E87CCF1290859@MW4PR20MB5517.namprd20.prod.outlook.com/


Thread overview: 13+ messages
2023-03-29 22:55 regex compilation error with --color-words Eric Sunshine
2023-03-30  7:55 ` Diomidis Spinellis
2023-03-31 20:44   ` René Scharfe
2023-04-02  9:44     ` René Scharfe
2023-04-03 16:29       ` Junio C Hamano
2023-04-03 19:32         ` René Scharfe
2023-04-06 20:19           ` [PATCH] userdiff: support regexec(3) with multi-byte support René Scharfe
2023-04-06 22:35             ` Johannes Sixt
2023-04-07  7:49               ` René Scharfe
2023-04-07 10:56                 ` Johannes Sixt
2023-04-07 14:41             ` D. Ben Knoble
2023-04-07 16:02               ` Junio C Hamano
2023-04-07 17:23             ` Eric Sunshine
