* git grep -E doesn't accept \b word boundaries?
@ 2023-05-03 19:04 Kevin Ushey
2023-05-03 19:35 ` Junio C Hamano
0 siblings, 1 reply; 4+ messages in thread
From: Kevin Ushey @ 2023-05-03 19:04 UTC (permalink / raw)
To: git
Hello,
I'm seeing the following, which I believe is unexpected. I have a file
with contents:
$ cat hello.txt
WholeWord
Whole Word
Whole
I can use `git grep` to search with word boundaries; e.g.
$ git grep --untracked '\bWhole\b'
hello.txt:Whole Word
hello.txt:Whole
However, if I add `-E` to use extended regular expressions, the same
invocation finds no search results.
$ git grep --untracked -E '\bWhole\b'
This does seem to work as expected with the '-w' flag, e.g.
$ git grep --untracked -E -w 'Whole'
hello.txt:Whole Word
hello.txt:Whole
as well as with the BSD-style `[[:<:]]` / `[[:>:]]` word boundaries, e.g.
$ git grep --untracked -E '[[:<:]]Whole[[:>:]]'
hello.txt:Whole Word
hello.txt:Whole
Is this a bug, or am I misunderstanding some behavior in `git grep`?
For posterity:
$ git grep --untracked -G '\bWhole\b'
hello.txt:Whole Word
hello.txt:Whole
$ git grep --untracked -E '\bWhole\b'
$ git grep --untracked -P '\bWhole\b'
hello.txt:Whole Word
hello.txt:Whole
For what it's worth, I don't see this issue with an older version of
`git` on an Ubuntu 22.04 VM:
root@96722b73f316:~/test# git --version
git version 2.34.1
root@96722b73f316:~/test# git grep --untracked -E '\bWhole\b'
hello.txt:Whole Word
hello.txt:Whole
Thanks,
Kevin
------
[System Info]
git version:
git version 2.40.1
cpu: arm64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
feature: fsmonitor--daemon
uname: Darwin 22.4.0 Darwin Kernel Version 22.4.0: Mon Mar 6 20:59:28
PST 2023; root:xnu-8796.101.5~3/RELEASE_ARM64_T6000 arm64
compiler info: clang: 14.0.3 (clang-1403.0.22.14.1)
libc info: no libc information available
$SHELL (typically, interactive shell): /opt/homebrew/bin/bash
* Re: git grep -E doesn't accept \b word boundaries?
From: Junio C Hamano @ 2023-05-03 19:35 UTC (permalink / raw)
To: Kevin Ushey; +Cc: git
Kevin Ushey <kevinushey@gmail.com> writes:
> I'm seeing the following, which I believe is unexpected. I have a file
> with contents:
>
> $ cat hello.txt
> WholeWord
> Whole Word
> Whole
>
> I can use `git grep` to search with word boundaries; e.g.
>
> $ git grep --untracked '\bWhole\b'
> hello.txt:Whole Word
> hello.txt:Whole
>
> However, if I add `-E` to use extended regular expressions, the same
> invocation finds no search results.
>
> $ git grep --untracked -E '\bWhole\b'
Does not seem to reproduce for me. In a randomly picked repository
(the source to git itself), I did
$ cat >hello.txt
WholeWord
Whole Word
Whole
^D
and "git grep --untracked '\bWhole\b' hello.txt", with or without
the "-E" option, shows the same two lines as hits.
Without the pathspec hello.txt, the output includes one line from
unpack-trees.c as well, but the hits from the untracked hello.txt
are the same.
The tip of 'master', v2.40.0, v2.38.4, v2.37.4, v2.35.4 (they are by
no means significant milestones---just some random versions I picked
to test) all behave the same way.
* Re: git grep -E doesn't accept \b word boundaries?
From: Kevin Ushey @ 2023-05-03 20:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Thanks for the quick response! I wonder if this issue could be macOS-specific?
I just tried building git from sources, and I was able to reproduce
the issue with 2.39.3:
$ ./git --version
git version 2.39.3
$ ./git grep -E '\bupdate\b'
But everything works okay for me with 2.38.5:
$ ./git --version
git version 2.38.5
kevin@MBP-P2MQ:~/projects/git [(HEAD detached at v2.38.5)]
$ ./git grep -E '\bupdate\b'
.github/workflows/l10n.yml: sudo apt-get update -q &&
.gitignore:/git-update-index
.gitignore:/git-update-ref
< ... etc ...>
I see this bit in the release notes, which seems potentially related:
https://github.com/git/git/blob/69c786637d7a7fe3b2b8f7d989af095f5f49c3a8/Documentation/RelNotes/2.39.0.txt#L64-L65
And indeed, I can't reproduce the issue if I compile git 2.39.3 with
'make NO_REGEX=1'. So, perhaps a difference between git's compat regex
library and the one provided by macOS?
Thanks,
Kevin
* Re: git grep -E doesn't accept \b word boundaries?
From: Junio C Hamano @ 2023-05-03 20:45 UTC (permalink / raw)
To: Kevin Ushey; +Cc: git
Kevin Ushey <kevinushey@gmail.com> writes:
> Thanks for the quick response! I wonder if this issue could be macOS-specific?
Ah, yes, I somehow thought you mentioned Ubuntu and totally blocked
that macOS issue out of my mind, but I do recall that it has been
reported a few times recently on this list that builds with the
macOS native regexp library are broken.
https://lore.kernel.org/git/?q=macOS+regexp
finds this thread, which unfortunately was mistitled to make it
sound as if it were about "-P", but the issue in the thread was
about extended regexp.
https://lore.kernel.org/git/03fd7ddb-8241-1a0a-3e82-d8083e4ce0f7@web.de/
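Since `\b` is an extension rather than part of POSIX extended regular
expressions, one way to sidestep differences between regex libraries is
to spell the word boundary out with bracket expressions. The sketch
below is only an illustration, not something tested against the
affected macOS build: it assumes a "word" character means an
alphanumeric or an underscore, and it runs the system `grep -E` on the
sample lines from this thread so that it is self-contained. The same
pattern string should also be usable with `git grep -E`, and `-w`
remains the simpler choice when the whole pattern is a single word.

```shell
#!/bin/sh
# Sketch only: emulate '\bWhole\b' with POSIX ERE constructs alone.
# Assumption: word characters are [[:alnum:]] plus underscore.

# The three sample lines from hello.txt in the original report.
sample=$(printf 'WholeWord\nWhole Word\nWhole\n')

# Left boundary:  start of line, or a non-word character.
# Right boundary: a non-word character, or end of line.
pattern='(^|[^[:alnum:]_])Whole([^[:alnum:]_]|$)'

# Uses the system grep here; with git, the equivalent call would be
#   git grep --untracked -E "$pattern"
matches=$(printf '%s\n' "$sample" | grep -E "$pattern")

printf '%s\n' "$matches"
```

Note that, unlike `\b`, this pattern consumes the neighbouring
character, so it matters when using `-o` or when chaining several
bounded words in one expression.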