Git development
 help / color / mirror / Atom feed
* git grep bug with --column and --only-matching
@ 2026-04-21  5:03 Brandon Chinn
  2026-04-21 20:33 ` René Scharfe
  2026-04-24 21:04 ` [PATCH] grep: fix --column --only-match for 2nd and later matches René Scharfe
  0 siblings, 2 replies; 5+ messages in thread
From: Brandon Chinn @ 2026-04-21  5:03 UTC (permalink / raw)
  To: git

I'm encountering a bug when using `git grep` with both `--column` and
`--only-matching`, is this a known limitation?

Repro:

```
$ echo 'x   x   x' > repro.txt

$ grep -bo x repro.txt
0:x
4:x
8:x

$ git grep --no-index -o -n --column x repro.txt
repro.txt:1:  1:x
repro.txt:1:  2:x
repro.txt:1:  6:x
```

[System Info]
git version:
git version 2.51.1
cpu: arm64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
feature: fsmonitor--daemon
libcurl: 8.7.1
zlib: 1.2.12
SHA-1: SHA1_DC
SHA-256: SHA256_BLK
default-ref-format: files
default-hash: sha1
uname: Darwin 25.3.0 Darwin Kernel Version 25.3.0: Wed Jan 28 20:56:34
PST 2026; root:xnu-12377.91.3~2/RELEASE_ARM64_T8112 arm64
compiler info: clang: 17.0.0 (clang-1700.0.13.3)
libc info: no libc information available
$SHELL (typically, interactive shell): /bin/zsh

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: git grep bug with --column and --only-matching
  2026-04-21  5:03 git grep bug with --column and --only-matching Brandon Chinn
@ 2026-04-21 20:33 ` René Scharfe
  2026-04-23  9:44   ` Phillip Wood
  2026-04-24 21:04 ` [PATCH] grep: fix --column --only-match for 2nd and later matches René Scharfe
  1 sibling, 1 reply; 5+ messages in thread
From: René Scharfe @ 2026-04-21 20:33 UTC (permalink / raw)
  To: Brandon Chinn; +Cc: git, Taylor Blau, Jeff King, Junio C Hamano

On 4/21/26 7:03 AM, Brandon Chinn wrote:
> I'm encountering a bug when using `git grep` with both `--column` and
> `--only-matching`, is this a known limitation?
> 
> Repro:
> 
> ```
> $ echo 'x   x   x' > repro.txt
> 
> $ grep -bo x repro.txt
> 0:x
> 4:x
> 8:x
> 
> $ git grep --no-index -o -n --column x repro.txt
> repro.txt:1:  1:x
> repro.txt:1:  2:x
> repro.txt:1:  6:x

The first column value matches (1-based for git grep, 0-based for
grep(1)), the remaining ones are different.  grep(1) shows the offset
like for the first match, but git grep adds the relative position of the
end of the previous match.

I don't know how to use the resulting numbers and I agree that it seems
like a bug -- showing the column of each match within its line makes
more sense for an option named --column.

However, the original submission of this feature in
https://lore.kernel.org/git/cover.1529961706.git.me@ttaylorr.com/
gave similar examples in its commit message and its tests, called
them "as one would expect" and was not challenged on that, so I may be
missing something.

Here's its first example with underlined matches, a ruler and the
intended output (without unnecessary headers):

  (`man gitcvs-migration` or `git help cvs-migration` if git is
        ^^^                   ^^^                        ^^^
           1111111111222222222233333333334444444444555555555566  
  123456789 123456789 123456789 123456789 123456789 123456789 1
           
  7:git
  16:git
  38:git

And here's the last line of the test file from t7810 with underlined
matches (the other four lines are very similar), a ruler and the
expected output (unncessary headers removed):

  foo_mmap bar mmap baz
      ^^^^     ^^^^
           111111111122
  123456789 123456789 1

  5:mmap
  13:mmap

If we wanted to show the column of matches 2 and beyond then we could do
something like this:


diff --git a/grep.c b/grep.c
index c7e1dc1e0e..a54e5d86a9 100644
--- a/grep.c
+++ b/grep.c
@@ -1267,6 +1267,7 @@ static void show_line(struct grep_opt *opt,
 		regmatch_t match;
 		enum grep_context ctx = GREP_CONTEXT_BODY;
 		int eflags = 0;
+		const char *start = bol;
 
 		if (want_color(opt->color)) {
 			if (sign == ':')
@@ -1285,6 +1286,7 @@ static void show_line(struct grep_opt *opt,
 			if (match.rm_so == match.rm_eo)
 				break;
 
+			cno = bol - start + match.rm_so + 1;
 			if (opt->only_matching)
 				show_line_header(opt, name, lno, cno, sign);
 			else
@@ -1294,7 +1296,6 @@ static void show_line(struct grep_opt *opt,
 			if (opt->only_matching)
 				opt->output(opt, "\n", 1);
 			bol += match.rm_eo;
-			cno += match.rm_eo;
 			rest -= match.rm_eo;
 			eflags = REG_NOTBOL;
 		}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 64ac4f04ee..bd439563d6 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -322,11 +322,11 @@ do
 		${HC}file:1:5:mmap
 		${HC}file:2:5:mmap
 		${HC}file:3:5:mmap
-		${HC}file:3:13:mmap
+		${HC}file:3:14:mmap
 		${HC}file:4:5:mmap
-		${HC}file:4:13:mmap
+		${HC}file:4:14:mmap
 		${HC}file:5:5:mmap
-		${HC}file:5:13:mmap
+		${HC}file:5:14:mmap
 		EOF
 		git grep --column -n -o -e mmap $H >actual &&
 		test_cmp expected actual


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: git grep bug with --column and --only-matching
  2026-04-21 20:33 ` René Scharfe
@ 2026-04-23  9:44   ` Phillip Wood
  0 siblings, 0 replies; 5+ messages in thread
From: Phillip Wood @ 2026-04-23  9:44 UTC (permalink / raw)
  To: René Scharfe, Brandon Chinn
  Cc: git, Taylor Blau, Jeff King, Junio C Hamano

On 21/04/2026 21:33, René Scharfe wrote:
> On 4/21/26 7:03 AM, Brandon Chinn wrote:
>> I'm encountering a bug when using `git grep` with both `--column` and
>> `--only-matching`, is this a known limitation?
>>
>> Repro:
>>
>> ```
>> $ echo 'x   x   x' > repro.txt
>>
>> $ grep -bo x repro.txt
>> 0:x
>> 4:x
>> 8:x
>>
>> $ git grep --no-index -o -n --column x repro.txt
>> repro.txt:1:  1:x
>> repro.txt:1:  2:x
>> repro.txt:1:  6:x
> 
> The first column value matches (1-based for git grep, 0-based for
> grep(1)), the remaining ones are different.  grep(1) shows the offset
> like for the first match, but git grep adds the relative position of the
> end of the previous match.

I'm having a hard time understanding what git is doing. Looking at your 
example with the ruler

>    (`man gitcvs-migration` or `git help cvs-migration` if git is
>          ^^^                   ^^^                        ^^^
>             1111111111222222222233333333334444444444555555555566
>    123456789 123456789 123456789 123456789 123456789 123456789 1
>             
>    7:git
>    16:git
>    38:git

The first match ends at column 9 and the second match starts at column 
29 so how do we end up with 16 as the "column" of the second match when 
the difference between them is 20?

> I don't know how to use the resulting numbers and I agree that it seems
> like a bug -- showing the column of each match within its line makes
> more sense for an option named --column.

Indeed

> However, the original submission of this feature in
> https://lore.kernel.org/git/cover.1529961706.git.me@ttaylorr.com/
> gave similar examples in its commit message and its tests, called
> them "as one would expect" and was not challenged on that, so I may be
> missing something.

I wonder how hard anyone looked at the examples and tests, the 
discussion seemed to have focused on other things.

Thanks

Phillip

> Here's its first example with underlined matches, a ruler and the
> intended output (without unnecessary headers):
> 
>    (`man gitcvs-migration` or `git help cvs-migration` if git is
>          ^^^                   ^^^                        ^^^
>             1111111111222222222233333333334444444444555555555566
>    123456789 123456789 123456789 123456789 123456789 123456789 1
>             
>    7:git
>    16:git
>    38:git
> 
> And here's the last line of the test file from t7810 with underlined
> matches (the other four lines are very similar), a ruler and the
> expected output (unncessary headers removed):
> 
>    foo_mmap bar mmap baz
>        ^^^^     ^^^^
>             111111111122
>    123456789 123456789 1
> 
>    5:mmap
>    13:mmap
> 
> If we wanted to show the column of matches 2 and beyond then we could do
> something like this:
> 
> 
> diff --git a/grep.c b/grep.c
> index c7e1dc1e0e..a54e5d86a9 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -1267,6 +1267,7 @@ static void show_line(struct grep_opt *opt,
>   		regmatch_t match;
>   		enum grep_context ctx = GREP_CONTEXT_BODY;
>   		int eflags = 0;
> +		const char *start = bol;
>   
>   		if (want_color(opt->color)) {
>   			if (sign == ':')
> @@ -1285,6 +1286,7 @@ static void show_line(struct grep_opt *opt,
>   			if (match.rm_so == match.rm_eo)
>   				break;
>   
> +			cno = bol - start + match.rm_so + 1;
>   			if (opt->only_matching)
>   				show_line_header(opt, name, lno, cno, sign);
>   			else
> @@ -1294,7 +1296,6 @@ static void show_line(struct grep_opt *opt,
>   			if (opt->only_matching)
>   				opt->output(opt, "\n", 1);
>   			bol += match.rm_eo;
> -			cno += match.rm_eo;
>   			rest -= match.rm_eo;
>   			eflags = REG_NOTBOL;
>   		}
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index 64ac4f04ee..bd439563d6 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -322,11 +322,11 @@ do
>   		${HC}file:1:5:mmap
>   		${HC}file:2:5:mmap
>   		${HC}file:3:5:mmap
> -		${HC}file:3:13:mmap
> +		${HC}file:3:14:mmap
>   		${HC}file:4:5:mmap
> -		${HC}file:4:13:mmap
> +		${HC}file:4:14:mmap
>   		${HC}file:5:5:mmap
> -		${HC}file:5:13:mmap
> +		${HC}file:5:14:mmap
>   		EOF
>   		git grep --column -n -o -e mmap $H >actual &&
>   		test_cmp expected actual
> 
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] grep: fix --column --only-match for 2nd and later matches
  2026-04-21  5:03 git grep bug with --column and --only-matching Brandon Chinn
  2026-04-21 20:33 ` René Scharfe
@ 2026-04-24 21:04 ` René Scharfe
  2026-05-04 13:10   ` Phillip Wood
  1 sibling, 1 reply; 5+ messages in thread
From: René Scharfe @ 2026-04-24 21:04 UTC (permalink / raw)
  To: Brandon Chinn, git

"git grep --column --only-match" shows the 1-based column number of the
first match on each line, but confusing numbers for further matches.
Example:

   $ echo 123456789012345678901234567890 >file
   $ for d in 1 2 3 4 5 6 7 8 9 0
     do
       git grep --no-index --column --only-matching $d file |
       awk -v FS=: -v l=$d: '{l = l sprintf("%3s", $2)} END {print l}'
     done
   1:  1  2 12
   2:  2  4 14
   3:  3  6 16
   4:  4  8 18
   5:  5 10 20
   6:  6 12 22
   7:  7 14 24
   8:  8 16 26
   9:  9 18 28
   0: 10 20 30

Report the column number of each match instead:

   $ for d in 1 2 3 4 5 6 7 8 9 0
     do
       ./git grep --no-index --column --only-matching $d file |
       awk -v FS=: -v l=$d: '{l = l sprintf("%3s", $2)} END {print l}'
     done
   1:  1 11 21
   2:  2 12 22
   3:  3 13 23
   4:  4 14 24
   5:  5 15 25
   6:  6 16 26
   7:  7 17 27
   8:  8 18 28
   9:  9 19 29
   0: 10 20 30

We need to adjust the test in t7810 as well.  The file it uses has the
following five lines; I add a line highlighting the matches and a ruler
at the bottom here, to make it easier to see that the second "mmap"
indeed starts at column 14:

foo mmap bar
foo_mmap bar
foo_mmap bar mmap
foo mmap bar_mmap
foo_mmap bar mmap baz
    ====     ====
123456789 123456789 1

Reported-by: Brandon Chinn <brandonchinn178@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 grep.c          | 3 ++-
 t/t7810-grep.sh | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/grep.c b/grep.c
index c7e1dc1e0e..a54e5d86a9 100644
--- a/grep.c
+++ b/grep.c
@@ -1267,6 +1267,7 @@ static void show_line(struct grep_opt *opt,
 		regmatch_t match;
 		enum grep_context ctx = GREP_CONTEXT_BODY;
 		int eflags = 0;
+		const char *start = bol;
 
 		if (want_color(opt->color)) {
 			if (sign == ':')
@@ -1285,6 +1286,7 @@ static void show_line(struct grep_opt *opt,
 			if (match.rm_so == match.rm_eo)
 				break;
 
+			cno = bol - start + match.rm_so + 1;
 			if (opt->only_matching)
 				show_line_header(opt, name, lno, cno, sign);
 			else
@@ -1294,7 +1296,6 @@ static void show_line(struct grep_opt *opt,
 			if (opt->only_matching)
 				opt->output(opt, "\n", 1);
 			bol += match.rm_eo;
-			cno += match.rm_eo;
 			rest -= match.rm_eo;
 			eflags = REG_NOTBOL;
 		}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 64ac4f04ee..bd439563d6 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -322,11 +322,11 @@ do
 		${HC}file:1:5:mmap
 		${HC}file:2:5:mmap
 		${HC}file:3:5:mmap
-		${HC}file:3:13:mmap
+		${HC}file:3:14:mmap
 		${HC}file:4:5:mmap
-		${HC}file:4:13:mmap
+		${HC}file:4:14:mmap
 		${HC}file:5:5:mmap
-		${HC}file:5:13:mmap
+		${HC}file:5:14:mmap
 		EOF
 		git grep --column -n -o -e mmap $H >actual &&
 		test_cmp expected actual
-- 
2.54.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] grep: fix --column --only-match for 2nd and later matches
  2026-04-24 21:04 ` [PATCH] grep: fix --column --only-match for 2nd and later matches René Scharfe
@ 2026-05-04 13:10   ` Phillip Wood
  0 siblings, 0 replies; 5+ messages in thread
From: Phillip Wood @ 2026-05-04 13:10 UTC (permalink / raw)
  To: René Scharfe, Brandon Chinn, git

Hi René

On 24/04/2026 22:04, René Scharfe wrote:
> diff --git a/grep.c b/grep.c
> index c7e1dc1e0e..a54e5d86a9 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -1267,6 +1267,7 @@ static void show_line(struct grep_opt *opt,
>   		regmatch_t match;
>   		enum grep_context ctx = GREP_CONTEXT_BODY;
>   		int eflags = 0;
> +		const char *start = bol;

Here we save a pointer to the start of the line

>   		if (want_color(opt->color)) {
>   			if (sign == ':')
> @@ -1285,6 +1286,7 @@ static void show_line(struct grep_opt *opt,
>   			if (match.rm_so == match.rm_eo)
>   				break;
>   
> +			cno = bol - start + match.rm_so + 1;

and then we calculate the column number relative to that.

That looks good, thanks for fixing it

Phillip

>   			if (opt->only_matching)
>   				show_line_header(opt, name, lno, cno, sign);
>   			else
> @@ -1294,7 +1296,6 @@ static void show_line(struct grep_opt *opt,
>   			if (opt->only_matching)
>   				opt->output(opt, "\n", 1);
>   			bol += match.rm_eo;
> -			cno += match.rm_eo;
>   			rest -= match.rm_eo;
>   			eflags = REG_NOTBOL;
>   		}
> diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
> index 64ac4f04ee..bd439563d6 100755
> --- a/t/t7810-grep.sh
> +++ b/t/t7810-grep.sh
> @@ -322,11 +322,11 @@ do
>   		${HC}file:1:5:mmap
>   		${HC}file:2:5:mmap
>   		${HC}file:3:5:mmap
> -		${HC}file:3:13:mmap
> +		${HC}file:3:14:mmap
>   		${HC}file:4:5:mmap
> -		${HC}file:4:13:mmap
> +		${HC}file:4:14:mmap
>   		${HC}file:5:5:mmap
> -		${HC}file:5:13:mmap
> +		${HC}file:5:14:mmap
>   		EOF
>   		git grep --column -n -o -e mmap $H >actual &&
>   		test_cmp expected actual


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-05-04 13:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-21  5:03 git grep bug with --column and --only-matching Brandon Chinn
2026-04-21 20:33 ` René Scharfe
2026-04-23  9:44   ` Phillip Wood
2026-04-24 21:04 ` [PATCH] grep: fix --column --only-match for 2nd and later matches René Scharfe
2026-05-04 13:10   ` Phillip Wood

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox