git.vger.kernel.org archive mirror
* Minor Bug in git cat-file (git 2.50)?
@ 2025-08-10 14:52 Jon Forrest
  2025-08-11  8:54 ` Patrick Steinhardt
  2025-08-11 15:09 ` Junio C Hamano
  0 siblings, 2 replies; 4+ messages in thread
From: Jon Forrest @ 2025-08-10 14:52 UTC (permalink / raw)
  To: git

(Sorry if you see this more than once)

I'm using 'git cat-file' to show the example. This is probably not a
command-specific problem.

The problem is that using a deliberately ambiguous object ID produces
surprising output. This is a minor issue.

% git --version
git version 2.50.GIT
% uname -a
Linux fedora 6.15.9-201.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Aug  2 11:37:34 UTC 2025 x86_64 GNU/Linux

% git init
Initialized empty Git repository in /tmp/.git/

# depending on where you run the test, this might not be necessary
% git config --global --add safe.directory /tmp

% echo a > a.txt
% git add a.txt
% git ls-files -s
100644 78981922613b2afb6025042ff6bd878ac1994e85 0       a.txt	
% git cat-file -t 78981922613b2afb6025042ff6bd878ac1994e85
blob

# All is well so far.

% pushd .git/objects/78
% ls
981922613b2afb6025042ff6bd878ac1994e85
# create a new file with the same name as the file that already exists,
# except change the final letter to something else.
% cp 981922613b2afb6025042ff6bd878ac1994e85 981922613b2afb6025042ff6bd878ac1994e86
% ls
981922613b2afb6025042ff6bd878ac1994e85 
981922613b2afb6025042ff6bd878ac1994e86
% popd
# use an ambiguous SHA1 prefix
# why does the next command produce two identical hints, both of which
# are incorrect?
% git cat-file -t 78981922613b2afb6025042ff6bd878ac1994e8
error: short object ID 78981922613b2afb6025042ff6bd878ac1994e8 is ambiguous
# this is correct
hint: The candidates are:
hint:   7898192 blob
hint:   7898192 blob
fatal: Not a valid object name 78981922613b2afb6025042ff6bd878ac1994e8
# I would have expected:
hint:   78981922613b2afb6025042ff6bd878ac1994e85 blob
hint:   78981922613b2afb6025042ff6bd878ac1994e86 blob
# using the supplied hint doesn't work, which is no surprise
% git cat-file -t 7898192
fatal: Not a valid object name 7898192

Cordially,
Jon Forrest





* Re: Minor Bug in git cat-file (git 2.50)?
  2025-08-10 14:52 Minor Bug in git cat-file (git 2.50)? Jon Forrest
@ 2025-08-11  8:54 ` Patrick Steinhardt
  2025-08-11 19:10   ` Jon Forrest
  2025-08-11 15:09 ` Junio C Hamano
  1 sibling, 1 reply; 4+ messages in thread
From: Patrick Steinhardt @ 2025-08-11  8:54 UTC (permalink / raw)
  To: Jon Forrest; +Cc: git

On Sun, Aug 10, 2025 at 07:52:42AM -0700, Jon Forrest wrote:
> (Sorry if you see this more than once)
> 
> I'm using 'git cat-file' to show the example. This is probably not a
> command-specific problem.
> 
> The problem is that using a deliberately ambiguous object ID produces
> surprising output. This is a minor issue.
> 
> % git --version
> git version 2.50.GIT
> % uname -a
> Linux fedora 6.15.9-201.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Aug  2 11:37:34 UTC 2025 x86_64 GNU/Linux
> 
> % git init
> 
> # depending on where you run the test, might not be necessary
> % git config --global --add safe.directory /tmp
> 
> Initialized empty Git repository in /tmp/.git/
> % echo a > a.txt
> % git add a.txt
> % git ls-files -s
> 100644 78981922613b2afb6025042ff6bd878ac1994e85 0       a.txt	
> % git cat-file -t 78981922613b2afb6025042ff6bd878ac1994e85
> blob
> 
> # All is well so far.
> 
> % pushd .git/objects/78
> % ls
> 981922613b2afb6025042ff6bd878ac1994e85
> # create a new file with the same name as the file that already exists,
> # except change the final letter to something else.
> % cp 981922613b2afb6025042ff6bd878ac1994e85 981922613b2afb6025042ff6bd878ac1994e86
> % ls
> 981922613b2afb6025042ff6bd878ac1994e85
> 981922613b2afb6025042ff6bd878ac1994e86
> % popd
> # use an ambiguous SHA1 prefix
> # why does the next command produce two identical hints, both of which
> # are incorrect?
> % git cat-file -t 78981922613b2afb6025042ff6bd878ac1994e8
> error: short object ID 78981922613b2afb6025042ff6bd878ac1994e8 is ambiguous
> # this is correct
> hint: The candidates are:
> hint:   7898192 blob
> hint:   7898192 blob
> fatal: Not a valid object name 78981922613b2afb6025042ff6bd878ac1994e8
> # I would have expected:
> hint:   78981922613b2afb6025042ff6bd878ac1994e85 blob
> hint:   78981922613b2afb6025042ff6bd878ac1994e86 blob
> # using the supplied hint doesn't work, which is no surprise
> % git cat-file -t 7898192
> fatal: Not a valid object name 7898192

Hm. I think the problem here is that you intentionally corrupted the
repository by copying the blob to a different name. Since the object
contents stay the same, and the object ID is computed by hashing the
object, looking up that object ultimately leads back to the original
object ID.

The consequence is that `show_ambiguous_object()` becomes confused. It
_looks_ like the object name is ambiguous, but it ultimately isn't
because both names refer to the same underlying object. We then use
`repo_find_unique_abbrev()` to shorten the object IDs printed in the
error message, but given that those are really the same object we
abbreviate them to the same shortened object ID.

I'm not really sure that this is something that we need to fix -- the
repository is corrupt, and git-fsck(1) should tell you so. Did you hit
any real-world scenario where this has happened in the wild without
intentionally corrupting the repository? Or given that you explicitly
mention Git 2.50, has the behaviour changed recently?

Thanks!

Patrick


* Re: Minor Bug in git cat-file (git 2.50)?
  2025-08-10 14:52 Minor Bug in git cat-file (git 2.50)? Jon Forrest
  2025-08-11  8:54 ` Patrick Steinhardt
@ 2025-08-11 15:09 ` Junio C Hamano
  1 sibling, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2025-08-11 15:09 UTC (permalink / raw)
  To: Jon Forrest; +Cc: git

Jon Forrest <nobozo@gmail.com> writes:

> % ls
> 981922613b2afb6025042ff6bd878ac1994e85
> 981922613b2afb6025042ff6bd878ac1994e86
> % popd
> # use an ambiguous SHA1 prefix
> # why does the next command produce two identical hints, both of which
> # are incorrect?
> % git cat-file -t 78981922613b2afb6025042ff6bd878ac1994e8
> error: short object ID 78981922613b2afb6025042ff6bd878ac1994e8 is ambiguous
> # this is correct
> hint: The candidates are:
> hint:   7898192 blob
> hint:   7898192 blob
> fatal: Not a valid object name 78981922613b2afb6025042ff6bd878ac1994e8
> # I would have expected:
> hint:   78981922613b2afb6025042ff6bd878ac1994e85 blob
> hint:   78981922613b2afb6025042ff6bd878ac1994e86 blob
> # using the supplied hint doesn't work, which is no surprise
> % git cat-file -t 7898192
> fatal: Not a valid object name 7898192

Fun.

I do not think the disambiguation code inspects object validity to
filter out invalid ones when computing the shortened object names for
the hints, so one of these two being a corrupt object should not
have anything to do with this outcome.

Perhaps something like this would help?

 object-name.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git c/object-name.c w/object-name.c
index 11aa0e6afc..13e8a4e47d 100644
--- c/object-name.c
+++ w/object-name.c
@@ -704,7 +704,7 @@ static int extend_abbrev_len(const struct object_id *oid, void *cb_data)
 	while (mad->hex[i] && mad->hex[i] == get_hex_char_from_oid(oid, i))
 		i++;
 
-	if (i < GIT_MAX_RAWSZ && i >= mad->cur_len)
+	if (i < GIT_MAX_HEXSZ && i >= mad->cur_len)
 		mad->cur_len = i + 1;
 
 	return 0;
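
To see why the one-word change matters, here is a minimal sketch of the
length check in isolation. It is not the code from object-name.c:
`extend_len()`, `common_prefix_len()`, `MAX_RAWSZ` and `MAX_HEXSZ` are
hypothetical stand-ins, the sizes assume SHA-256 as the largest supported
hash (32 raw bytes, 64 hex characters), and the starting length of 7 is
taken from the hints in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed sizes for the largest supported hash (SHA-256). */
#define MAX_RAWSZ 32              /* bytes in a raw object ID */
#define MAX_HEXSZ (2 * MAX_RAWSZ) /* characters in its hex form */

/* Number of leading hex characters that two object names share. */
static size_t common_prefix_len(const char *a, const char *b)
{
	size_t i = 0;

	while (a[i] && a[i] == b[i])
		i++;
	return i;
}

/*
 * Hypothetical stand-in for the check in the patch: if `neighbour`
 * shares `i` leading hex characters with `hex`, the abbreviation must
 * be at least i + 1 characters long to tell them apart. `limit` is the
 * bound under discussion (raw size before the patch, hex size after).
 */
static size_t extend_len(const char *hex, const char *neighbour,
			 size_t cur_len, size_t limit)
{
	size_t i;

	assert(hex && neighbour);
	i = common_prefix_len(hex, neighbour);
	if (i < limit && i >= cur_len)
		cur_len = i + 1;
	return cur_len;
}
```

The two names in the report share 39 hex characters. With the raw-size
limit the test `39 < 32` fails, so the length stays at 7, which would
explain the two identical `7898192` hints. With the hex-size limit the
length becomes 40, i.e. the full object name.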


* Re: Minor Bug in git cat-file (git 2.50)?
  2025-08-11  8:54 ` Patrick Steinhardt
@ 2025-08-11 19:10   ` Jon Forrest
  0 siblings, 0 replies; 4+ messages in thread
From: Jon Forrest @ 2025-08-11 19:10 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git



On 8/11/25 1:54 AM, Patrick Steinhardt wrote:

Thanks to you and Junio for looking at this.

I agree that this shouldn't be considered a high priority
bug.

Do you agree that the below should be what I see?

>> # I would have expected:
>> hint:   78981922613b2afb6025042ff6bd878ac1994e85 blob
>> hint:   78981922613b2afb6025042ff6bd878ac1994e86 blob

The reason I'm doing this is because, just for fun, I'm
trying to implement the disambiguation code in Go, and
I needed a test case.
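
For anyone following along, the check being reimplemented can be
sketched roughly as a prefix scan. This is only an illustration, in C to
match the patch earlier in the thread, and it makes several assumptions:
object names are given as a flat array of hex strings (Git itself
consults loose objects and packs), and `resolve_prefix()`,
`lookup_result` and `report_names` are hypothetical names, not Git's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Outcome of resolving an abbreviated object name. */
enum lookup_result { LOOKUP_MISSING, LOOKUP_UNIQUE, LOOKUP_AMBIGUOUS };

/* The two file names from the report, with the "78" directory prepended. */
static const char *const report_names[] = {
	"78981922613b2afb6025042ff6bd878ac1994e85",
	"78981922613b2afb6025042ff6bd878ac1994e86",
};

/*
 * Count how many of the `n` names start with `prefix`. If `match` is
 * not NULL it receives the first matching name, or NULL if none match.
 */
static enum lookup_result resolve_prefix(const char *prefix,
					 const char *const *names, size_t n,
					 const char **match)
{
	size_t len, hits = 0;

	assert(prefix && names);
	len = strlen(prefix);
	if (match)
		*match = NULL;

	for (size_t i = 0; i < n; i++) {
		if (strncmp(names[i], prefix, len) != 0)
			continue;
		if (hits == 0 && match)
			*match = names[i];
		hits++;
	}

	if (hits == 0)
		return LOOKUP_MISSING;
	return hits == 1 ? LOOKUP_UNIQUE : LOOKUP_AMBIGUOUS;
}
```

With `report_names`, the 39-character prefix from the report resolves to
`LOOKUP_AMBIGUOUS`, while either full 40-character name is
`LOOKUP_UNIQUE`.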

> Hm. I think the problem here is that you intentionally corrupt the
> repository by copying the blob to a different name.

I didn't intentionally corrupt the repository but I couldn't think
of any other way to do what I needed to do.

How would you have done this?

> I'm not really sure that this is something that we need to fix -- the
> repository is corrupt, and git-fsck(1) should tell you so.

Here's what git-fsck said:

% git fsck
Checking ref database: 100% (1/1), done.
error: ee1a0d672b283dc03c94a266647e505ad340dc29: hash-path mismatch, found at: .git/objects/ee/1a0d672b283dc03c94a266647e505ad340dc30
Checking object directories: 100% (256/256), done.
dangling tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
dangling tree d4607c312181a2fdbb66e8accb5b006156b6b733

> Did you hit any real world scenario where this has happened
> in the wild without intentionally corrupting the repository?

No.

> Or given that you explicitly mention Git 2.50, has the behaviour
> changed recently?

I mentioned Git 2.50 because I wanted to write a useful bug report.
I have no idea if the behavior has changed.

Thanks for your work.

Jon

