From: Junio C Hamano <gitster@pobox.com>
To: Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 1/3] show-index: implement automatic hash detection
Date: Tue, 20 Jan 2026 10:07:42 -0800
Message-ID: <xmqqzf68yx75.fsf@gitster.g>
In-Reply-To: <20260120140901.517928-2-shreyanshpaliwalcmsmn@gmail.com> (Shreyansh Paliwal's message of "Tue, 20 Jan 2026 19:35:39 +0530")

Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> writes:

> @@ -71,6 +60,40 @@ int cmd_show_index(int argc,
>  			die("corrupt index file");
>  		nr = n;
>  	}
> +
> +	/* detection of hash algorithm
> +	Only works for small files, i.e without large offsets */
> +	if(!the_hash_algo && version == 2) {

We have one SP between "if" (and other syntactic elements like
"while") and the open parenthesis "(".  End-user controlled function
names lack this SP between <word> and "(".

If we turn what is inside of this block into a separate helper
function, it would allow us to structure the logic better.

	/* Returns GIT_HASH_* constants, or GIT_HASH_UNKNOWN */
	static int auto_detect_hash_function(int fd)

For example, ...

> +		struct stat st;
> +		size_t file_base_size;
> +		size_t table_size;
> +		size_t size_rem;
> +		size_t hash_size;
> +
> +		if(fstat(0, &st) || !S_ISREG(st.st_mode))
> +			die(_("unable to detect hash from non-regular file"));

... this "die()" does not have to be here.  We can just return
GIT_HASH_UNKNOWN and let the caller fall back.  Does the existing
code correctly complain when the file stream is opened for a
non-regular file, or does it just get totally confused?

> +		file_base_size = 8 + (256 * 4);
> +		table_size = file_base_size + (nr * 4 * 4);
> +		size_rem = st.st_size - table_size;
> +		hash_size = size_rem / (nr + 2);
> +
> +		if(hash_size == GIT_SHA1_RAWSZ) {
> +			repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> +		} else if(hash_size == GIT_SHA256_RAWSZ) {
> +			repo_set_hash_algo(the_repository, GIT_HASH_SHA256);

And instead of calling repo_set_hash_algo(), just return the
constants so that the caller can handle it.  And 

> +		} else {
> +			die(_("unable to detect hash algorithm, "
> +					"use --object-format option"));

... this can also return GIT_HASH_UNKNOWN, without complaining
about anything.

> +		}
> +	}

So, instead of inserting all of the above lines in cmd_show_index(),
we'd have something like the following ...

	hash_func = auto_detect_hash_function(0);
	if (hash_func == GIT_HASH_UNKNOWN) {
		warning(_("assuming SHA-1; use --object-format to override"));
		hash_func = GIT_HASH_SHA1;
	}
	repo_set_hash_algo(the_repository, hash_func);
	hashsz = the_hash_algo->rawsz;

... there.
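
To illustrate, the helper could be sketched roughly as below.  This
is only a sketch under stated assumptions, not the actual patch: the
HASH_* and *_RAWSZ names are local stand-ins for Git's constants, the
extra "nr" parameter (the object count the caller has already read)
and the hash_from_idx_size() helper are hypothetical, and the
arithmetic assumes a version 2 index with no large-offset table, laid
out as an 8-byte header, a 256-entry fan-out table, then one hash, one
4-byte CRC and one 4-byte offset per object, followed by two trailing
checksums.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Local stand-ins for Git's GIT_HASH_* constants (values are illustrative). */
enum { HASH_UNKNOWN = 0, HASH_SHA1 = 1, HASH_SHA256 = 2 };
enum { SHA1_RAWSZ = 20, SHA256_RAWSZ = 32 };

/*
 * Hypothetical helper: pure size arithmetic for a version 2 index
 * without large offsets, assuming
 *   size = 8 + 256 * 4 + nr * (hashsz + 4 + 4) + 2 * hashsz
 * Returns a HASH_* constant, or HASH_UNKNOWN.  It never dies.
 */
static int hash_from_idx_size(uint64_t size, uint64_t nr)
{
	const uint64_t fixed = 8 + 256 * 4;	/* header + fan-out table */
	uint64_t per_object, rem;

	if (nr > UINT32_MAX)			/* object count is a 32-bit field */
		return HASH_UNKNOWN;
	per_object = nr * (4 + 4);		/* CRC32 + 32-bit offset */
	if (size < fixed + per_object)		/* avoid unsigned underflow */
		return HASH_UNKNOWN;
	rem = size - fixed - per_object;	/* (nr + 2) hashes remain */
	if (rem % (nr + 2))
		return HASH_UNKNOWN;

	switch (rem / (nr + 2)) {
	case SHA1_RAWSZ:
		return HASH_SHA1;
	case SHA256_RAWSZ:
		return HASH_SHA256;
	default:
		return HASH_UNKNOWN;
	}
}

/* Returns a HASH_* constant, or HASH_UNKNOWN (e.g. for a pipe). */
static int auto_detect_hash_function(int fd, uint64_t nr)
{
	struct stat st;

	if (fstat(fd, &st) || !S_ISREG(st.st_mode))
		return HASH_UNKNOWN;
	return hash_from_idx_size((uint64_t)st.st_size, nr);
}
```

Keeping the arithmetic separate from the fstat() call is a design
choice of this sketch; it lets the size logic be exercised without
having a real index file at hand, and every failure mode simply
reports "unknown" so that the caller alone decides whether to warn.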

By the way, what happens if we find SHA-256 also broken and end up
choosing another hash function that is 256-bit wide in the next hash
revamp?
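
A minimal sketch of the concern, assuming a hypothetical third
algorithm (called HASH_NEXT256 here, which does not exist in Git) that
shares SHA-256's 32-byte raw size: once two algorithms have the same
width, a lookup keyed on the hash size alone can no longer pick one,
and the most it can do is report "unknown".  The table and the
algo_from_rawsz() function are illustrative names, not Git's API.

```c
#include <stddef.h>

/* Illustrative identifiers; HASH_NEXT256 is a made-up future algorithm. */
enum { HASH_UNKNOWN = 0, HASH_SHA1, HASH_SHA256, HASH_NEXT256 };

struct algo {
	int id;
	size_t rawsz;
};

static const struct algo known_algos[] = {
	{ HASH_SHA1, 20 },
	{ HASH_SHA256, 32 },
	{ HASH_NEXT256, 32 },	/* same width as SHA-256 */
};

/*
 * Return the only algorithm whose raw size is rawsz, or HASH_UNKNOWN
 * when no algorithm matches or when more than one does.
 */
static int algo_from_rawsz(size_t rawsz)
{
	int found = HASH_UNKNOWN;
	size_t i;

	for (i = 0; i < sizeof(known_algos) / sizeof(known_algos[0]); i++) {
		if (known_algos[i].rawsz != rawsz)
			continue;
		if (found != HASH_UNKNOWN)
			return HASH_UNKNOWN;	/* ambiguous width */
		found = known_algos[i].id;
	}
	return found;
}
```

With such a table, a 20-byte hash still maps to SHA-1, but a 32-byte
hash becomes ambiguous, so the caller would have to rely on
--object-format or some other hint.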

Thanks.

> +
> +	/* Final fallback to SHA1 */
> +	if(!the_hash_algo)
> +		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> +
> +	hashsz = the_hash_algo->rawsz;
> +
>  	if (version == 1) {
>  		for (i = 0; i < nr; i++) {
>  			unsigned int offset, entry[(GIT_MAX_RAWSZ + 4) / sizeof(unsigned int)];

