From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: "brian m. carlson" <bk2204@github.com>
Subject: [PATCH] rerere: match the hash algorithm with its length
Date: Fri, 21 Jul 2023 16:36:12 -0700
Message-ID: <xmqqa5vou9ar.fsf@gitster.g>

The "conflict ID" used by "git rerere" to identify past conflicts we
saw has been a SHA-1 hash of the normalized text taken from the
conflicted region.  0d7c419a (rerere: convert to use the_hash_algo,
2018-10-15) updated the rerere machinery to use more general "hash"
instead of hardcoded SHA-1 by using the_hash_algo, GIT_MAX_RAWSZ and
their friends, but the code that reads the MERGE_RR records was
left unconverted and still uses get_sha1_hex(), possibly breaking the
operation in SHA-256 repositories.

We enumerate the subdirectories of $GIT_DIR/rr-cache/ and use the
ones whose names pass parse_oid_hex() in full as conflict IDs, so
they always have the correct length for the hash algorithm the
repository uses, and these are the IDs written to the MERGE_RR
file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * The "conflict ID" uses SHA-1 not because we needed a secure hash.
   We only needed something that is reasonably long with few
   collisions (the "rerere" machinery tolerates collisions).  We
   just had SHA-1 readily available to us and that was the only
   reason we used it.  As these "conflict IDs" are not security
   sensitive, we could leave them as SHA-1 even in SHA-256
   repositories and reverting 0d7c419a might be a good first step if
   we want to go in that direction, but let's be consistent.

 rerere.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/rerere.c b/rerere.c
index 7070f75014..f06172253b 100644
--- a/rerere.c
+++ b/rerere.c
@@ -203,8 +203,13 @@ static void read_rr(struct repository *r, struct string_list *rr)
 		int variant;
 		const unsigned hexsz = the_hash_algo->hexsz;
 
-		/* There has to be the hash, tab, path and then NUL */
-		if (buf.len < hexsz + 2 || get_sha1_hex(buf.buf, hash))
+		/*
+		 * There has to be the "conflict ID", tab, path and then NUL.
+		 * "conflict ID" would be a hash, possibly suffixed by "." and
+		 * a small integer (variant number).
+		 */
+		if (buf.len < hexsz + 2 ||
+		    get_hash_hex_algop(buf.buf, hash, the_hash_algo))
 			die(_("corrupt MERGE_RR"));
 
 		if (buf.buf[hexsz] != '.') {
-- 
2.41.0-394-ge43f4fd0bd
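
For readers unfamiliar with the record layout that the new comment in
read_rr() describes, here is a minimal standalone sketch in C.  It is
not the code in rerere.c: parse_merge_rr_record() and its parameters
are hypothetical names made up for illustration, and it only checks
that the conflict ID is made of hex digits of the expected length
instead of calling get_hash_hex_algop().  It assumes the record has
already been split off at its terminating NUL.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): parse one MERGE_RR-style
 * record of the form
 *
 *     <hex conflict ID>[.<variant>] TAB <path>
 *
 * "hexsz" is the expected number of hex digits (40 for SHA-1, 64 for
 * SHA-256).  Returns 0 on success and -1 if the record is malformed.
 */
static int parse_merge_rr_record(const char *rec, size_t hexsz,
				 int *variant, const char **path)
{
	size_t len = strlen(rec);
	const char *p;
	size_t i;

	/* There has to be at least the ID, a tab and one byte of path. */
	if (len < hexsz + 2)
		return -1;

	/* The ID must be exactly hexsz hex digits long. */
	for (i = 0; i < hexsz; i++)
		if (!isxdigit((unsigned char)rec[i]))
			return -1;

	p = rec + hexsz;
	*variant = 0;
	if (*p == '.') {
		char *end;
		long v;

		/* The optional suffix is "." plus a small integer. */
		if (!isdigit((unsigned char)p[1]))
			return -1;
		errno = 0;
		v = strtol(p + 1, &end, 10);
		if (errno || v > INT_MAX)
			return -1;
		*variant = (int)v;
		p = end;
	}

	/* A tab separates the ID from a non-empty path. */
	if (*p != '\t' || p[1] == '\0')
		return -1;
	*path = p + 1;
	return 0;
}
```

In this sketch, passing 64 as hexsz rejects a record that carries a
40-hex-digit ID, because the byte right after the 40th digit is not a
hex digit.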
