From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Elijah Newren <newren@gmail.com>, Jeff King <peff@peff.net>,
Junio C Hamano <gitster@pobox.com>,
"brian m. carlson" <sandals@crustytoothpaste.net>
Subject: [PATCH 3/6] hash.h: introduce `unsafe_hash_algo()`
Date: Wed, 20 Nov 2024 14:13:50 -0500 [thread overview]
Message-ID: <17f92dba34bee235177c8100daab49068fe37254.1732130001.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1732130001.git.me@ttaylorr.com>
In 253ed9ecff (hash.h: scaffolding for _unsafe hashing variants,
2024-09-26), we introduced "unsafe" variants of the SHA-1 hashing
functions by introducing new functions like "unsafe_init_fn()" and so
on.
This approach has a major shortcoming that callers must remember to
consistently use one variant or the other. Failing to consistently use
(or not use) the unsafe variants can lead to crashes at best, or subtle
memory corruption issues at worst.
In the hashfile API, this isn't difficult to achieve, but verifying that
all callers consistently use the unsafe variants is somewhat of a chore
given how spread out all of the callers are. In the sha1 and sha1-unsafe
test helpers, all of the calls to various hash functions are guarded by
an "if (unsafe)" conditional, which is repetitive and cumbersome.
Address these issues by introducing a new pattern whereby one
'git_hash_algo' can return a pointer to another 'git_hash_algo' that
represents the unsafe version of itself. So instead of having something
like:
if (unsafe)
the_hash_algo->init_fn(...);
the_hash_algo->update_fn(...);
the_hash_algo->final_fn(...);
else
the_hash_algo->unsafe_init_fn(...);
the_hash_algo->unsafe_update_fn(...);
the_hash_algo->unsafe_final_fn(...);
we can instead write:
struct git_hash_algo *algop = the_hash_algo;
if (unsafe)
algop = unsafe_hash_algo(algop);
the_hash_algo->init_fn(...);
the_hash_algo->update_fn(...);
the_hash_algo->final_fn(...);
This removes the existing shortcoming by no longer forcing the caller to
"remember" which variant of the hash functions it wants to call, only to
hold onto a 'struct git_hash_algo' pointer that is initialized once.
Similarly, while there currently is still a way to "mix" safe and unsafe
functions, this too will go away after subsequent commits remove all
direct calls to the unsafe_ variants.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
hash.h | 5 +++++
object-file.c | 26 ++++++++++++++++++++++++++
2 files changed, 31 insertions(+)
diff --git a/hash.h b/hash.h
index 756166ce5e8..23cf6876e50 100644
--- a/hash.h
+++ b/hash.h
@@ -305,6 +305,9 @@ struct git_hash_algo {
/* The all-zeros OID. */
const struct object_id *null_oid;
+
+ /* The unsafe variant of this hash function, if one exists. */
+ const struct git_hash_algo *unsafe;
};
extern const struct git_hash_algo hash_algos[GIT_HASH_NALGOS];
@@ -323,6 +326,8 @@ static inline int hash_algo_by_ptr(const struct git_hash_algo *p)
return p - hash_algos;
}
+const struct git_hash_algo *unsafe_hash_algo(const struct git_hash_algo *algop);
+
const struct object_id *null_oid(void);
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2, const struct git_hash_algo *algop)
diff --git a/object-file.c b/object-file.c
index b1a3463852c..fddcdbe9ba6 100644
--- a/object-file.c
+++ b/object-file.c
@@ -204,6 +204,22 @@ static void git_hash_unknown_final_oid(struct object_id *oid UNUSED,
BUG("trying to finalize unknown hash");
}
+static const struct git_hash_algo sha1_unsafe_algo = {
+ .name = "sha1",
+ .format_id = GIT_SHA1_FORMAT_ID,
+ .rawsz = GIT_SHA1_RAWSZ,
+ .hexsz = GIT_SHA1_HEXSZ,
+ .blksz = GIT_SHA1_BLKSZ,
+ .init_fn = git_hash_sha1_init_unsafe,
+ .clone_fn = git_hash_sha1_clone_unsafe,
+ .update_fn = git_hash_sha1_update_unsafe,
+ .final_fn = git_hash_sha1_final_unsafe,
+ .final_oid_fn = git_hash_sha1_final_oid_unsafe,
+ .empty_tree = &empty_tree_oid,
+ .empty_blob = &empty_blob_oid,
+ .null_oid = &null_oid_sha1,
+};
+
const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
{
.name = NULL,
@@ -241,6 +257,7 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
.unsafe_update_fn = git_hash_sha1_update_unsafe,
.unsafe_final_fn = git_hash_sha1_final_unsafe,
.unsafe_final_oid_fn = git_hash_sha1_final_oid_unsafe,
+ .unsafe = &sha1_unsafe_algo,
.empty_tree = &empty_tree_oid,
.empty_blob = &empty_blob_oid,
.null_oid = &null_oid_sha1,
@@ -307,6 +324,15 @@ int hash_algo_by_length(int len)
return GIT_HASH_UNKNOWN;
}
+const struct git_hash_algo *unsafe_hash_algo(const struct git_hash_algo *algop)
+{
+ /* If we have a faster "unsafe" implementation, use that. */
+ if (algop->unsafe)
+ return algop->unsafe;
+ /* Otherwise use the default one. */
+ return algop;
+}
+
/*
* This is meant to hold a *small* number of objects that you would
* want repo_read_object_file() to be able to return, but yet you do not want
--
2.47.0.237.gc601277f4c4
next prev parent reply other threads:[~2024-11-20 19:13 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-20 19:13 [PATCH 0/6] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Taylor Blau
2024-11-20 19:13 ` [PATCH 1/6] csum-file: store the hash algorithm as a struct field Taylor Blau
2024-11-21 9:18 ` Jeff King
2024-11-20 19:13 ` [PATCH 2/6] csum-file.c: extract algop from hashfile_checksum_valid() Taylor Blau
2024-11-20 19:13 ` Taylor Blau [this message]
2024-11-21 9:37 ` [PATCH 3/6] hash.h: introduce `unsafe_hash_algo()` Jeff King
2024-11-22 0:39 ` brian m. carlson
2024-11-22 8:25 ` Jeff King
2024-11-22 20:37 ` brian m. carlson
2025-01-10 21:38 ` Taylor Blau
2025-01-11 2:45 ` Jeff King
2024-11-20 19:13 ` [PATCH 4/6] csum-file.c: use unsafe_hash_algo() Taylor Blau
2024-11-20 19:13 ` [PATCH 5/6] t/helper/test-hash.c: " Taylor Blau
2024-11-20 19:13 ` [PATCH 6/6] hash.h: drop unsafe_ function variants Taylor Blau
2024-11-21 9:41 ` Jeff King
2025-01-08 19:14 ` [PATCH v2 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Taylor Blau
2025-01-08 19:14 ` [PATCH v2 1/8] t/helper/test-tool: implement sha1-unsafe helper Taylor Blau
2025-01-08 19:14 ` [PATCH v2 2/8] csum-file: store the hash algorithm as a struct field Taylor Blau
2025-01-16 11:48 ` Patrick Steinhardt
2025-01-17 21:17 ` Taylor Blau
2025-01-08 19:14 ` [PATCH v2 3/8] csum-file.c: extract algop from hashfile_checksum_valid() Taylor Blau
2025-01-08 19:14 ` [PATCH v2 4/8] hash.h: introduce `unsafe_hash_algo()` Taylor Blau
2025-01-16 11:49 ` Patrick Steinhardt
2025-01-17 21:18 ` Taylor Blau
2025-01-08 19:14 ` [PATCH v2 5/8] csum-file.c: use unsafe_hash_algo() Taylor Blau
2025-01-08 19:14 ` [PATCH v2 6/8] t/helper/test-hash.c: " Taylor Blau
2025-01-08 19:14 ` [PATCH v2 7/8] csum-file: introduce hashfile_checkpoint_init() Taylor Blau
2025-01-10 10:37 ` Jeff King
2025-01-10 21:50 ` Taylor Blau
2025-01-17 21:30 ` Taylor Blau
2025-01-18 12:15 ` Jeff King
2025-01-08 19:14 ` [PATCH v2 8/8] hash.h: drop unsafe_ function variants Taylor Blau
2025-01-10 10:41 ` [PATCH v2 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Jeff King
2025-01-10 21:29 ` Taylor Blau
2025-01-11 2:42 ` Jeff King
2025-01-11 0:14 ` Junio C Hamano
2025-01-11 17:14 ` Taylor Blau
2025-01-17 22:03 ` [PATCH v3 " Taylor Blau
2025-01-17 22:03 ` [PATCH v3 1/8] t/helper/test-tool: implement sha1-unsafe helper Taylor Blau
2025-01-17 22:03 ` [PATCH v3 2/8] csum-file: store the hash algorithm as a struct field Taylor Blau
2025-01-17 22:03 ` [PATCH v3 3/8] csum-file.c: extract algop from hashfile_checksum_valid() Taylor Blau
2025-01-17 22:03 ` [PATCH v3 4/8] hash.h: introduce `unsafe_hash_algo()` Taylor Blau
2025-01-17 22:03 ` [PATCH v3 5/8] csum-file.c: use unsafe_hash_algo() Taylor Blau
2025-01-17 22:03 ` [PATCH v3 6/8] t/helper/test-hash.c: " Taylor Blau
2025-01-17 22:03 ` [PATCH v3 7/8] csum-file: introduce hashfile_checkpoint_init() Taylor Blau
2025-01-17 22:03 ` [PATCH v3 8/8] hash.h: drop unsafe_ function variants Taylor Blau
2025-01-18 12:28 ` [PATCH v3 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Jeff King
2025-01-18 12:43 ` Jeff King
2025-01-22 21:31 ` Junio C Hamano
2025-01-23 17:34 ` [PATCH v4 " Taylor Blau
2025-01-23 17:34 ` [PATCH v4 1/8] t/helper/test-tool: implement sha1-unsafe helper Taylor Blau
2025-01-23 17:34 ` [PATCH v4 2/8] csum-file: store the hash algorithm as a struct field Taylor Blau
2025-01-23 17:34 ` [PATCH v4 3/8] csum-file.c: extract algop from hashfile_checksum_valid() Taylor Blau
2025-01-23 17:34 ` [PATCH v4 4/8] hash.h: introduce `unsafe_hash_algo()` Taylor Blau
2025-01-23 17:34 ` [PATCH v4 5/8] csum-file.c: use unsafe_hash_algo() Taylor Blau
2025-01-23 17:34 ` [PATCH v4 6/8] t/helper/test-hash.c: " Taylor Blau
2025-01-23 17:34 ` [PATCH v4 7/8] csum-file: introduce hashfile_checkpoint_init() Taylor Blau
2025-01-23 17:34 ` [PATCH v4 8/8] hash.h: drop unsafe_ function variants Taylor Blau
2025-01-23 18:30 ` [PATCH v4 0/8] hash: introduce unsafe_hash_algo(), drop unsafe_ variants Junio C Hamano
2025-01-23 18:50 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=17f92dba34bee235177c8100daab49068fe37254.1732130001.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).