From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: <git@vger.kernel.org>
Cc: Junio C Hamano <gitster@pobox.com>,
Patrick Steinhardt <ps@pks.im>,
Ezekiel Newren <ezekielnewren@gmail.com>
Subject: [PATCH v2 05/15] rust: add a hash algorithm abstraction
Date: Mon, 17 Nov 2025 22:16:11 +0000
Message-ID: <20251117221621.2863243-6-sandals@crustytoothpaste.net>
In-Reply-To: <20251117221621.2863243-1-sandals@crustytoothpaste.net>
This works very similarly to the existing one in C, except that it
doesn't provide any functionality to hash an object. We don't need that
right now, and the use of those function pointers does make it
substantially more difficult to write a bit-for-bit identical structure
across the C/Rust interface, so omit them for now.
Instead of the more customary "&self", use "self", because the former is
the size of a pointer and the latter is the size of an integer on most
systems. Don't define an unknown value but use an Option for that
instead.
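As a rough illustration of the size argument, here is a standalone sketch
that is not part of this patch. The enum mirrors the one added below, and
`describe` is a hypothetical helper introduced only to show how an `Option`
takes the place of an "unknown" variant. The exact sizes printed depend on
the target.

```rust
use std::mem::size_of;

// Mirrors the enum added by this patch, reproduced so the sketch is standalone.
#[repr(C)]
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub enum HashAlgorithm {
    SHA1 = 1,
    SHA256 = 2,
}

impl HashAlgorithm {
    // No "unknown" variant: an unrecognized ID maps to `None`.
    pub const fn from_u32(algo: u32) -> Option<HashAlgorithm> {
        match algo {
            1 => Some(HashAlgorithm::SHA1),
            2 => Some(HashAlgorithm::SHA256),
            _ => None,
        }
    }

    // Takes `self` by value: the enum is `Copy`, so the caller passes an
    // integer-sized value rather than a pointer to one.
    pub const fn raw_len(self) -> usize {
        match self {
            HashAlgorithm::SHA1 => 20,
            HashAlgorithm::SHA256 => 32,
        }
    }
}

// Hypothetical helper (not in the patch): callers must handle `None`
// explicitly instead of checking for a sentinel value.
fn describe(algo: u32) -> String {
    match HashAlgorithm::from_u32(algo) {
        Some(a) => format!("{:?} ({} bytes)", a, a.raw_len()),
        None => format!("unknown algorithm {}", algo),
    }
}

fn main() {
    // On a typical 64-bit target the enum is smaller than a reference to it.
    println!("size_of::<HashAlgorithm>()  = {}", size_of::<HashAlgorithm>());
    println!("size_of::<&HashAlgorithm>() = {}", size_of::<&HashAlgorithm>());
    println!("{}", describe(1));
    println!("{}", describe(0));
}
```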
Update the object ID structure to allow slicing the data appropriately
for the algorithm.
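The slicing behaviour can be sketched as follows. This is a simplified,
standalone approximation rather than the patch itself: `raw_len_for` is a
hypothetical stand-in for `HashAlgorithm::from_u32(..)` followed by
`raw_len()`, and `PartialEq` is derived on the error type only so the
assertions compile.

```rust
pub const GIT_MAX_RAWSZ: usize = 32;

// The patch derives Debug, Copy and Clone; PartialEq/Eq are added here
// (an assumption of this sketch) so results can be compared in assertions.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub struct InvalidHashAlgorithm(pub u32);

#[repr(C)]
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ObjectID {
    pub hash: [u8; GIT_MAX_RAWSZ],
    pub algo: u32,
}

// Hypothetical stand-in for the HashAlgorithm lookup in the patch.
fn raw_len_for(algo: u32) -> Option<usize> {
    match algo {
        1 => Some(20), // SHA-1
        2 => Some(32), // SHA-256
        _ => None,
    }
}

impl ObjectID {
    // Return only the bytes that are meaningful for this object's algorithm.
    pub fn as_slice(&self) -> Result<&[u8], InvalidHashAlgorithm> {
        match raw_len_for(self.algo) {
            Some(len) => Ok(&self.hash[..len]),
            None => Err(InvalidHashAlgorithm(self.algo)),
        }
    }
}

fn main() {
    let sha1 = ObjectID { hash: [0u8; GIT_MAX_RAWSZ], algo: 1 };
    let sha256 = ObjectID { hash: [0u8; GIT_MAX_RAWSZ], algo: 2 };
    let bogus = ObjectID { hash: [0u8; GIT_MAX_RAWSZ], algo: 7 };

    // The backing array is always 32 bytes; the slice length follows the algorithm.
    assert_eq!(sha1.as_slice().map(|s| s.len()), Ok(20));
    assert_eq!(sha256.as_slice().map(|s| s.len()), Ok(32));
    // An unrecognized algorithm ID is reported instead of guessing a length.
    assert_eq!(bogus.as_slice(), Err(InvalidHashAlgorithm(7)));
}
```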
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
src/hash.rs | 159 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 159 insertions(+)
diff --git a/src/hash.rs b/src/hash.rs
index 0219391820..0ec0ab0490 100644
--- a/src/hash.rs
+++ b/src/hash.rs
@@ -10,8 +10,25 @@
// You should have received a copy of the GNU General Public License along
// with this program; if not, see <https://www.gnu.org/licenses/>.
+use std::error::Error;
+use std::fmt::{self, Debug, Display};
+
pub const GIT_MAX_RAWSZ: usize = 32;
+/// An error indicating an invalid hash algorithm.
+///
+/// The contained `u32` is the same as the `algo` field in `ObjectID`.
+#[derive(Debug, Copy, Clone)]
+pub struct InvalidHashAlgorithm(pub u32);
+
+impl Display for InvalidHashAlgorithm {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        write!(f, "invalid hash algorithm {}", self.0)
+    }
+}
+
+impl Error for InvalidHashAlgorithm {}
+
/// A binary object ID.
#[repr(C)]
#[derive(Debug, Clone, Ord, PartialOrd, Eq, PartialEq)]
@@ -19,3 +36,145 @@ pub struct ObjectID {
     pub hash: [u8; GIT_MAX_RAWSZ],
     pub algo: u32,
}
+
+#[allow(dead_code)]
+impl ObjectID {
+    pub fn as_slice(&self) -> Result<&[u8], InvalidHashAlgorithm> {
+        match HashAlgorithm::from_u32(self.algo) {
+            Some(algo) => Ok(&self.hash[0..algo.raw_len()]),
+            None => Err(InvalidHashAlgorithm(self.algo)),
+        }
+    }
+
+    pub fn as_mut_slice(&mut self) -> Result<&mut [u8], InvalidHashAlgorithm> {
+        match HashAlgorithm::from_u32(self.algo) {
+            Some(algo) => Ok(&mut self.hash[0..algo.raw_len()]),
+            None => Err(InvalidHashAlgorithm(self.algo)),
+        }
+    }
+}
+
+/// A hash algorithm.
+#[repr(C)]
+#[derive(Debug, Copy, Clone, Ord, PartialOrd, Eq, PartialEq)]
+pub enum HashAlgorithm {
+    SHA1 = 1,
+    SHA256 = 2,
+}
+
+#[allow(dead_code)]
+impl HashAlgorithm {
+    const SHA1_NULL_OID: ObjectID = ObjectID {
+        hash: [0u8; 32],
+        algo: Self::SHA1 as u32,
+    };
+    const SHA256_NULL_OID: ObjectID = ObjectID {
+        hash: [0u8; 32],
+        algo: Self::SHA256 as u32,
+    };
+
+    const SHA1_EMPTY_TREE: ObjectID = ObjectID {
+        hash: *b"\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+        algo: Self::SHA1 as u32,
+    };
+    const SHA256_EMPTY_TREE: ObjectID = ObjectID {
+        hash: *b"\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1\x04\xd4\x5d\x8d\x85\xef\xa9\xb0\x57\xb5\x3b\x14\xb4\xb9\xb9\x39\xdd\x74\xde\xcc\x53\x21",
+        algo: Self::SHA256 as u32,
+    };
+
+    const SHA1_EMPTY_BLOB: ObjectID = ObjectID {
+        hash: *b"\xe6\x9d\xe2\x9b\xb2\xd1\xd6\x43\x4b\x8b\x29\xae\x77\x5a\xd8\xc2\xe4\x8c\x53\x91\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00",
+        algo: Self::SHA1 as u32,
+    };
+    const SHA256_EMPTY_BLOB: ObjectID = ObjectID {
+        hash: *b"\x47\x3a\x0f\x4c\x3b\xe8\xa9\x36\x81\xa2\x67\xe3\xb1\xe9\xa7\xdc\xda\x11\x85\x43\x6f\xe1\x41\xf7\x74\x91\x20\xa3\x03\x72\x18\x13",
+        algo: Self::SHA256 as u32,
+    };
+
+    /// Return a hash algorithm based on the internal integer ID used by Git.
+    ///
+    /// Returns `None` if the ID doesn't indicate a valid algorithm.
+    pub const fn from_u32(algo: u32) -> Option<HashAlgorithm> {
+        match algo {
+            1 => Some(HashAlgorithm::SHA1),
+            2 => Some(HashAlgorithm::SHA256),
+            _ => None,
+        }
+    }
+
+    /// Return a hash algorithm based on the format ID used in binary formats.
+    ///
+    /// Returns `None` if the format ID doesn't indicate a valid algorithm.
+    pub const fn from_format_id(algo: u32) -> Option<HashAlgorithm> {
+        match algo {
+            0x73686131 => Some(HashAlgorithm::SHA1),
+            0x73323536 => Some(HashAlgorithm::SHA256),
+            _ => None,
+        }
+    }
+
+    /// The name of this hash algorithm as a string suitable for the configuration file.
+    pub const fn name(self) -> &'static str {
+        match self {
+            HashAlgorithm::SHA1 => "sha1",
+            HashAlgorithm::SHA256 => "sha256",
+        }
+    }
+
+    /// The format ID of this algorithm for binary formats.
+    ///
+    /// Note that when writing this to a data format, it should be written in big-endian format
+    /// explicitly.
+    pub const fn format_id(self) -> u32 {
+        match self {
+            HashAlgorithm::SHA1 => 0x73686131,
+            HashAlgorithm::SHA256 => 0x73323536,
+        }
+    }
+
+    /// The length of binary object IDs in this algorithm in bytes.
+    pub const fn raw_len(self) -> usize {
+        match self {
+            HashAlgorithm::SHA1 => 20,
+            HashAlgorithm::SHA256 => 32,
+        }
+    }
+
+    /// The length of object IDs in this algorithm in hexadecimal characters.
+    pub const fn hex_len(self) -> usize {
+        self.raw_len() * 2
+    }
+
+    /// The number of bytes which is processed by one iteration of this algorithm's compression
+    /// function.
+    pub const fn block_size(self) -> usize {
+        match self {
+            HashAlgorithm::SHA1 => 64,
+            HashAlgorithm::SHA256 => 64,
+        }
+    }
+
+    /// The object ID representing the empty blob.
+    pub const fn empty_blob(self) -> &'static ObjectID {
+        match self {
+            HashAlgorithm::SHA1 => &Self::SHA1_EMPTY_BLOB,
+            HashAlgorithm::SHA256 => &Self::SHA256_EMPTY_BLOB,
+        }
+    }
+
+    /// The object ID representing the empty tree.
+    pub const fn empty_tree(self) -> &'static ObjectID {
+        match self {
+            HashAlgorithm::SHA1 => &Self::SHA1_EMPTY_TREE,
+            HashAlgorithm::SHA256 => &Self::SHA256_EMPTY_TREE,
+        }
+    }
+
+    /// The object ID which is all zeros.
+    pub const fn null_oid(self) -> &'static ObjectID {
+        match self {
+            HashAlgorithm::SHA1 => &Self::SHA1_NULL_OID,
+            HashAlgorithm::SHA256 => &Self::SHA256_NULL_OID,
+        }
+    }
+}
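The `format_id` doc comment in the patch above asks for the value to be
written in big-endian order. A minimal standalone sketch of what that might
look like follows. `write_format_id` and `read_format_id` are hypothetical
helpers, not functions from this series; the two constants are the values
returned by `format_id` in the patch. Written big-endian, the IDs come out
as the ASCII tags "sha1" and "s256".

```rust
// Format IDs as returned by `HashAlgorithm::format_id` in the patch.
const SHA1_FORMAT_ID: u32 = 0x73686131;
const SHA256_FORMAT_ID: u32 = 0x73323536;

// Hypothetical helper: append the format ID in big-endian byte order,
// independent of the host's native endianness.
fn write_format_id(out: &mut Vec<u8>, id: u32) {
    out.extend_from_slice(&id.to_be_bytes());
}

// Hypothetical helper: read a big-endian format ID from the start of `input`.
// Returns `None` if fewer than four bytes are available.
fn read_format_id(input: &[u8]) -> Option<u32> {
    let b = input.get(..4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

fn main() {
    let mut buf = Vec::new();
    write_format_id(&mut buf, SHA1_FORMAT_ID);
    write_format_id(&mut buf, SHA256_FORMAT_ID);

    // Big-endian output spells the algorithm tags in ASCII.
    assert_eq!(&buf[..4], &b"sha1"[..]);
    assert_eq!(&buf[4..], &b"s256"[..]);

    // Reading the bytes back recovers the original IDs.
    assert_eq!(read_format_id(&buf[..4]), Some(SHA1_FORMAT_ID));
    assert_eq!(read_format_id(&buf[4..]), Some(SHA256_FORMAT_ID));
}
```

Using `to_be_bytes` and `from_be_bytes` keeps the on-disk representation the
same on little-endian and big-endian hosts.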