All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Feng Wu via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Feng Wu <wufengwufengwufeng@gmail.com>,
	Feng Wu <wufengwufengwufeng@gmail.com>
Subject: [PATCH] rust: validate object map insert algorithms
Date: Fri, 26 Jun 2026 15:58:26 +0000	[thread overview]
Message-ID: <pull.2350.git.git.1782489506255.gitgitgadget@gmail.com> (raw)

From: Feng Wu <wufengwufengwufeng@gmail.com>

The loose object map stores entries keyed by the repository's storage
hash and the compatible hash.  ObjectMap::insert() accepts its two object
IDs in either order, but it currently checks only whether oid1 uses the
compatible hash algorithm.  If it does not, oid2 is assumed to be the
compatible ID without validating oid2's algorithm.

That means callers can pass two IDs with the same algorithm, or an ID
using an unknown algorithm, and have one of them silently treated as the
storage ID.  This does not match the map invariant that each entry must
contain exactly one storage hash and one compatible hash.

Make the invariant explicit by decoding both object ID algorithms and
rejecting unknown or mismatched pairs before inserting anything.  Introduce
ObjectMapInsertError with InvalidHashAlgorithm and MismatchedAlgorithms
variants for clear error reporting.

Update the existing tests to unwrap successful insertions, and add tests
for same-algorithm and unknown-algorithm inputs.

Signed-off-by: Feng Wu <wufengwufengwufeng@gmail.com>
---
    rust: validate object map insert algorithms
    
    ObjectMap::insert() accepts a storage OID and a compatible OID in either
    order, but it currently checks whether oid1 uses the compatible
    algorithm, and if not, assumes oid2 is the compatible one without
    validating oid2.
    
    That means inputs with two OIDs using the same hash algorithm, or an OID
    using an unknown hash algorithm, are accepted and one of them is
    silently treated as the storage OID. This breaks the object map
    invariant that each entry must contain exactly one storage hash and one
    compatible hash.
    
    Teach ObjectMap::insert() to decode and validate both OID algorithms
    before inserting anything. The function now accepts only the two valid
    permutations: (storage, compat) and (compat, storage). Unknown
    algorithms and mismatched algorithm pairs are rejected via
    ObjectMapInsertError.
    
    The tests cover successful insertion in either order, same-algorithm
    input, and unknown-algorithm input.
    
    Tested with:
    
     * cargo fmt --all -- --check
     * git diff --check
     * cargo test

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-2350%2Fwufengwind%2Fobject-map-insert-validation-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-2350/wufengwind/object-map-insert-validation-v1
Pull-Request: https://github.com/git/git/pull/2350

 src/loose.rs | 79 ++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 73 insertions(+), 6 deletions(-)

diff --git a/src/loose.rs b/src/loose.rs
index 24accf9c33..8f6c1fb40e 100644
--- a/src/loose.rs
+++ b/src/loose.rs
@@ -510,6 +510,17 @@ pub struct ObjectMap {
     batch: Option<ObjectMemoryMap>,
 }
 
+#[derive(Debug, Clone, Copy, Eq, PartialEq)]
+pub enum ObjectMapInsertError {
+    InvalidHashAlgorithm(u32),
+    MismatchedAlgorithms {
+        storage: HashAlgorithm,
+        compat: HashAlgorithm,
+        oid1: HashAlgorithm,
+        oid2: HashAlgorithm,
+    },
+}
+
 impl ObjectMap {
     /// Create a new `ObjectMap` with the given hash algorithms.
     ///
@@ -585,19 +596,39 @@ impl ObjectMap {
     ///
     /// If `write` is true and there is a batch started, write the object into the batch as well as
     /// into the memory map.
-    pub fn insert(&mut self, oid1: &ObjectID, oid2: &ObjectID, kind: MapType, write: bool) {
-        let (compat_oid, storage_oid) =
-            if HashAlgorithm::from_u32(oid1.algo) == Some(self.mem.compat) {
+    pub fn insert(
+        &mut self,
+        oid1: &ObjectID,
+        oid2: &ObjectID,
+        kind: MapType,
+        write: bool,
+    ) -> Result<(), ObjectMapInsertError> {
+        let oid1_algo = HashAlgorithm::from_u32(oid1.algo)
+            .ok_or(ObjectMapInsertError::InvalidHashAlgorithm(oid1.algo))?;
+        let oid2_algo = HashAlgorithm::from_u32(oid2.algo)
+            .ok_or(ObjectMapInsertError::InvalidHashAlgorithm(oid2.algo))?;
+
+        let (storage_oid, compat_oid) =
+            if oid1_algo == self.mem.storage && oid2_algo == self.mem.compat {
                 (oid1, oid2)
-            } else {
+            } else if oid1_algo == self.mem.compat && oid2_algo == self.mem.storage {
                 (oid2, oid1)
+            } else {
+                return Err(ObjectMapInsertError::MismatchedAlgorithms {
+                    storage: self.mem.storage,
+                    compat: self.mem.compat,
+                    oid1: oid1_algo,
+                    oid2: oid2_algo,
+                });
             };
+
         Self::insert_into(&mut self.mem, storage_oid, compat_oid, kind);
         if write {
             if let Some(ref mut batch) = self.batch {
                 Self::insert_into(batch, storage_oid, compat_oid, kind);
             }
         }
+        Ok(())
     }
 
     fn insert_into(
@@ -729,9 +760,9 @@ mod tests {
             if *swap {
                 // Insert the item into the batch arbitrarily based on the type.  This tests that
                 // we can specify either order and we'll do the right thing.
-                map.insert(&s256, &s1, *kind, write);
+                map.insert(&s256, &s1, *kind, write).unwrap();
             } else {
-                map.insert(&s1, &s256, *kind, write);
+                map.insert(&s1, &s256, *kind, write).unwrap();
             }
         }
 
@@ -873,6 +904,42 @@ mod tests {
         );
     }
 
+    #[test]
+    fn refuses_insert_with_mismatched_algorithms() {
+        let mut map = ObjectMap::new(HashAlgorithm::SHA256, HashAlgorithm::SHA1);
+        let entries = test_entries();
+        let s256 = sha256_oid(entries[0].2);
+        let s256_other = sha256_oid(entries[1].2);
+        let s1 = sha1_oid(entries[0].1);
+        let s1_other = sha1_oid(entries[1].1);
+
+        assert!(map.insert(&s256, &s1, MapType::LooseObject, false).is_ok());
+        assert!(matches!(
+            map.insert(&s256, &s256_other, MapType::LooseObject, false),
+            Err(super::ObjectMapInsertError::MismatchedAlgorithms { .. })
+        ));
+        assert!(matches!(
+            map.insert(&s1, &s1_other, MapType::LooseObject, false),
+            Err(super::ObjectMapInsertError::MismatchedAlgorithms { .. })
+        ));
+    }
+
+    #[test]
+    fn refuses_insert_with_unknown_algorithm() {
+        let mut map = ObjectMap::new(HashAlgorithm::SHA256, HashAlgorithm::SHA1);
+        let entries = test_entries();
+        let s1 = sha1_oid(entries[0].1);
+        let invalid_oid = ObjectID {
+            hash: [0xffu8; 32],
+            algo: 99,
+        };
+
+        assert_eq!(
+            map.insert(&invalid_oid, &s1, MapType::LooseObject, false),
+            Err(super::ObjectMapInsertError::InvalidHashAlgorithm(99))
+        );
+    }
+
     #[test]
     fn looks_up_known_oids_correctly() {
         let map = test_map(false);

base-commit: ab776a62a78576513ee121424adb19597fbb7613
-- 
gitgitgadget

                 reply	other threads:[~2026-06-26 15:58 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.2350.git.git.1782489506255.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=wufengwufengwufeng@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.