From: Taylor Blau <me@ttaylorr.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
Elijah Newren <newren@gmail.com>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v3 3/9] finalize_object_file(): implement collision check
Date: Mon, 16 Sep 2024 11:54:14 -0400 [thread overview]
Message-ID: <ZuhUprZXvcRvgpp5@nand.local> (raw)
In-Reply-To: <ZugMUv1xbnjYH-el@pks.im>
On Mon, Sep 16, 2024 at 12:45:38PM +0200, Patrick Steinhardt wrote:
> This function compares the exact contents, but isn't that wrong? The
> contents may differ even though the object is the same because the
> object hash is derived from the uncompressed data, whereas we store
> compressed data on disk.
>
> So this might trigger when you have different zlib versions, but also if
> you configure core.compression differently.
Oh, shoot -- you're right for when we call finalize_object_file() on
loose objects.
For packfiles and other spots where we use the hashfile API, the
name of the file depends on the checksum'd contents, so we're safe
there. But for loose objects the, the name of the file is based on the
object hash, which is the hash of the uncompressed contents.
So I think we would need something like this on top:
--- 8< ---
diff --git a/object-file.c b/object-file.c
index 84dd7d0fab..2f1616651e 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1992,10 +1992,8 @@ static int check_collision(const char *filename_a, const char *filename_b)
return ret;
}
-/*
- * Move the just written object into its final resting place.
- */
-int finalize_object_file(const char *tmpfile, const char *filename)
+static int finalize_object_file_1(const char *tmpfile, const char *filename,
+ unsigned check_collisions)
{
struct stat st;
int ret = 0;
@@ -2044,6 +2042,14 @@ int finalize_object_file(const char *tmpfile, const char *filename)
return 0;
}
+/*
+ * Move the just written object into its final resting place.
+ */
+int finalize_object_file(const char *tmpfile, const char *filename)
+{
+ return finalize_object_file_1(tmpfile, filename, 1);
+}
+
static void hash_object_file_literally(const struct git_hash_algo *algo,
const void *buf, unsigned long len,
const char *type, struct object_id *oid)
@@ -2288,7 +2294,7 @@ static int write_loose_object(const struct object_id *oid, char *hdr,
warning_errno(_("failed utime() on %s"), tmp_file.buf);
}
- return finalize_object_file(tmp_file.buf, filename.buf);
+ return finalize_object_file_1(tmp_file.buf, filename.buf, 0);
}
static int freshen_loose_object(const struct object_id *oid)
@@ -2410,7 +2416,7 @@ int stream_loose_object(struct input_stream *in_stream, size_t len,
strbuf_release(&dir);
}
- err = finalize_object_file(tmp_file.buf, filename.buf);
+ err = finalize_object_file_1(tmp_file.buf, filename.buf, 0);
if (!err && compat)
err = repo_add_loose_object_map(the_repository, oid, &compat_oid);
cleanup:
--- >8 ---
But I'd like for some others to take a look at this before I send a new
round that includes this change.
Thanks for catching that!
Thanks,
Taylor
next prev parent reply other threads:[~2024-09-16 15:54 UTC|newest]
Thread overview: 99+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-01 16:03 [PATCH 0/4] hash.h: support choosing a separate SHA-1 for non-cryptographic uses Taylor Blau
2024-09-01 16:03 ` [PATCH 1/4] sha1: do not redefine `platform_SHA_CTX` and friends Taylor Blau
2024-09-02 13:41 ` Patrick Steinhardt
2024-09-03 19:34 ` Taylor Blau
2024-09-01 16:03 ` [PATCH 2/4] hash.h: scaffolding for _fast hashing variants Taylor Blau
2024-09-02 13:41 ` Patrick Steinhardt
2024-09-03 17:27 ` Junio C Hamano
2024-09-03 19:52 ` Taylor Blau
2024-09-03 20:47 ` Junio C Hamano
2024-09-03 21:24 ` Taylor Blau
2024-09-04 7:05 ` Patrick Steinhardt
2024-09-04 14:53 ` Junio C Hamano
2024-09-03 19:40 ` Taylor Blau
2024-09-01 16:03 ` [PATCH 3/4] Makefile: allow specifying a SHA-1 for non-cryptographic uses Taylor Blau
2024-09-02 13:41 ` Patrick Steinhardt
2024-09-03 19:43 ` Taylor Blau
2024-09-01 16:03 ` [PATCH 4/4] csum-file.c: use fast SHA-1 implementation when available Taylor Blau
2024-09-02 13:41 ` Patrick Steinhardt
2024-09-03 1:22 ` brian m. carlson
2024-09-03 19:50 ` Taylor Blau
2024-09-02 3:41 ` [PATCH 0/4] hash.h: support choosing a separate SHA-1 for non-cryptographic uses Junio C Hamano
2024-09-03 19:48 ` Taylor Blau
2024-09-03 20:44 ` Junio C Hamano
2024-09-02 14:08 ` brian m. carlson
2024-09-03 19:47 ` Taylor Blau
2024-09-03 22:41 ` Junio C Hamano
2024-09-04 14:01 ` brian m. carlson
2024-09-05 10:37 ` Jeff King
2024-09-05 15:41 ` Junio C Hamano
2024-09-05 16:23 ` Taylor Blau
2024-09-05 16:51 ` Junio C Hamano
2024-09-05 17:04 ` Taylor Blau
2024-09-05 17:51 ` Taylor Blau
2024-09-05 20:21 ` Taylor Blau
2024-09-05 20:27 ` Jeff King
2024-09-05 21:27 ` Junio C Hamano
2024-09-05 15:11 ` [PATCH v2 " Taylor Blau
2024-09-05 15:12 ` [PATCH v2 1/4] sha1: do not redefine `platform_SHA_CTX` and friends Taylor Blau
2024-09-05 15:12 ` [PATCH v2 2/4] hash.h: scaffolding for _fast hashing variants Taylor Blau
2024-09-05 15:12 ` [PATCH v2 3/4] Makefile: allow specifying a SHA-1 for non-cryptographic uses Taylor Blau
2024-09-05 15:12 ` [PATCH v2 4/4] csum-file.c: use fast SHA-1 implementation when available Taylor Blau
2024-09-06 19:46 ` [PATCH v3 0/9] hash.h: support choosing a separate SHA-1 for non-cryptographic uses Taylor Blau
2024-09-06 19:46 ` [PATCH v3 1/9] finalize_object_file(): check for name collision before renaming Taylor Blau
2024-09-06 19:46 ` [PATCH v3 2/9] finalize_object_file(): refactor unlink_or_warn() placement Taylor Blau
2024-09-06 19:46 ` [PATCH v3 3/9] finalize_object_file(): implement collision check Taylor Blau
2024-09-06 21:44 ` Junio C Hamano
2024-09-06 21:51 ` Chris Torek
2024-09-10 6:53 ` Jeff King
2024-09-10 15:14 ` Junio C Hamano
2024-09-16 10:45 ` Patrick Steinhardt
2024-09-16 15:54 ` Taylor Blau [this message]
2024-09-16 16:03 ` Taylor Blau
2024-09-17 20:40 ` Junio C Hamano
2024-09-06 19:46 ` [PATCH v3 4/9] pack-objects: use finalize_object_file() to rename pack/idx/etc Taylor Blau
2024-09-06 19:46 ` [PATCH v3 5/9] i5500-git-daemon.sh: use compile-able version of Git without OpenSSL Taylor Blau
2024-09-11 6:10 ` Jeff King
2024-09-11 6:12 ` Jeff King
2024-09-12 20:28 ` Junio C Hamano
2024-09-11 15:28 ` Junio C Hamano
2024-09-11 21:23 ` Jeff King
2024-09-06 19:46 ` [PATCH v3 6/9] sha1: do not redefine `platform_SHA_CTX` and friends Taylor Blau
2024-09-06 19:46 ` [PATCH v3 7/9] hash.h: scaffolding for _fast hashing variants Taylor Blau
2024-09-06 19:46 ` [PATCH v3 8/9] Makefile: allow specifying a SHA-1 for non-cryptographic uses Taylor Blau
2024-09-06 19:46 ` [PATCH v3 9/9] csum-file.c: use fast SHA-1 implementation when available Taylor Blau
2024-09-06 21:50 ` [PATCH v3 0/9] hash.h: support choosing a separate SHA-1 for non-cryptographic uses Junio C Hamano
2024-09-24 17:32 ` [PATCH v4 0/8] " Taylor Blau
2024-09-24 17:32 ` [PATCH v4 1/8] finalize_object_file(): check for name collision before renaming Taylor Blau
2024-09-25 17:02 ` Junio C Hamano
2024-09-24 17:32 ` [PATCH v4 2/8] finalize_object_file(): refactor unlink_or_warn() placement Taylor Blau
2024-09-24 17:32 ` [PATCH v4 3/8] finalize_object_file(): implement collision check Taylor Blau
2024-09-24 20:37 ` Jeff King
2024-09-24 21:59 ` Taylor Blau
2024-09-24 22:20 ` Jeff King
2024-09-25 18:06 ` Taylor Blau
2024-09-24 21:32 ` Junio C Hamano
2024-09-24 22:02 ` Taylor Blau
2024-09-24 17:32 ` [PATCH v4 4/8] pack-objects: use finalize_object_file() to rename pack/idx/etc Taylor Blau
2024-09-24 21:34 ` Junio C Hamano
2024-09-24 17:32 ` [PATCH v4 5/8] sha1: do not redefine `platform_SHA_CTX` and friends Taylor Blau
2024-09-24 17:32 ` [PATCH v4 6/8] hash.h: scaffolding for _unsafe hashing variants Taylor Blau
2024-09-24 17:32 ` [PATCH v4 7/8] Makefile: allow specifying a SHA-1 for non-cryptographic uses Taylor Blau
2024-09-24 17:32 ` [PATCH v4 8/8] csum-file.c: use unsafe SHA-1 implementation when available Taylor Blau
2024-09-24 20:52 ` [PATCH v4 0/8] hash.h: support choosing a separate SHA-1 for non-cryptographic uses Jeff King
2024-09-25 16:58 ` Elijah Newren
2024-09-25 17:11 ` Junio C Hamano
2024-09-25 17:22 ` Taylor Blau
2024-09-25 17:22 ` Taylor Blau
2024-09-26 15:22 ` [PATCH v5 " Taylor Blau
2024-09-26 15:22 ` [PATCH v5 1/8] finalize_object_file(): check for name collision before renaming Taylor Blau
2024-09-26 15:22 ` [PATCH v5 2/8] finalize_object_file(): refactor unlink_or_warn() placement Taylor Blau
2024-09-26 15:22 ` [PATCH v5 3/8] finalize_object_file(): implement collision check Taylor Blau
2024-09-26 15:22 ` [PATCH v5 4/8] pack-objects: use finalize_object_file() to rename pack/idx/etc Taylor Blau
2024-09-26 15:22 ` [PATCH v5 5/8] sha1: do not redefine `platform_SHA_CTX` and friends Taylor Blau
2024-09-26 15:22 ` [PATCH v5 6/8] hash.h: scaffolding for _unsafe hashing variants Taylor Blau
2024-09-26 15:22 ` [PATCH v5 7/8] Makefile: allow specifying a SHA-1 for non-cryptographic uses Taylor Blau
2024-09-26 15:22 ` [PATCH v5 8/8] csum-file.c: use unsafe SHA-1 implementation when available Taylor Blau
2024-09-26 22:47 ` [PATCH v5 0/8] hash.h: support choosing a separate SHA-1 for non-cryptographic uses Elijah Newren
2024-09-27 0:44 ` Junio C Hamano
2024-09-27 3:57 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZuhUprZXvcRvgpp5@nand.local \
--to=me@ttaylorr.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=newren@gmail.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).