git.vger.kernel.org archive mirror
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Patrick Steinhardt <ps@pks.im>
Cc: Junio C Hamano <gitster@pobox.com>, Ilya K <me@0upti.me>,
	git@vger.kernel.org
Subject: Re: git 2.46.0 crashes when trying to verify-pack outside of a repo
Date: Mon, 2 Sep 2024 13:18:38 +0000
Message-ID: <ZtW7LtQEobPpVB99@tapette.crustytoothpaste.net>
In-Reply-To: <ZtT8p06fdTwXO7iX@tanuki>

On 2024-09-01 at 23:45:43, Patrick Steinhardt wrote:
> Unfortunately, this once again uncovers a deeper issue: neither
> packfiles nor their indices encode the object format they use. So while
> falling back to SHA1 papers over the issue, it means that we misparse
> SHA256 indices. Also, we misparse SHA1 indices if we happen to be in a
> SHA256 repository. E.g. when parsing a SHA256 file in a SHA1 repo:
> 
>     $ git index-pack --verify '/tmp/git-tests/trash directory.t5300-pack-object/repo/.git/objects/pack/pack-aa45f7f08f043c9f0388f1844a2a797587254e249919b35ac9dc2b52c1aada29.pack'
>     error: wrong index v2 file size in /tmp/git-tests/trash directory.t5300-pack-object/repo/.git/objects/pack/pack-aa45f7f08f043c9f0388f1844a2a797587254e249919b35ac9dc2b52c1aada29.idx
>     fatal: Cannot open existing pack idx file for '/tmp/git-tests/trash directory.t5300-pack-object/repo/.git/objects/pack/pack-aa45f7f08f043c9f0388f1844a2a797587254e249919b35ac9dc2b52c1aada29.idx'
> 
> The error message isn't even properly indicating what the actual issue
> is.

Yes, the same is true of other formats such as the index, but there we
know it must use the same object format as the rest of the repository.
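
For illustration, the "wrong index v2 file size" error quoted above can
be reproduced with a rough sketch of the size check. The layout
constants follow the documented pack index v2 format; the function
names are made up for this example, and the large-offset slack is my
understanding of the check rather than a copy of Git's code.

```python
# Sketch of the plausible size range for a version-2 pack index (.idx).
# Names here are illustrative assumptions, not Git's own identifiers.

HEADER = 8         # 4-byte magic "\377tOc" + 4-byte version number
FANOUT = 256 * 4   # 256 big-endian 32-bit cumulative object counts


def idx_v2_size_bounds(nr_objects, hash_size):
    """Return (min, max) file sizes for an idx v2 with nr_objects entries."""
    per_object = hash_size + 4 + 4   # object name + CRC32 + 32-bit offset
    trailer = 2 * hash_size          # pack checksum + index checksum
    min_size = HEADER + FANOUT + nr_objects * per_object + trailer
    # Objects may also need 8-byte entries in the optional large-offset table.
    max_size = min_size + (nr_objects - 1) * 8 if nr_objects else min_size
    return min_size, max_size


def size_is_plausible(file_size, nr_objects, hash_size):
    lo, hi = idx_v2_size_bounds(nr_objects, hash_size)
    return lo <= file_size <= hi


# A minimal SHA-256 index with 3 objects, checked under both assumptions.
sha256_file_size = idx_v2_size_bounds(3, 32)[0]
print(size_is_plausible(sha256_file_size, 3, 32))  # parsed as SHA-256
print(size_is_plausible(sha256_file_size, 3, 20))  # parsed as SHA-1
```

Because every field that depends on the hash size shifts the expected
total, a reader that assumes the wrong hash only sees a size mismatch,
which is why the message says nothing about the object format.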

I noticed this while writing the SHA-256 series, and it's inconvenient.
If you blame some of the tests that add the `--object-format` option,
I wrote them.

> One potential solution would be to try and derive the object format from
> the hash that the packfile index name has. But that is quite roundabout
> and rather ugly, and packfiles may not necessarily have that hash in the
> first place. It would also become potentially ambiguous in the future if
> we were to ever adopt another hash that has the same length as either
> SHA1 or SHA256.

Yes, we've decided not to derive things by their length except in the
dumb HTTP protocol for this reason.
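
A small sketch of why length-based detection does not scale. The
"futurehash-256" entry is purely hypothetical and stands in for any
later algorithm whose output is as long as SHA-256's; the function name
is likewise invented for this example.

```python
# Hypothetical illustration of guessing the object format from the length
# of a hex object name. Not how Git selects its hash algorithm.

KNOWN_TODAY = {"sha1": 40, "sha256": 64}

# Assumption: a future algorithm with the same 256-bit output as SHA-256.
WITH_FUTURE_HASH = dict(KNOWN_TODAY, **{"futurehash-256": 64})


def guess_formats(hex_name, table):
    """Return every algorithm in table whose hex length matches hex_name."""
    return sorted(name for name, length in table.items()
                  if length == len(hex_name))


pack_hash = "aa45f7f08f043c9f0388f1844a2a797587254e249919b35ac9dc2b52c1aada29"
print(guess_formats(pack_hash, KNOWN_TODAY))       # one candidate
print(guess_formats(pack_hash, WITH_FUTURE_HASH))  # two candidates
```

With only today's two algorithms the guess happens to be unique; as
soon as a second 64-character algorithm exists, the length alone can no
longer decide.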

> So we basically have three different options:
> 
>   - Accept that we just don't handle this case correctly and let the
>     code error out. This pessimizes all hashes but SHA1.
> 
>   - Bail out when outside of a repository when `--object-format=` wasn't
>     given. This pessimizes all hashes, but gives a clear indicator to
>     the user why things don't work.

This is what I would recommend.

>   - Introduce packfiles v3 and encode the object format into the header.
>     Then do either (1) or (2) on top.

I think we have pack v3 already (which is the same as v2), and v4 was
for an experimental format that never landed fully.  Maybe v5?

If you wanted to do this, you could add support for arbitrary chunks,
like with multi-pack indexes, that would allow for extensibility in the
future.  However, you'd also need some protocol capabilities if you
want to send pack v5 or certain chunks over the protocol.
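
To make the chunk idea concrete, here is a rough sketch of a chunk
table of contents in the style used by multi-pack indexes: 12-byte
entries holding a 4-byte ID and an 8-byte offset, closed by a zero ID.
The "OFMT" and "XTRA" chunk IDs and the "s256" payload are hypothetical,
no such pack format exists, and offsets here are relative to the start
of the buffer for simplicity.

```python
import struct

TERMINATOR = b"\0\0\0\0"
ENTRY = ">4sQ"   # 4-byte chunk ID + 8-byte big-endian offset = 12 bytes


def build_chunked(chunks):
    """Serialize [(chunk_id, payload)] as a table of contents plus data."""
    offset = (len(chunks) + 1) * 12   # data starts right after the table
    toc, body = b"", b""
    for chunk_id, payload in chunks:
        toc += struct.pack(ENTRY, chunk_id, offset)
        body += payload
        offset += len(payload)
    # The terminating entry records where the last chunk ends.
    toc += struct.pack(ENTRY, TERMINATOR, offset)
    return toc + body


def parse_chunked(buf):
    """Return {chunk_id: payload}; each chunk ends where the next begins."""
    entries, pos = [], 0
    while True:
        chunk_id, offset = struct.unpack_from(ENTRY, buf, pos)
        entries.append((chunk_id, offset))
        pos += 12
        if chunk_id == TERMINATOR:
            break
    return {cid: buf[off:entries[i + 1][1]]
            for i, (cid, off) in enumerate(entries[:-1])}


# Hypothetical chunks: an object-format marker plus one unrelated chunk.
data = build_chunked([(b"OFMT", b"s256"), (b"XTRA", b"hello")])
chunks = parse_chunked(data)
print(chunks.get(b"OFMT"))
```

A reader that only cares about the object format looks up one ID and
ignores chunks it does not recognize, which is where the extensibility
comes from.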

> The last option is of course the cleanest, but also the most involved.

I'd personally recommend just requiring the `--object-format=` option,
but of course if you want to write pack v5, don't let me stop you.
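
The policy of requiring `--object-format=` outside a repository could
look roughly like the sketch below. It is a model of the decision only:
the function, the exception, and the message text are assumptions made
for this example and are not taken from Git's source.

```python
# Sketch of the "bail out when outside a repository" policy.
# All names and the error wording are illustrative assumptions.


class ObjectFormatRequired(Exception):
    """Raised when the object format cannot be determined."""


def resolve_object_format(cli_format=None, repo_format=None):
    """Pick the hash algorithm used to parse a pack and its index.

    cli_format:  value of --object-format=, or None if not given
    repo_format: the repository's object format, or None outside a repo
    """
    if cli_format is not None:
        return cli_format      # an explicit request always wins
    if repo_format is not None:
        return repo_format     # inside a repository: use its format
    # Outside a repository with no option: refuse instead of guessing SHA-1.
    raise ObjectFormatRequired(
        "not in a repository; please specify --object-format=<format>")


print(resolve_object_format(cli_format="sha256"))
print(resolve_object_format(repo_format="sha1"))
try:
    resolve_object_format()
except ObjectFormatRequired as err:
    print("fatal:", err)
```

The point of the third branch is that the failure names the missing
information, rather than surfacing later as a confusing size error.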
-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA

Thread overview: 7+ messages
2024-08-31  6:46 git 2.46.0 crashes when trying to verify-pack outside of a repo Ilya K
2024-09-01 15:26 ` Junio C Hamano
2024-09-01 23:45   ` Patrick Steinhardt
2024-09-02 13:18     ` brian m. carlson [this message]
2024-09-02 13:47       ` Patrick Steinhardt
2024-09-03 15:52         ` Junio C Hamano
2024-09-04  6:26 ` [PATCH] builtin/index-pack: fix segfaults when running " Patrick Steinhardt
