From: Taylor Blau <me@ttaylorr.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
Patrick Steinhardt <ps@pks.im>,
Johannes Schindelin <Johannes.Schindelin@gmx.de>,
Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH v2 3/3] index-pack: allow revisiting REF_DELTA chains
Date: Tue, 6 May 2025 22:08:21 -0400 [thread overview]
Message-ID: <aBrAlc8k8uxzrIV9@nand.local> (raw)
In-Reply-To: <1358039b2f3bf893fffc63c1065f1d6862b74957.1745871885.git.gitgitgadget@gmail.com>
On Mon, Apr 28, 2025 at 08:24:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> The crux of the matter is how the algorithm works when the REF_DELTAs
> point to base objects that exist in the local repository.
Hmm. I'm having trouble squaring this with these next two paragraphs:
> Consider the case where the packfile has two REF_DELTA objects, A and B,
> and the delta chain looks like "A depends on B" and "B depends on C" for
> some third object C, where C is already in the current repository. The
> algorithm _should_ start with all objects that depend on C, finding B,
> and then moving on to all objects depending on B, finding A.
>
> However, if the repository also already has object B, then the delta
> chain can be analyzed in a different order. The deltas with base B can
> be analyzed first, finding A, and then the deltas with base C are
> analyzed, finding B. The algorithm currently continues to look for
> objects that depend on B, finding A again. This fails due to A's
> 'real_type' member already being overwritten from OBJ_REF_DELTA to the
> correct object type.
ISTM that this A->B->C chain is a problem because (in the above example)
the server sent B as a REF_DELTA base for A but also had its own
pre-existing copy of B.
But the first quoted sentence suggests that the issue is with REF_DELTAs
that point to base objects that exist in the local repository. Does
"point to" mean that the REF_DELTA's base is the local object, or that
the local object itself was sent as a REF_DELTA against some other base?
I haven't fully wrapped my head around the implications of this all yet,
but I think that it's the former, though admittedly even typing this I
am not quite sure of myself. I *think* that doing this is OK if the only
path from base objects to their corresponding deltas is unique, and/or
there were no such paths at all.
I'm trying to think through the implications of this against my
series[1] from a while ago that converts OFS_DELTAs that weren't usable
as part of verbatim pack-reuse into REF_DELTAs. There are two cases
there that I was considering:
- For some (delta, base) pair in a pack, there was an additional copy
of 'base' in some other pack, and the MIDX chose the copy from that
pack. That forms what I'm calling a "cross-pack" delta. This isn't
currently reusable as an OFS_DELTA for a variety of reasons, but is
OK as a REF_DELTA provided we know that the client either already
has the base object or we are sending it as part of the pack anyway.
- The other case is that we the client wants the delta-half of a
delta/base-pair, but not the base object. In this case, we can't
currently reuse the OFS_DELTA verbatim, but could if we converted it
to a REF_DELTA based on the knowledge that the client already has
the base object.
The latter is doable based on the information in the wants/haves bitmap.
The process there looks like: determine that the client doesn't want the
base object, realize that its bit is set in the 'haves' bitmap, and then
convert the delta object from a OFS_DELTA to a REF_DELTA.
But I think that all breaks for older clients that don't meet the unique
paths condition above. Does that sound right to you?
I think the cross-pack case is fine, provided we know ahead of time that
the client doesn't have the (converted-to-REF_DELTA) delta object in its
local copy.
Unfortunately, I think this means that [1] is a bit of a dead-end for
serves that have older clients (running a version that does not include
this patch). At least, I think that's true if we can construct a
situation where the server sends a REF_DELTA that it thinks the client
doesn't have but actually does. I'm not immediately sure what such a
situation would look like beyond cases like: "the client verbatim asked
for an object it already has, but isn't reachable from the set of
provided 'haves'".
Thanks,
Taylor
[1]: https://lore.kernel.org/git/cover.1728505840.git.me@ttaylorr.com/
next prev parent reply other threads:[~2025-05-07 2:08 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-23 17:40 [PATCH 0/3] Fix REF_DELTA chain bug in 'git index-pack' Derrick Stolee via GitGitGadget
2025-04-23 17:40 ` [PATCH 1/3] test-tool: add pack-deltas helper Derrick Stolee via GitGitGadget
2025-04-23 19:26 ` Junio C Hamano
2025-04-23 19:32 ` Derrick Stolee
2025-04-24 19:41 ` Junio C Hamano
2025-04-24 20:06 ` Derrick Stolee
2025-04-24 20:56 ` Junio C Hamano
2025-04-25 4:34 ` Patrick Steinhardt
2025-04-25 9:34 ` Johannes Schindelin
2025-04-25 9:45 ` Patrick Steinhardt
2025-04-25 9:51 ` Johannes Schindelin
2025-04-25 16:27 ` Junio C Hamano
2025-04-28 15:22 ` Derrick Stolee
2025-04-28 16:37 ` Junio C Hamano
2025-04-28 18:59 ` Derrick Stolee
2025-04-28 20:35 ` Junio C Hamano
2025-04-23 17:40 ` [PATCH 2/3] t5309: create failing test for 'git index-pack' Derrick Stolee via GitGitGadget
2025-04-23 19:37 ` Junio C Hamano
2025-04-23 17:40 ` [PATCH 3/3] index-pack: allow revisiting REF_DELTA chains Derrick Stolee via GitGitGadget
2025-04-24 21:41 ` Junio C Hamano
2025-04-25 3:49 ` Derrick Stolee
2025-04-28 20:24 ` [PATCH v2 0/3] Fix REF_DELTA chain bug in 'git index-pack' Derrick Stolee via GitGitGadget
2025-04-28 20:24 ` [PATCH v2 1/3] test-tool: add pack-deltas helper Derrick Stolee via GitGitGadget
2025-04-28 20:24 ` [PATCH v2 2/3] t5309: create failing test for 'git index-pack' Derrick Stolee via GitGitGadget
2025-04-28 20:24 ` [PATCH v2 3/3] index-pack: allow revisiting REF_DELTA chains Derrick Stolee via GitGitGadget
2025-05-07 2:08 ` Taylor Blau [this message]
2025-05-07 13:47 ` Derrick Stolee
2025-04-28 22:40 ` [PATCH v2 0/3] Fix REF_DELTA chain bug in 'git index-pack' Junio C Hamano
2025-04-29 5:33 ` Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aBrAlc8k8uxzrIV9@nand.local \
--to=me@ttaylorr.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=ps@pks.im \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).