All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
	Patrick Steinhardt <ps@pks.im>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH v2 3/3] index-pack: allow revisiting REF_DELTA chains
Date: Tue, 6 May 2025 22:08:21 -0400	[thread overview]
Message-ID: <aBrAlc8k8uxzrIV9@nand.local> (raw)
In-Reply-To: <1358039b2f3bf893fffc63c1065f1d6862b74957.1745871885.git.gitgitgadget@gmail.com>

On Mon, Apr 28, 2025 at 08:24:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> The crux of the matter is how the algorithm works when the REF_DELTAs
> point to base objects that exist in the local repository.

Hmm. I'm having trouble squaring this with these next two paragraphs:

> Consider the case where the packfile has two REF_DELTA objects, A and B,
> and the delta chain looks like "A depends on B" and "B depends on C" for
> some third object C, where C is already in the current repository. The
> algorithm _should_ start with all objects that depend on C, finding B,
> and then moving on to all objects depending on B, finding A.
>
> However, if the repository also already has object B, then the delta
> chain can be analyzed in a different order. The deltas with base B can
> be analyzed first, finding A, and then the deltas with base C are
> analyzed, finding B. The algorithm currently continues to look for
> objects that depend on B, finding A again. This fails due to A's
> 'real_type' member already being overwritten from OBJ_REF_DELTA to the
> correct object type.

ISTM that this A->B->C chain is a problem because (in the above example)
the server sent B as a REF_DELTA base for A but also had its own
pre-existing copy of B.

But the first quoted sentence suggests that the issue is with REF_DELTAs
that point to base objects that exist in the local repository. Does
"point to" mean that the REF_DELTA's base is the local object, or that
the local object itself was sent as a REF_DELTA against some other base?

I haven't fully wrapped my head around the implications of this all yet,
but I think that it's the former, though admittedly even typing this I
am not quite sure of myself. I *think* that doing this is OK if the only
path from base objects to their corresponding deltas is unique, and/or
there were no such paths at all.

I'm trying to think through the implications of this against my
series[1] from a while ago that converts OFS_DELTAs that weren't usable
as part of verbatim pack-reuse into REF_DELTAs. There are two cases
there that I was considering:

  - For some (delta, base) pair in a pack, there was an additional copy
    of 'base' in some other pack, and the MIDX chose the copy from that
    pack. That forms what I'm calling a "cross-pack" delta. This isn't
    currently reusable as an OFS_DELTA for a variety of reasons, but is
    OK as a REF_DELTA provided we know that the client either already
    has the base object or we are sending it as part of the pack anyway.

  - The other case is that we the client wants the delta-half of a
    delta/base-pair, but not the base object. In this case, we can't
    currently reuse the OFS_DELTA verbatim, but could if we converted it
    to a REF_DELTA based on the knowledge that the client already has
    the base object.

The latter is doable based on the information in the wants/haves bitmap.
The process there looks like: determine that the client doesn't want the
base object, realize that its bit is set in the 'haves' bitmap, and then
convert the delta object from a OFS_DELTA to a REF_DELTA.

But I think that all breaks for older clients that don't meet the unique
paths condition above. Does that sound right to you?

I think the cross-pack case is fine, provided we know ahead of time that
the client doesn't have the (converted-to-REF_DELTA) delta object in its
local copy.

Unfortunately, I think this means that [1] is a bit of a dead-end for
serves that have older clients (running a version that does not include
this patch). At least, I think that's true if we can construct a
situation where the server sends a REF_DELTA that it thinks the client
doesn't have but actually does. I'm not immediately sure what such a
situation would look like beyond cases like: "the client verbatim asked
for an object it already has, but isn't reachable from the set of
provided 'haves'".

Thanks,
Taylor

[1]: https://lore.kernel.org/git/cover.1728505840.git.me@ttaylorr.com/

  reply	other threads:[~2025-05-07  2:08 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-23 17:40 [PATCH 0/3] Fix REF_DELTA chain bug in 'git index-pack' Derrick Stolee via GitGitGadget
2025-04-23 17:40 ` [PATCH 1/3] test-tool: add pack-deltas helper Derrick Stolee via GitGitGadget
2025-04-23 19:26   ` Junio C Hamano
2025-04-23 19:32     ` Derrick Stolee
2025-04-24 19:41   ` Junio C Hamano
2025-04-24 20:06     ` Derrick Stolee
2025-04-24 20:56       ` Junio C Hamano
2025-04-25  4:34   ` Patrick Steinhardt
2025-04-25  9:34     ` Johannes Schindelin
2025-04-25  9:45       ` Patrick Steinhardt
2025-04-25  9:51         ` Johannes Schindelin
2025-04-25 16:27         ` Junio C Hamano
2025-04-28 15:22           ` Derrick Stolee
2025-04-28 16:37             ` Junio C Hamano
2025-04-28 18:59               ` Derrick Stolee
2025-04-28 20:35                 ` Junio C Hamano
2025-04-23 17:40 ` [PATCH 2/3] t5309: create failing test for 'git index-pack' Derrick Stolee via GitGitGadget
2025-04-23 19:37   ` Junio C Hamano
2025-04-23 17:40 ` [PATCH 3/3] index-pack: allow revisiting REF_DELTA chains Derrick Stolee via GitGitGadget
2025-04-24 21:41   ` Junio C Hamano
2025-04-25  3:49     ` Derrick Stolee
2025-04-28 20:24 ` [PATCH v2 0/3] Fix REF_DELTA chain bug in 'git index-pack' Derrick Stolee via GitGitGadget
2025-04-28 20:24   ` [PATCH v2 1/3] test-tool: add pack-deltas helper Derrick Stolee via GitGitGadget
2025-04-28 20:24   ` [PATCH v2 2/3] t5309: create failing test for 'git index-pack' Derrick Stolee via GitGitGadget
2025-04-28 20:24   ` [PATCH v2 3/3] index-pack: allow revisiting REF_DELTA chains Derrick Stolee via GitGitGadget
2025-05-07  2:08     ` Taylor Blau [this message]
2025-05-07 13:47       ` Derrick Stolee
2025-04-28 22:40   ` [PATCH v2 0/3] Fix REF_DELTA chain bug in 'git index-pack' Junio C Hamano
2025-04-29  5:33     ` Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aBrAlc8k8uxzrIV9@nand.local \
    --to=me@ttaylorr.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.