git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Taylor Blau <me@ttaylorr.com>,
	 git@vger.kernel.org,  Elijah Newren <newren@gmail.com>,
	 Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH 00/11] pack-bitmap: convert offset to ref deltas where possible
Date: Fri, 11 Oct 2024 09:23:43 -0700	[thread overview]
Message-ID: <xmqqy12uhals.fsf@gitster.g> (raw)
In-Reply-To: <20241011080149.GA560372@coredump.intra.peff.net> (Jeff King's message of "Fri, 11 Oct 2024 04:01:49 -0400")

Jeff King <peff@peff.net> writes:

> On Fri, Oct 11, 2024 at 03:54:51AM -0400, Jeff King wrote:
>
>> [1] I wish we had good names to distinguish the various cases, because
>>     the term "reuse" is kind of overloaded. The "slower" regular
>>     object-sending path may still reuse verbatim bytes found in an
>>     on-disk path. But this "blit out matching parts of a pack without
>>     otherwise considering the objects" feature happens outside of that.
>>     We called it "pack reuse" back in 2013, but that was not a good name
>>     even then. I don't have a good suggestion, though.
>
> Actually, confusing things more, there are really _three_ layers of
> reuse:
>
>   1. At the beginning of a pack, we can blit out the bytes for objects
>      starting from the beginning of the pack that are being sent (we
>      know any delta will be satisfied since its base comes before it).

Yes, I wouldn't be worried about that one.  The data encoded as an
ofs-delta in this section already point at their base correctly in
the original pack, and in the resulting pack.

>   2. After that, we process objects one by one, but do so very cheaply
>      by just deciding if we can blit them out one by one, fixing up
>      delta base offsets to account for gaps.

This is the part I said "we have to remember where the base was emitted
and subtract it from the offset of the delta anyway even if we are
reusing delta from the same pack, so what do we need a separate code path
for this?" in my initial response.

I guess, "fixing up" could be done by using the difference between
offsets in the original pack for this step, which would be an
unfortunate design that prevents it from getting reused.

>   3. Otherwise, we generate an object_entry struct in packing_data for
>      them, try to find new deltas, and so on. We may then reuse the
>      on-disk bytes after deciding they're suitable.

It is a bit unfortunate, if we were to trust the existing delta base
selection in the pack like we did since Feb 2006 [*], we should be
omitting the "try to find new deltas" step.  Perhaps that comes for
free as the object_entry knows that our object has a delta_base?

> We call all of these "reuse", and certainly both (1) and (2) are "pack
> reuse", but I think that term is sufficiently vague that it could apply
> to (3) as well.
>
> -Peff

  reply	other threads:[~2024-10-11 16:23 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-09 20:30 [PATCH 00/11] pack-bitmap: convert offset to ref deltas where possible Taylor Blau
2024-10-09 20:31 ` [PATCH 01/11] pack-bitmap.c: do not pass `pack_pos` to `try_partial_reuse()` Taylor Blau
2024-10-09 20:31 ` [PATCH 02/11] pack-bitmap.c: avoid unnecessary `offset_to_pack_pos()` Taylor Blau
2024-10-09 20:31 ` [PATCH 03/11] pack-bitmap.c: delay calling 'offset_to_pack_pos()' Taylor Blau
2024-10-09 20:31 ` [PATCH 04/11] pack-bitmap.c: compare `base_offset` to `delta_obj_offset` Taylor Blau
2024-10-09 20:31 ` [PATCH 05/11] pack-bitmap.c: extract `find_base_bitmap_pos()` Taylor Blau
2024-10-09 20:31 ` [PATCH 06/11] pack-bitmap: drop `from_midx` field from `bitmapped_pack` Taylor Blau
2024-10-09 20:31 ` [PATCH 07/11] write_reused_pack_one(): translate bit positions directly Taylor Blau
2024-10-11  8:16   ` Jeff King
2024-11-04 20:36     ` Taylor Blau
2024-10-09 20:31 ` [PATCH 08/11] t5332: enable OFS_DELTAs via test_pack_objects_reused Taylor Blau
2024-10-11  8:19   ` Jeff King
2024-11-04 20:50     ` Taylor Blau
2024-10-09 20:31 ` [PATCH 09/11] pack-bitmap: enable cross-pack delta reuse Taylor Blau
2024-10-11  8:31   ` Jeff King
2024-11-04 21:00     ` Taylor Blau
2024-10-09 20:31 ` [PATCH 10/11] pack-bitmap.c: record whether the result was filtered Taylor Blau
2024-10-11  8:35   ` Jeff King
2024-11-04 21:01     ` Taylor Blau
2024-10-09 20:31 ` [PATCH 11/11] pack-bitmap: enable reusing deltas with base objects in 'haves' bitmap Taylor Blau
2024-10-10 16:46 ` [PATCH 00/11] pack-bitmap: convert offset to ref deltas where possible Junio C Hamano
2024-10-10 17:07   ` Taylor Blau
2024-10-10 20:20     ` Junio C Hamano
2024-10-10 20:32       ` Taylor Blau
2024-10-11  7:54       ` Jeff King
2024-10-11  8:01         ` Jeff King
2024-10-11 16:23           ` Junio C Hamano [this message]
2024-10-11  8:38 ` Jeff King
2024-11-19 23:08   ` Taylor Blau
2024-11-19 23:34     ` Taylor Blau
2024-12-18 12:57       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqy12uhals.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).