git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Han Young <hanyang.tony@bytedance.com>
Cc: git@vger.kernel.org,  calvinwan@google.com,
	 jonathantanmy@google.com, sokcevic@google.com,
	 phillip.wood123@gmail.com
Subject: Re: [PATCH v2 0/3] repack: pack everything into promisor packfile in partial repos
Date: Tue, 08 Oct 2024 14:57:42 -0700	[thread overview]
Message-ID: <xmqq4j5mz295.fsf@gitster.g> (raw)
In-Reply-To: <20241008081350.8950-1-hanyang.tony@bytedance.com> (Han Young's message of "Tue, 8 Oct 2024 16:13:47 +0800")

Han Young <hanyang.tony@bytedance.com> writes:

> As suggested by Jonathan[1], there are number of ways to fix this issue.
> We have already explored some of them in this thread, and so far none of them
> is satisfiable. Calvin and I tried to address the problem from fetch-pack side
> and rev-list side. But the fix either consumes too much CPU power or results
> in inefficient bandwidth use.
>
> So let's attack the problem from repack side. The goal is to prevent repack
> from discarding local objects, previously it is done by carefully
> separating promisor objects and normal objects in rev-list.
> The implementation is flawed and no solution have been found so far.
> Instead, we can get ride of rev-list and just pack everything into promisor
> files. This way, no objects would be lost.
>
> By using 'repack everything', repacking requires less work and we are not
> using more bandwidth.

OK, perhaps.

> Packing local objects into promisor packfiles means that it is no longer
> possible to detect if an object is missing due to repository corruption
> or because we need to fetch it from a promisor remote.

Is it true that without doing this, we can tell between these two
cases, though?  More importantly, even if it is true, would there be
a practical difference?

In the sample scenario used in [1/3] where you created C2/T2/B2 on
top of C1/T1/B1 (which came from a promisor remote), somebody else
built C3/T3/B3 on top, and it came back from the promisor remote,
you could lose 3's objects and 1's objects and they can be refetched
but even if you lose 2's objects, since 3's objects are building on
top of them, you should be able to fetch them from the promisor
remote just like objects from 1 and 3, no?  So strictly speaking,
missing 2's objects may be "repository corruption" while missing 1's
and 3's objects may not be, would there be a practical use for that
information?

> Promisor objects packing does not benefiting from the history and
> path based delta calculation, and GC does not remove unreachable promisor
> objects. By packing locally created normal objects into promisor packfile,
> normal objects are converted into promisor objects. However, in partial cloned
> repos, the number of locally created objects are small compared to promisor
> objects. The impact should be negligible.

> [1] https://lore.kernel.org/git/20240813004508.2768102-1-jonathantanmy@google.com/
>
> *** Changes since v1 ***
> Added tradeoffs in cover letter.
> Fixed some partial clone test cases.
> Updated partial clone documentation.

These patches are based on the tip of master before 365529e1 (Merge
branch 'ps/leakfixes-part-7', 2024-10-02), which will give mildly
annoying conflicts when merged to 'seen'.

I've managed to apply and then merge, so unless review discussions
find needs for updates, there is no need for immediate reroll, but
if you end up having to update these patches, it is a good idea to
rebase the topic on top of v2.47.0 that was released early this
week, as we are now entering a new development cycle.

Thanks.


>
> Han Young (3):
>   repack: pack everything into packfile
>   t0410: adapt tests to repack changes
>   partial-clone: update doc
>
>  Documentation/technical/partial-clone.txt |  16 +-
>  builtin/repack.c                          | 257 ++++++++++++----------
>  t/t0410-partial-clone.sh                  |  68 +-----
>  t/t5616-partial-clone.sh                  |   9 +-
>  4 files changed, 157 insertions(+), 193 deletions(-)

  parent reply	other threads:[~2024-10-08 21:57 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-02  7:31 [PATCH 0/1] revision: fix reachable objects being gc'ed in no blob clone repo Han Young
2024-08-02  7:31 ` [PATCH 1/1] revision: don't set parents as uninteresting if exclude promisor objects Han Young
2024-08-02 16:45   ` Junio C Hamano
2024-08-12 12:34     ` [External] " 韩仰
2024-08-12 16:09       ` Junio C Hamano
2024-08-22  8:28         ` 韩仰
2024-08-13  0:45 ` [PATCH 0/1] revision: fix reachable objects being gc'ed in no blob clone repo Jonathan Tan
2024-08-13 17:18   ` Jonathan Tan
2024-08-14  4:10     ` Junio C Hamano
2024-08-14 19:30       ` Jonathan Tan
2024-08-23 12:43 ` [WIP v2 0/4] " Han Young
2024-08-23 12:43   ` [WIP v2 1/4] packfile: split promisor objects oidset into two Han Young
2024-08-23 12:43   ` [WIP v2 2/4] revision: add exclude-promisor-pack-objects option Han Young
2024-08-23 12:43   ` [WIP v2 3/4] revision: don't mark commit as UNINTERESTING if --exclude-promisor-objects is set Han Young
2024-08-23 12:43   ` [WIP v2 4/4] repack: use new exclude promisor pack objects option Han Young
2024-09-19 23:47 ` [PATCH 0/2] revision: fix reachable commits being gc'ed in partial repo Calvin Wan
2024-09-19 23:47 ` [PATCH 1/2] packfile: split promisor objects oidset into two Calvin Wan
2024-09-22  6:37   ` Junio C Hamano
2024-09-19 23:47 ` [PATCH 2/2] fetch-pack.c: do not declare local commits as "have" in partial repos Calvin Wan
2024-09-22  6:53   ` Junio C Hamano
2024-09-22 16:41     ` Junio C Hamano
2024-09-23  3:44     ` [External] " 韩仰
2024-09-23 16:21       ` Junio C Hamano
2024-10-02 22:35       ` Calvin Wan
2024-09-25  7:20 ` [PATCH 0/2] repack: pack everything into promisor packfile " Han Young
2024-09-25  7:20   ` [PATCH 1/2] repack: pack everything into packfile Han Young
2024-09-25  7:20   ` [PATCH 2/2] t0410: adapt tests to repack changes Han Young
2024-09-25 15:20   ` [PATCH 0/2] repack: pack everything into promisor packfile in partial repos Phillip Wood
2024-09-25 16:48     ` Junio C Hamano
2024-09-25 17:03   ` Junio C Hamano
2024-10-01 19:17 ` Missing Promisor Objects in Partial Repo Design Doc Calvin Wan
2024-10-01 19:35   ` Junio C Hamano
2024-10-02  2:54   ` Junio C Hamano
2024-10-02  7:57     ` [External] " Han Young
2024-10-08 21:35     ` Calvin Wan
2024-10-09  6:46       ` [External] " Han Young
2024-10-09 18:34         ` Jonathan Tan
2024-10-12  2:05           ` Jonathan Tan
2024-10-12  3:30             ` Han Young
2024-10-14 17:52               ` Jonathan Tan
2024-10-09 18:53     ` Jonathan Tan
2024-10-08  8:13 ` [PATCH v2 0/3] repack: pack everything into promisor packfile in partial repos Han Young
2024-10-08  8:13   ` [PATCH v2 1/3] repack: pack everything into packfile Han Young
2024-10-08 21:41     ` Calvin Wan
2024-10-08  8:13   ` [PATCH v2 2/3] t0410: adapt tests to repack changes Han Young
2024-10-08  8:13   ` [PATCH v2 3/3] partial-clone: update doc Han Young
2024-10-08 21:57   ` Junio C Hamano [this message]
2024-10-08 22:43     ` [PATCH v2 0/3] repack: pack everything into promisor packfile in partial repos Junio C Hamano
2024-10-09  6:31     ` [External] " Han Young
2024-10-11  8:24 ` [PATCH v3 " Han Young
2024-10-11  8:24   ` [PATCH v3 1/3] repack: pack everything into packfile Han Young
2024-10-11  8:24   ` [PATCH v3 2/3] repack: adapt tests to repack changes Han Young
2024-10-11  8:24   ` [PATCH v3 3/3] partial-clone: update doc Han Young
2024-10-11 18:18   ` [PATCH v3 0/3] repack: pack everything into promisor packfile in partial repos Junio C Hamano
2024-10-11 18:23     ` Junio C Hamano
2024-10-14  3:25 ` [PATCH v4 " Han Young
2024-10-14  3:25   ` [PATCH v4 1/3] repack: pack everything into packfile Han Young
2024-10-14  3:25   ` [PATCH v4 2/3] t0410: adapt tests to repack changes Han Young
2024-10-14  3:25   ` [PATCH v4 3/3] partial-clone: update doc Han Young
2024-10-21 22:29   ` [WIP 0/3] Repack on fetch Jonathan Tan
2024-10-21 22:29     ` [WIP 1/3] move variable Jonathan Tan
2024-10-21 22:29     ` [WIP 2/3] pack-objects Jonathan Tan
2024-10-21 22:29     ` [WIP 3/3] record local links and call pack-objects Jonathan Tan
2024-10-23  7:00     ` [External] [WIP 0/3] Repack on fetch Han Young
2024-10-23 17:03       ` Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqq4j5mz295.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=calvinwan@google.com \
    --cc=git@vger.kernel.org \
    --cc=hanyang.tony@bytedance.com \
    --cc=jonathantanmy@google.com \
    --cc=phillip.wood123@gmail.com \
    --cc=sokcevic@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).