git.vger.kernel.org archive mirror
From: Phillip Wood <phillip.wood123@gmail.com>
To: Han Young <hanyang.tony@bytedance.com>, git@vger.kernel.org
Cc: calvinwan@google.com, jonathantanmy@google.com,
	sokcevic@google.com, gitster@pobox.com
Subject: Re: [PATCH 0/2] repack: pack everything into promisor packfile in partial repos
Date: Wed, 25 Sep 2024 16:20:55 +0100
Message-ID: <a5e3322d-4e63-4b8c-84af-6578fe257cad@gmail.com>
In-Reply-To: <20240925072021.77078-1-hanyang.tony@bytedance.com>

Hi Han

On 25/09/2024 08:20, Han Young wrote:
> As suggested by Jonathan[1], there are a number of ways to fix this issue.
> We have already explored some of them in this thread, and so far none of them
> is satisfactory. Calvin and I tried to address the problem from the fetch-pack
> side and the rev-list side, but the fix either consumes too much CPU power or
> results in inefficient bandwidth use.

I was wondering if it would be possible to cache the tip commits in 
promisor packs when repacking so that a subsequent repack only has to 
walk the commits added since the last repack when it is trying to figure 
out if a local object should be moved into a promisor pack.
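
A minimal sketch of that caching idea, assuming the commit graph is
available as a plain mapping from commit id to parent ids and that the
previous repack saved a set of tip commits. The function name and data
shapes are hypothetical and are not part of git:

```python
from collections import deque


def commits_since_cached_tips(parents, new_tips, cached_tips):
    """Return the commits reachable from new_tips, stopping at cached tips.

    parents:     dict mapping commit id -> list of parent ids
                 (hypothetical in-memory stand-in for the commit graph)
    new_tips:    tips to walk from in this repack
    cached_tips: tips recorded by the previous repack (assumed cache)
    """
    seen = set()
    queue = deque(t for t in new_tips if t not in cached_tips)
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        for parent in parents.get(commit, []):
            # A cached tip was already handled by the previous repack,
            # so the walk does not descend through it.
            if parent not in cached_tips and parent not in seen:
                queue.append(parent)
    return seen
```

The walk only stops when it reaches a cached tip directly. History that
bypasses every cached tip is still walked in full, so a real
implementation would probably need to mark the ancestors of the cached
tips as uninteresting as well.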

> So let's attack the problem from the repack side. The goal is to prevent
> repack from discarding local objects. Previously this was done by carefully
> separating promisor objects and normal objects in rev-list.
> That implementation is flawed and no solution has been found so far.
> Instead, we can get rid of rev-list and just pack everything into promisor
> files. This way, no objects would be lost.
> 
> By using 'repack everything', repacking requires less work and we are not
> using more bandwidth. The only downside is that packing normal objects does
> not benefit from the history- and path-based delta calculation.

I've just been looking at Documentation/technical/partial-clone.txt and
I think there are a couple of other implications of this change:

 > An object may be missing due to a partial clone or fetch, or missing
 > due to repository corruption.  To differentiate these cases, the
 > local repository specially indicates such filtered packfiles
 > obtained from promisor remotes as "promisor packfiles".

Packing local objects into promisor packfiles means that it is no longer 
possible to detect if an object is missing due to repository corruption 
or because we need to fetch it from a promisor remote.
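
To illustrate, here is a simplified model, assuming a pack is a dict
with a "promisor" flag and an "objects" mapping from object id to the
ids it references. These names and structures are hypothetical and only
mimic the rule that objects in promisor packfiles may reference missing
objects:

```python
def promised_objects(packs):
    """Collect ids referenced by any object stored in a promisor pack."""
    promised = set()
    for pack in packs:
        if pack["promisor"]:
            for refs in pack["objects"].values():
                promised.update(refs)
    return promised


def classify_missing(oid, packs):
    """Classify oid as 'present', 'lazy-fetch' or 'corrupt' (toy model)."""
    present = {o for pack in packs for o in pack["objects"]}
    if oid in present:
        return "present"
    # A missing object is only excusable if a promisor pack refers to it.
    return "lazy-fetch" if oid in promised_objects(packs) else "corrupt"


# Before the change: local objects live in a non-promisor pack.
before = [
    {"promisor": True, "objects": {"c1": ["t1"], "t1": ["b1"]}},
    {"promisor": False, "objects": {"c2": ["t2"], "t2": ["b2"]}},
]
# After the change: everything is packed into one promisor pack.
after = [
    {"promisor": True,
     "objects": {"c1": ["t1"], "t1": ["b1"], "c2": ["t2"], "t2": ["b2"]}},
]
```

In this model the missing blob "b2" is reported as corruption with the
"before" layout, but looks like an ordinary lazy fetch with the "after"
layout.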

 > `repack` in GC has been updated to not touch promisor packfiles at
 > all, and to only repack other objects.

Packing local objects into promisor packfiles means that GC will
no longer remove unreachable local objects.

It would be helpful if the cover letter or commit messages discussed the 
tradeoffs of these changes and updated that document accordingly.

Best Wishes

Phillip

> The majority of
> objects in a partial repo are promisor objects, so the impact of worse normal
> object repacking is negligible.
> 
> [1] https://lore.kernel.org/git/20240813004508.2768102-1-jonathantanmy@google.com/
> 
> Han Young (2):
>    repack: pack everything into promisor packfile in partial repos
>    t0410: adapt tests to repack changes
> 
>   builtin/repack.c         | 258 ++++++++++++++++++++++-----------------
>   t/t0410-partial-clone.sh |  68 +----------
>   2 files changed, 145 insertions(+), 181 deletions(-)
> 
