From: Han Young <hanyang.tony@bytedance.com>
To: git@vger.kernel.org
Cc: calvinwan@google.com, jonathantanmy@google.com,
sokcevic@google.com, gitster@pobox.com,
phillip.wood123@gmail.com, Han Young <hanyang.tony@bytedance.com>
Subject: [PATCH v3 1/3] repack: pack everything into packfile
Date: Fri, 11 Oct 2024 16:24:02 +0800 [thread overview]
Message-ID: <20241011082404.88939-2-hanyang.tony@bytedance.com> (raw)
In-Reply-To: <20241011082404.88939-1-hanyang.tony@bytedance.com>
In a partial repository, creating a local commit and then fetching
causes the following state to occur:
commit tree blob
C3 ---- T3 -- B3 (fetched from remote, in promisor pack)
|
C2 ---- T2 -- B2 (created locally, in non-promisor pack)
|
C1 ---- T1 -- B1 (fetched from remote, in promisor pack)
During garbage collection, parents of promisor objects are marked as
UNINTERESTING and are subsequently garbage collected. In this case, C2
would be deleted and attempts to access that commit would result in "bad
object" errors (originally reported here[1]).
For partial repos, repacking is done in two steps. We first repack all the
objects in promisor packfile, then repack all the non-promisor objects.
It turns out C2, T2 and B2 are not repacked in either steps, ended up deleted.
We can avoid this by packing everything into a promisor packfile, if the repo
is partial cloned.
[1] https://lore.kernel.org/git/20240802073143.56731-1-hanyang.tony@bytedance.com/
Helped-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Han Young <hanyang.tony@bytedance.com>
---
builtin/repack.c | 46 +++++++++++++++++++++++++++++++++++++++-------
1 file changed, 39 insertions(+), 7 deletions(-)
diff --git a/builtin/repack.c b/builtin/repack.c
index 79f407ca04..50e14ccfc4 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -343,6 +343,23 @@ static int write_oid(const struct object_id *oid,
return 0;
}
+static int write_loose_oid(const struct object_id *oid,
+ const char *path UNUSED,
+ void *data)
+{
+ struct child_process *cmd = data;
+
+ if (cmd->in == -1) {
+ if (start_command(cmd))
+ die(_("could not start pack-objects to repack promisor objects"));
+ }
+
+ if (write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz) < 0 ||
+ write_in_full(cmd->in, "\n", 1) < 0)
+ die(_("failed to feed promisor objects to pack-objects"));
+ return 0;
+}
+
static struct {
const char *name;
unsigned optional:1;
@@ -392,12 +409,15 @@ static int has_pack_ext(const struct generated_pack_data *data,
BUG("unknown pack extension: '%s'", ext);
}
-static void repack_promisor_objects(const struct pack_objects_args *args,
- struct string_list *names)
+static int repack_promisor_objects(const struct pack_objects_args *args,
+ struct string_list *names,
+ struct string_list *list,
+ int pack_all)
{
struct child_process cmd = CHILD_PROCESS_INIT;
FILE *out;
struct strbuf line = STRBUF_INIT;
+ struct string_list_item *item;
prepare_pack_objects(&cmd, args, packtmp);
cmd.in = -1;
@@ -409,13 +429,19 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
* {type -> existing pack order} ordering when computing deltas instead
* of a {type -> size} ordering, which may produce better deltas.
*/
- for_each_packed_object(write_oid, &cmd,
- FOR_EACH_OBJECT_PROMISOR_ONLY);
+ if (pack_all)
+ for_each_packed_object(write_oid, &cmd, 0);
+ else
+ for_each_string_list_item(item, list) {
+ pack_mark_retained(item);
+ }
+
+ for_each_loose_object(write_loose_oid, &cmd, 0);
if (cmd.in == -1) {
/* No packed objects; cmd was never started */
child_process_clear(&cmd);
- return;
+ return 0;
}
close(cmd.in);
@@ -453,6 +479,7 @@ static void repack_promisor_objects(const struct pack_objects_args *args,
if (finish_command(&cmd))
die(_("could not finish pack-objects to repack promisor objects"));
strbuf_release(&line);
+ return 0;
}
struct pack_geometry {
@@ -1356,9 +1383,13 @@ int cmd_repack(int argc,
if (use_delta_islands)
strvec_push(&cmd.args, "--delta-islands");
- if (pack_everything & ALL_INTO_ONE) {
- repack_promisor_objects(&po_args, &names);
+ if (repo_has_promisor_remote(the_repository)) {
+ ret = repack_promisor_objects(&po_args, &names,
+ &existing.non_kept_packs, pack_everything & ALL_INTO_ONE);
+ goto pack_objects_end;
+ }
+ if (pack_everything & ALL_INTO_ONE) {
if (has_existing_non_kept_packs(&existing) &&
delete_redundant &&
!(pack_everything & PACK_CRUFT)) {
@@ -1478,6 +1509,7 @@ int cmd_repack(int argc,
}
}
+pack_objects_end:
if (po_args.filter_options.choice) {
if (!filter_to)
filter_to = packtmp;
--
2.47.0.266.g0b04b6b485.dirty
next prev parent reply other threads:[~2024-10-11 8:24 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-02 7:31 [PATCH 0/1] revision: fix reachable objects being gc'ed in no blob clone repo Han Young
2024-08-02 7:31 ` [PATCH 1/1] revision: don't set parents as uninteresting if exclude promisor objects Han Young
2024-08-02 16:45 ` Junio C Hamano
2024-08-12 12:34 ` [External] " 韩仰
2024-08-12 16:09 ` Junio C Hamano
2024-08-22 8:28 ` 韩仰
2024-08-13 0:45 ` [PATCH 0/1] revision: fix reachable objects being gc'ed in no blob clone repo Jonathan Tan
2024-08-13 17:18 ` Jonathan Tan
2024-08-14 4:10 ` Junio C Hamano
2024-08-14 19:30 ` Jonathan Tan
2024-08-23 12:43 ` [WIP v2 0/4] " Han Young
2024-08-23 12:43 ` [WIP v2 1/4] packfile: split promisor objects oidset into two Han Young
2024-08-23 12:43 ` [WIP v2 2/4] revision: add exclude-promisor-pack-objects option Han Young
2024-08-23 12:43 ` [WIP v2 3/4] revision: don't mark commit as UNINTERESTING if --exclude-promisor-objects is set Han Young
2024-08-23 12:43 ` [WIP v2 4/4] repack: use new exclude promisor pack objects option Han Young
2024-09-19 23:47 ` [PATCH 0/2] revision: fix reachable commits being gc'ed in partial repo Calvin Wan
2024-09-19 23:47 ` [PATCH 1/2] packfile: split promisor objects oidset into two Calvin Wan
2024-09-22 6:37 ` Junio C Hamano
2024-09-19 23:47 ` [PATCH 2/2] fetch-pack.c: do not declare local commits as "have" in partial repos Calvin Wan
2024-09-22 6:53 ` Junio C Hamano
2024-09-22 16:41 ` Junio C Hamano
2024-09-23 3:44 ` [External] " 韩仰
2024-09-23 16:21 ` Junio C Hamano
2024-10-02 22:35 ` Calvin Wan
2024-09-25 7:20 ` [PATCH 0/2] repack: pack everything into promisor packfile " Han Young
2024-09-25 7:20 ` [PATCH 1/2] repack: pack everything into packfile Han Young
2024-09-25 7:20 ` [PATCH 2/2] t0410: adapt tests to repack changes Han Young
2024-09-25 15:20 ` [PATCH 0/2] repack: pack everything into promisor packfile in partial repos Phillip Wood
2024-09-25 16:48 ` Junio C Hamano
2024-09-25 17:03 ` Junio C Hamano
2024-10-01 19:17 ` Missing Promisor Objects in Partial Repo Design Doc Calvin Wan
2024-10-01 19:35 ` Junio C Hamano
2024-10-02 2:54 ` Junio C Hamano
2024-10-02 7:57 ` [External] " Han Young
2024-10-08 21:35 ` Calvin Wan
2024-10-09 6:46 ` [External] " Han Young
2024-10-09 18:34 ` Jonathan Tan
2024-10-12 2:05 ` Jonathan Tan
2024-10-12 3:30 ` Han Young
2024-10-14 17:52 ` Jonathan Tan
2024-10-09 18:53 ` Jonathan Tan
2024-10-08 8:13 ` [PATCH v2 0/3] repack: pack everything into promisor packfile in partial repos Han Young
2024-10-08 8:13 ` [PATCH v2 1/3] repack: pack everything into packfile Han Young
2024-10-08 21:41 ` Calvin Wan
2024-10-08 8:13 ` [PATCH v2 2/3] t0410: adapt tests to repack changes Han Young
2024-10-08 8:13 ` [PATCH v2 3/3] partial-clone: update doc Han Young
2024-10-08 21:57 ` [PATCH v2 0/3] repack: pack everything into promisor packfile in partial repos Junio C Hamano
2024-10-08 22:43 ` Junio C Hamano
2024-10-09 6:31 ` [External] " Han Young
2024-10-11 8:24 ` [PATCH v3 " Han Young
2024-10-11 8:24 ` Han Young [this message]
2024-10-11 8:24 ` [PATCH v3 2/3] repack: adapt tests to repack changes Han Young
2024-10-11 8:24 ` [PATCH v3 3/3] partial-clone: update doc Han Young
2024-10-11 18:18 ` [PATCH v3 0/3] repack: pack everything into promisor packfile in partial repos Junio C Hamano
2024-10-11 18:23 ` Junio C Hamano
2024-10-14 3:25 ` [PATCH v4 " Han Young
2024-10-14 3:25 ` [PATCH v4 1/3] repack: pack everything into packfile Han Young
2024-10-14 3:25 ` [PATCH v4 2/3] t0410: adapt tests to repack changes Han Young
2024-10-14 3:25 ` [PATCH v4 3/3] partial-clone: update doc Han Young
2024-10-21 22:29 ` [WIP 0/3] Repack on fetch Jonathan Tan
2024-10-21 22:29 ` [WIP 1/3] move variable Jonathan Tan
2024-10-21 22:29 ` [WIP 2/3] pack-objects Jonathan Tan
2024-10-21 22:29 ` [WIP 3/3] record local links and call pack-objects Jonathan Tan
2024-10-23 7:00 ` [External] [WIP 0/3] Repack on fetch Han Young
2024-10-23 17:03 ` Jonathan Tan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241011082404.88939-2-hanyang.tony@bytedance.com \
--to=hanyang.tony@bytedance.com \
--cc=calvinwan@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=phillip.wood123@gmail.com \
--cc=sokcevic@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).