git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects
@ 2025-01-29 10:02 Tomáš Trnka
  2025-01-30  2:26 ` brian m. carlson
  2025-02-10 11:38 ` [External] " Han Young
  0 siblings, 2 replies; 7+ messages in thread
From: Tomáš Trnka @ 2025-01-29 10:02 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Junio C Hamano, Jeff King

git-repack currently does not pass --keep-pack or --honor-pack-keep to
the git-pack-objects handling promisor packs. This means that settings
like gc.bigPackThreshold are completely ignored for promisor packs.

The simple fix is to just copy the keep-pack logic into
repack_promisor_objects(), although this could possibly be improved by
making prepare_pack_objects() handle it instead.

Signed-off-by: Tomáš Trnka <trnka@scm.com>
---

RFC: This probably needs a test, but where and how should it be
implemented? Perhaps in t7700-repack.sh, copying one of the tests using
prepare_for_keep_packs and just touching .promisor files? Or instead in
t/t0410-partial-clone.sh using a copy/variant of one of the basic 
repack tests there?

 builtin/repack.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index d6bb37e84a..fe62fe03eb 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -388,15 +388,23 @@ static int has_pack_ext(const struct generated_pack_data 
*data,
 }
 
 static void repack_promisor_objects(const struct pack_objects_args *args,
-				    struct string_list *names)
+				    struct string_list *names,
+				    struct string_list 
*keep_pack_list)
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
 	FILE *out;
 	struct strbuf line = STRBUF_INIT;
+	int i;
 
 	prepare_pack_objects(&cmd, args, packtmp);
 	cmd.in = -1;
 
+	if (!pack_kept_objects)
+		strvec_push(&cmd.args, "--honor-pack-keep");
+	for (i = 0; i < keep_pack_list->nr; i++)
+		strvec_pushf(&cmd.args, "--keep-pack=%s",
+			     keep_pack_list->items[i].string);
+
 	/*
 	 * NEEDSWORK: Giving pack-objects only the OIDs without any 
ordering
 	 * hints may result in suboptimal deltas in the resulting pack. See 
if
@@ -1350,7 +1358,7 @@ int cmd_repack(int argc,
 		strvec_push(&cmd.args, "--delta-islands");
 
 	if (pack_everything & ALL_INTO_ONE) {
-		repack_promisor_objects(&po_args, &names);
+		repack_promisor_objects(&po_args, &names, 
&keep_pack_list);
 
 		if (has_existing_non_kept_packs(&existing) &&
 		    delete_redundant &&

base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f
-- 
2.47.1





^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects
  2025-01-29 10:02 Tomáš Trnka
@ 2025-01-30  2:26 ` brian m. carlson
  2025-01-30  8:09   ` Tomáš Trnka
  2025-02-10 11:38 ` [External] " Han Young
  1 sibling, 1 reply; 7+ messages in thread
From: brian m. carlson @ 2025-01-30  2:26 UTC (permalink / raw)
  To: Tomáš Trnka; +Cc: git, Taylor Blau, Junio C Hamano, Jeff King

[-- Attachment #1: Type: text/plain, Size: 1105 bytes --]

On 2025-01-29 at 10:02:06, Tomáš Trnka wrote:
> diff --git a/builtin/repack.c b/builtin/repack.c
> index d6bb37e84a..fe62fe03eb 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -388,15 +388,23 @@ static int has_pack_ext(const struct generated_pack_data 
> *data,
>  }
>  
>  static void repack_promisor_objects(const struct pack_objects_args *args,
> -				    struct string_list *names)
> +				    struct string_list *names,
> +				    struct string_list 
> *keep_pack_list)

I don't have a strong opinion about the technical aspects of this patch
(nor sufficient knowledge to review it)[0], but I noticed that there's a
couple of places, this line among them, which are unexpectedly wrapped,
so I don't believe this patch will actually apply.  I noticed that the
email didn't specify an MUA header (or I missed it), so I can't make a
suggestion on how to fix your MUA, but you may want to use `git
send-email` to avoid this problem in the future.

[0] In other words, no need to CC me on a resend.
-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects
  2025-01-30  2:26 ` brian m. carlson
@ 2025-01-30  8:09   ` Tomáš Trnka
  0 siblings, 0 replies; 7+ messages in thread
From: Tomáš Trnka @ 2025-01-30  8:09 UTC (permalink / raw)
  To: brian m. carlson, git; +Cc: Taylor Blau, Junio C Hamano, Jeff King

On Thursday, 30 January 2025 3:26:19, CET brian m. carlson wrote:
> I don't have a strong opinion about the technical aspects of this patch
> (nor sufficient knowledge to review it)[0], but I noticed that there's a
> couple of places, this line among them, which are unexpectedly wrapped,
> so I don't believe this patch will actually apply.

Oops, my sincere apologies to everyone. I'll resend once again.

(I messed up by not realizing that KMail will re-enable line wrapping when re-
sending my original message, which was formatted correctly.)

2T



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects
@ 2025-01-30  8:11 Tomáš Trnka
  2025-01-30 22:32 ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Tomáš Trnka @ 2025-01-30  8:11 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Junio C Hamano, Jeff King

git-repack currently does not pass --keep-pack or --honor-pack-keep to
the git-pack-objects handling promisor packs. This means that settings
like gc.bigPackThreshold are completely ignored for promisor packs.

The simple fix is to just copy the keep-pack logic into
repack_promisor_objects(), although this could possibly be improved by
making prepare_pack_objects() handle it instead.

Signed-off-by: Tomáš Trnka <trnka@scm.com>
---

RFC: This probably needs a test, but where and how should it be
implemented? Perhaps in t7700-repack.sh, copying one of the tests using
prepare_for_keep_packs and just touching .promisor files? Or instead in
t/t0410-partial-clone.sh using a copy/variant of one of the basic 
repack tests there?

 builtin/repack.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index d6bb37e84a..fe62fe03eb 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -388,15 +388,23 @@ static int has_pack_ext(const struct generated_pack_data *data,
 }
 
 static void repack_promisor_objects(const struct pack_objects_args *args,
-				    struct string_list *names)
+				    struct string_list *names,
+				    struct string_list *keep_pack_list)
 {
 	struct child_process cmd = CHILD_PROCESS_INIT;
 	FILE *out;
 	struct strbuf line = STRBUF_INIT;
+	int i;
 
 	prepare_pack_objects(&cmd, args, packtmp);
 	cmd.in = -1;
 
+	if (!pack_kept_objects)
+		strvec_push(&cmd.args, "--honor-pack-keep");
+	for (i = 0; i < keep_pack_list->nr; i++)
+		strvec_pushf(&cmd.args, "--keep-pack=%s",
+			     keep_pack_list->items[i].string);
+
 	/*
 	 * NEEDSWORK: Giving pack-objects only the OIDs without any ordering
 	 * hints may result in suboptimal deltas in the resulting pack. See if
@@ -1350,7 +1358,7 @@ int cmd_repack(int argc,
 		strvec_push(&cmd.args, "--delta-islands");
 
 	if (pack_everything & ALL_INTO_ONE) {
-		repack_promisor_objects(&po_args, &names);
+		repack_promisor_objects(&po_args, &names, &keep_pack_list);
 
 		if (has_existing_non_kept_packs(&existing) &&
 		    delete_redundant &&

base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f
-- 
2.47.1





^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects
  2025-01-30  8:11 [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects Tomáš Trnka
@ 2025-01-30 22:32 ` Junio C Hamano
  2025-01-31 15:17   ` Tomáš Trnka
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2025-01-30 22:32 UTC (permalink / raw)
  To: Tomáš Trnka
  Cc: git, Taylor Blau, Jeff King, Jonathan Tan, hanyang.tony

Tomáš Trnka <trnka@scm.com> writes:

> git-repack currently does not pass --keep-pack or --honor-pack-keep to
> the git-pack-objects handling promisor packs. This means that settings
> like gc.bigPackThreshold are completely ignored for promisor packs.
>
> The simple fix is to just copy the keep-pack logic into
> repack_promisor_objects(), although this could possibly be improved by
> making prepare_pack_objects() handle it instead.
>
> Signed-off-by: Tomáš Trnka <trnka@scm.com>
> ---

I don't have a strong opinion about the technical aspects of this
patch to be a reviewer.  I have no idea how you decided whom to Cc:,
but I recall these two threads that worked in the same "interaction
between repack and promisor objects" area (I am not saying and I do
not think they addressed the same problem as you are):

 * https://lore.kernel.org/git/20240925072021.77078-1-hanyang.tony@bytedance.com/
 * https://lore.kernel.org/git/cover.1730491845.git.jonathantanmy@google.com/

so asking review from the authors of these topics would be more
relevant than sending it to me (I did that already, and that is why
the remainder of the original patch is not culled from this
response---after the next line, there is nothing I wrote but the
original patch left for others' convenience).

Thanks.

>
> RFC: This probably needs a test, but where and how should it be
> implemented? Perhaps in t7700-repack.sh, copying one of the tests using
> prepare_for_keep_packs and just touching .promisor files? Or instead in
> t/t0410-partial-clone.sh using a copy/variant of one of the basic 
> repack tests there?
>
>  builtin/repack.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index d6bb37e84a..fe62fe03eb 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -388,15 +388,23 @@ static int has_pack_ext(const struct generated_pack_data *data,
>  }
>  
>  static void repack_promisor_objects(const struct pack_objects_args *args,
> -				    struct string_list *names)
> +				    struct string_list *names,
> +				    struct string_list *keep_pack_list)
>  {
>  	struct child_process cmd = CHILD_PROCESS_INIT;
>  	FILE *out;
>  	struct strbuf line = STRBUF_INIT;
> +	int i;
>  
>  	prepare_pack_objects(&cmd, args, packtmp);
>  	cmd.in = -1;
>  
> +	if (!pack_kept_objects)
> +		strvec_push(&cmd.args, "--honor-pack-keep");
> +	for (i = 0; i < keep_pack_list->nr; i++)
> +		strvec_pushf(&cmd.args, "--keep-pack=%s",
> +			     keep_pack_list->items[i].string);
> +
>  	/*
>  	 * NEEDSWORK: Giving pack-objects only the OIDs without any ordering
>  	 * hints may result in suboptimal deltas in the resulting pack. See if
> @@ -1350,7 +1358,7 @@ int cmd_repack(int argc,
>  		strvec_push(&cmd.args, "--delta-islands");
>  
>  	if (pack_everything & ALL_INTO_ONE) {
> -		repack_promisor_objects(&po_args, &names);
> +		repack_promisor_objects(&po_args, &names, &keep_pack_list);
>  
>  		if (has_existing_non_kept_packs(&existing) &&
>  		    delete_redundant &&
>
> base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects
  2025-01-30 22:32 ` Junio C Hamano
@ 2025-01-31 15:17   ` Tomáš Trnka
  0 siblings, 0 replies; 7+ messages in thread
From: Tomáš Trnka @ 2025-01-31 15:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Taylor Blau, Jeff King, Jonathan Tan, hanyang.tony

On Thursday, 30 January 2025 23:32:34, CET Junio C Hamano wrote:
> I don't have a strong opinion about the technical aspects of this
> patch to be a reviewer.  I have no idea how you decided whom to Cc:,

I'm sorry again for spamming you unnecessarily and many thanks for suggesting 
more suitable reviewers.

Being a complete newcomer to Git development, I went by what the git-contacts 
tool suggested when I gave it my patch, since its output seemed to roughly 
agree with git log on builtin/repack.c, plus I also thought you might have a 
high-level opinion on where the test belongs. I didn't check the mailing list 
archives too thoroughly, so I was not aware of those discussion threads.

2T



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects
  2025-02-10 11:38 ` [External] " Han Young
@ 2025-02-10 12:52   ` Tomáš Trnka
  0 siblings, 0 replies; 7+ messages in thread
From: Tomáš Trnka @ 2025-02-10 12:52 UTC (permalink / raw)
  To: Han Young; +Cc: git

On Monday, 10 February 2025 12:38:17, CET Han Young wrote:
> We repack promisor packs by reading all the objects in promisor packs
> (in repack.c), and send them to pack-objects. pack-objects then write a
> single pack containing all the promisor objects. The actual old promisor
> pack deletion happens in repack.c
> 
> So just simply copying the keep-pack logic to repack_promisor_objects()
> does not prevent the keep promisor packs from being repacked.

I don't know much about the internals so maybe I misinterpreted what I saw, 
but the patch seems to fix the issue I observed:

I have two 40 GiB promisor packs, and as soon as I accumulated 50 additional 
small packs (due to fetches), gc triggered a repack including the two big 
packs, despite them being above the bigPackThreshold so they could be kept. 
Repacking them is very disruptive, both because of the CPU and RAM load this 
produces, and also because this is on btrfs with snapshots, so rewriting those 
80 GiB into a new file means consuming that much extra disk space for no good 
reason.

With my patch, gc did not touch these two big packs but still collected all 
the small ones into one new pack as expected. Everything else also seems to 
work fine.

According to the man page for git-pack-objects, it seems to me that this is 
how it's meant to work, because the description for --keep-pack says "This 
flag causes an object already in the given pack to be ignored, even if it 
would have otherwise been packed." (and something similar for --honor-pack-
keep). To my untrained eyes, it looks like that's also how 
want_found_object()/add_object_entry() in pack-objects.c handle it.

2T



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-02-10 13:01 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-30  8:11 [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects Tomáš Trnka
2025-01-30 22:32 ` Junio C Hamano
2025-01-31 15:17   ` Tomáš Trnka
  -- strict thread matches above, loose matches on Subject: below --
2025-01-29 10:02 Tomáš Trnka
2025-01-30  2:26 ` brian m. carlson
2025-01-30  8:09   ` Tomáš Trnka
2025-02-10 11:38 ` [External] " Han Young
2025-02-10 12:52   ` Tomáš Trnka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).