git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Tomáš Trnka" <trnka@scm.com>
To: Han Young <hanyang.tony@bytedance.com>
Cc: git@vger.kernel.org
Subject: Re: [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects
Date: Mon, 10 Feb 2025 13:52:40 +0100	[thread overview]
Message-ID: <2289498.vFx2qVVIhK@electra> (raw)
In-Reply-To: <CAG1j3zH1xngk0NZUjHA9Akx526yfEiQ=KsdfyRjE9XAewWV=Sg@mail.gmail.com>

On Monday, 10 February 2025 12:38:17, CET Han Young wrote:
> We repack promisor packs by reading all the objects in promisor packs
> (in repack.c), and send them to pack-objects. pack-objects then write a
> single pack containing all the promisor objects. The actual old promisor
> pack deletion happens in repack.c
> 
> So just simply copying the keep-pack logic to repack_promisor_objects()
> does not prevent the keep promisor packs from being repacked.

I don't know much about the internals so maybe I misinterpreted what I saw, 
but the patch seems to fix the issue I observed:

I have two 40 GiB promisor packs, and as soon as I accumulated 50 additional 
small packs (due to fetches), gc triggered a repack including the two big 
packs, despite them being above the bigPackThreshold so they could be kept. 
Repacking them is very disruptive, both because of the CPU and RAM load this 
produces, and also because this is on btrfs with snapshots, so rewriting those 
80 GiB into a new file means consuming that much extra disk space for no good 
reason.

With my patch, gc did not touch these two big packs but still collected all 
the small ones into one new pack as expected. Everything else also seems to 
work fine.

According to the man page for git-pack-objects, it seems to me that this is 
how it's meant to work, because the description for --keep-pack says "This 
flag causes an object already in the given pack to be ignored, even if it 
would have otherwise been packed." (and something similar for --honor-pack-
keep). To my untrained eyes, it looks like that's also how 
want_found_object()/add_object_entry() in pack-objects.c handle it.

2T



  reply	other threads:[~2025-02-10 13:01 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-29 10:02 [RFC PATCH resend] builtin/repack: Honor --keep-pack and .keep when repacking promisor objects Tomáš Trnka
2025-01-30  2:26 ` brian m. carlson
2025-01-30  8:09   ` Tomáš Trnka
2025-02-10 11:38 ` [External] " Han Young
2025-02-10 12:52   ` Tomáš Trnka [this message]
2025-02-19 12:55     ` [External] " Han Young
  -- strict thread matches above, loose matches on Subject: below --
2025-01-30  8:11 Tomáš Trnka
2025-01-30 22:32 ` Junio C Hamano
2025-01-31 15:17   ` Tomáš Trnka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2289498.vFx2qVVIhK@electra \
    --to=trnka@scm.com \
    --cc=git@vger.kernel.org \
    --cc=hanyang.tony@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).