From: Taylor Blau <me@ttaylorr.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
	Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
Date: Mon, 17 Apr 2023 19:03:08 -0400
Message-ID: <ZD3QLMs8/+DLKZM6@nand.local>
In-Reply-To: <xmqqildui0gk.fsf@gitster.g>

On Mon, Apr 17, 2023 at 03:54:35PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > - The same is true for `gc.bigPackThreshold`, if the size of the cruft
> > pack exceeds the limit set by the caller.
>
> This is not as cut-and-dried clear as the previous one. "This pack
> is so large that it is not worth rewriting it only to expunge a
> handful of objects that are no longer reachable from it" is the main
> motivation to use this configuration, but doesn't some part of the
> same reasoning apply equally to a large cruft pack? But let's
> assume that the configuration is totally irrelevant to cruft packs
> and read on.

This is an inherent design trade-off. I imagine that callers who want to
avoid rewriting their (large) cruft packs would prefer to generate a new
cruft pack on top with just the recently accumulated unreachable
objects.

That kind of works, except when you need to prune objects that are
packed in an earlier cruft pack. If `gc.bigPackThreshold` is set, there
is no way to expire objects that live in cruft packs above that
threshold.

A user may find themselves frustrated when an attempt to `git gc --prune`
some sensitive object(s) from their repository doesn't appear to work,
only to discover that `gc.bigPackThreshold` is set somewhere in their
configuration.

Writing (largely) the same cruft pack to expunge a few objects isn't
ideal, but it is better than the status quo. And if you have so many
unreachable objects that this is a concern, it is probably time to prune
anyway.

It is possible that in the future we could support writing multiple
cruft packs (we already handle the presence of multiple cruft packs
fine, we just don't expose an easy way for the user to write >1 of them).
At that point we would be able to relax this patch a bit and allow
`gc.bigPackThreshold` to cover cruft packs, too. But in the meantime,
the benefit of avoiding loose object explosions outweighs the possible
drawbacks here, IMHO.

> > --keep-largest-pack::
> > - All packs except the largest pack and those marked with a
> > - `.keep` files are consolidated into a single pack. When this
> > - option is used, `gc.bigPackThreshold` is ignored.
> > + All packs except the largest pack, any packs marked with a
> > + `.keep` file, and any cruft pack(s) are consolidated into a
> > + single pack. When this option is used, `gc.bigPackThreshold` is
> > + ignored.
>
> "except the largest pack" -> "except the largest, non-cruft pack"
Indeed, good eyes.
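
To make the corrected wording concrete, here is a rough sketch of the
selection rule as the documentation would describe it. This is a toy
model in Python, not the code in builtin/gc.c; the `Pack` type, its
field names, and `packs_to_consolidate()` are all made up for
illustration, and it ignores details such as non-local packs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pack:
    # Hypothetical model of a packfile; not a git data structure.
    name: str
    size: int                # size in bytes
    has_keep: bool = False   # pack is marked with a .keep file
    is_cruft: bool = False   # pack is a cruft pack


def packs_to_consolidate(packs):
    """Return the packs that would be rolled up into a single pack.

    Excluded: every cruft pack, every pack marked with .keep, and the
    largest pack among the non-cruft ones.
    """
    non_cruft = [p for p in packs if not p.is_cruft]
    largest = max(non_cruft, key=lambda p: p.size, default=None)
    return [
        p for p in non_cruft
        if not p.has_keep and p is not largest
    ]


packs = [
    Pack("big", 1000),
    Pack("cruft", 5000, is_cruft=True),  # largest overall, but cruft
    Pack("kept", 10, has_keep=True),
    Pack("small-a", 20),
    Pack("small-b", 30),
]
print([p.name for p in packs_to_consolidate(packs)])
# prints ['small-a', 'small-b']
```

Note that the cruft pack is the largest pack overall in this example,
but it is "big" that gets treated as the largest pack, which is the
distinction the "largest, non-cruft pack" wording is meant to capture.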
Thanks,
Taylor