* [PATCH 0/2] repack: implement `--cruft-max-size`
@ 2023-09-07 21:51 Taylor Blau
From: Taylor Blau @ 2023-09-07 21:51 UTC
  To: git; +Cc: Junio C Hamano, Jeff King, Jonathan Tan

(These patches should be applied on top of a merge with
tb/repack-existing-packs-cleanup, and tb/multi-cruft-pack).

This series attempts to give users some more robust tools for managing
repositories with a large number of unreachable objects by storing them
in separate cruft packs, via a new option `--cruft-max-size`, like so:

    $ git.compile repack -d --cruft --cruft-max-size=10M
    [...]
    Enumerating cruft objects: 617483, done.
    Counting objects: 100% (83791/83791), done.
    Delta compression using up to 20 threads
    Compressing objects: 100% (59696/59696), done.
    Writing objects: 100% (83791/83791), done.
    Total 83791 (delta 19251), reused 82502 (delta 19148), pack-reused 0

    $ ls -la .git/objects/pack/pack-*.mtimes
    -r--r--r-- 1 ttaylorr ttaylorr 179144 Sep  7 17:46 .git/objects/pack/pack-1a95260d26f2897abfd2d54f1d58f535acb81d23.mtimes
    -r--r--r-- 1 ttaylorr ttaylorr    452 Sep  7 17:46 .git/objects/pack/pack-5fde8701ae0f2e5553f1fa33de05faf12f94c07f.mtimes
    -r--r--r-- 1 ttaylorr ttaylorr 155720 Sep  7 17:46 .git/objects/pack/pack-91f9e66921e0ebe1b5e35d34842551468cecdc28.mtimes
    -r--r--r-- 1 ttaylorr ttaylorr     56 Sep  7 17:46 .git/objects/pack/pack-95fe626743207b177b45f32b60fdc313e525ea60.mtimes

The details are explained in the second patch, but the gist is that we
will combine cruft packs up until they reach a certain threshold (as
specified by `--cruft-max-size`) and then begin a new "generation" of
cruft packs. That younger generation will grow up until it reaches the
configured threshold, at which point it will become "frozen" and then
any new unreachable objects will be written into a new generation of
cruft packs.
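
To make the generation idea concrete, here is a rough model of the
selection step. It is only a sketch under assumptions, not the logic in
the second patch: the function name `plan_cruft_repack` is hypothetical,
packs are represented by their sizes in bytes alone, and the policy
(fold the smallest packs together while the running total stays at or
below the threshold, leave everything else untouched) is my reading of
the description above.

```python
def plan_cruft_repack(pack_sizes, max_size):
    """Toy model: split existing cruft packs into (combine, frozen).

    pack_sizes: sizes in bytes of the existing cruft packs.
    max_size:   the threshold, standing in for --cruft-max-size.

    Hypothetical helper for illustration only; not part of Git.
    """
    combine, frozen = [], []
    total = 0
    # Visit the smallest packs first so the youngest generation keeps
    # growing until it would cross the threshold.
    for size in sorted(pack_sizes):
        if total + size <= max_size:
            combine.append(size)
            total += size
        else:
            # Rewriting this pack would push the result past the
            # threshold, so it is left alone ("frozen").
            frozen.append(size)
    return combine, frozen


MiB = 1024 * 1024
combine, frozen = plan_cruft_repack(
    [10 * MiB, 9 * MiB, 3 * MiB, 1 * MiB], 10 * MiB
)
print([s // MiB for s in combine])  # packs rewritten together
print([s // MiB for s in frozen])   # packs left untouched
```

In this model only the 1 MiB and 3 MiB packs are rewritten; the 9 MiB
and 10 MiB packs are not touched, which is where the I/O savings would
come from.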

The goal of this series is to reduce I/O churn in repositories that
have a large number of unreachable objects, rarely prune them, or
both.

Instead of having to rewrite a cruft pack containing every unreachable
object in the repository, we only have to rewrite a cruft pack up until
it reaches the given threshold, at which point it is effectively kept
(i.e., it behaves as if the cruft pack had a ".keep" file tied to it,
provided that the threshold is held constant).

Thanks in advance for your review!

Taylor Blau (2):
  t7700: split cruft-related tests to t7704
  builtin/repack.c: implement support for `--cruft-max-size`

 Documentation/config/gc.txt  |   6 +
 Documentation/git-gc.txt     |   7 +
 Documentation/git-repack.txt |   9 +
 builtin/gc.c                 |   8 +
 builtin/repack.c             | 133 +++++++++++--
 t/t6500-gc.sh                |  27 +++
 t/t7700-repack.sh            | 121 -----------
 t/t7704-repack-cruft.sh      | 375 +++++++++++++++++++++++++++++++++++
 8 files changed, 553 insertions(+), 133 deletions(-)
 create mode 100755 t/t7704-repack-cruft.sh

-- 
2.42.0.138.g7e4e42e1aa


Thread overview: 21+ messages
2023-09-07 21:51 [PATCH 0/2] repack: implement `--cruft-max-size` Taylor Blau
2023-09-07 21:52 ` [PATCH 1/2] t7700: split cruft-related tests to t7704 Taylor Blau
2023-09-08  0:01   ` Eric Sunshine
2023-09-07 21:52 ` [PATCH 2/2] builtin/repack.c: implement support for `--cruft-max-size` Taylor Blau
2023-09-07 23:42   ` Junio C Hamano
2023-09-25 18:01     ` Taylor Blau
2023-09-08 11:21   ` Patrick Steinhardt
2023-10-02 20:30     ` Taylor Blau
2023-10-03  0:44 ` [PATCH v2 0/3] repack: implement `--cruft-max-size` Taylor Blau
2023-10-03  0:44   ` [PATCH v2 1/3] t7700: split cruft-related tests to t7704 Taylor Blau
2023-10-03  0:44   ` [PATCH v2 2/3] builtin/repack.c: parse `--max-pack-size` with OPT_MAGNITUDE Taylor Blau
2023-10-05 11:31     ` Patrick Steinhardt
2023-10-05 17:28       ` Taylor Blau
2023-10-05 20:22         ` Junio C Hamano
2023-10-03  0:44   ` [PATCH v2 3/3] builtin/repack.c: implement support for `--max-cruft-size` Taylor Blau
2023-10-05 12:08     ` Patrick Steinhardt
2023-10-05 17:35       ` Taylor Blau
2023-10-05 20:25       ` Junio C Hamano
2023-10-07 17:20     ` [PATCH] repack: free existing_cruft array after use Jeff King
2023-10-09  1:24       ` Taylor Blau
2023-10-09 17:28         ` Junio C Hamano
