* [PATCH] gc: add `--expire-to` option @ 2024-12-24 11:52 ZheNing Hu via GitGitGadget 2024-12-31 2:18 ` [PATCH v2 0/2] gc: add --expire-to option ZheNing Hu via GitGitGadget 0 siblings, 1 reply; 13+ messages in thread From: ZheNing Hu via GitGitGadget @ 2024-12-24 11:52 UTC (permalink / raw) To: git; +Cc: gitster, me, ZheNing Hu, ZheNing Hu From: ZheNing Hu <adlternative@gmail.com> This commit extends the functionality of `git gc` by adding a new option, `--expire-to=<dir>`. Previously, this feature was implemented in `git repack` (see 91badeb), allowing users to specify a directory where unreachable and expired cruft packs are stored during garbage collection. However, users had to run `git repack --cruft --expire-to=<dir>` followed by `git prune` to achieve similar results within `git gc`. By introducing `--expire-to=<dir>` directly into `git gc`, we simplify the process for users who wish to manage their repository's cleanup more efficiently. This change involves passing the `--expire-to=<dir>` parameter through to `git repack`, making it easier for users to set up a backup location for cruft packs that will be pruned. Signed-off-by: ZheNing Hu <adlternative@gmail.com> --- gc: add --expire-to option I want to perform a "safe" garbage collection for the Git repository on the server, which avoids data corruption issues caused by concurrent pushes during git GC. To achieve this, I currently need to use git repack --cruft --expire-to=<dir> and git prune in combination. However, it would be simpler if we could directly use --expire-to=<dir> with the git-gc command. 
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1843%2Fadlternative%2Fzh%2Fgc-expire-to-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1843/adlternative/zh/gc-expire-to-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/1843 Documentation/git-gc.txt | 6 ++++++ builtin/gc.c | 6 +++++- t/t6500-gc.sh | 6 ++++++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt index 370e22faaeb..b4c0cf02972 100644 --- a/Documentation/git-gc.txt +++ b/Documentation/git-gc.txt @@ -69,6 +69,12 @@ be performed as well. the `--max-cruft-size` option of linkgit:git-repack[1] for more. +--expire-to=<dir>:: + When packing unreachable objects into a cruft pack, write a cruft + pack containing pruned objects (if any) to the directory `<dir>`. + See the `--expire-to` option of linkgit:git-repack[1] for + more. + --prune=<date>:: Prune loose objects older than date (default is 2 weeks ago, overridable by the config variable `gc.pruneExpire`). 
diff --git a/builtin/gc.c b/builtin/gc.c index d52735354c9..77904694c9f 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -136,6 +136,7 @@ struct gc_config { char *prune_worktrees_expire; char *repack_filter; char *repack_filter_to; + char *repack_expire_to; unsigned long big_pack_threshold; unsigned long max_delta_cache_size; }; @@ -441,6 +442,8 @@ static void add_repack_all_option(struct gc_config *cfg, if (cfg->max_cruft_size) strvec_pushf(&repack, "--max-cruft-size=%lu", cfg->max_cruft_size); + if (cfg->repack_expire_to) + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to); } else { strvec_push(&repack, "-A"); if (cfg->prune_expire) @@ -675,7 +678,6 @@ struct repository *repo UNUSED) const char *prune_expire_sentinel = "sentinel"; const char *prune_expire_arg = prune_expire_sentinel; int ret; - struct option builtin_gc_options[] = { OPT__QUIET(&quiet, N_("suppress progress reporting")), { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"), @@ -694,6 +696,8 @@ struct repository *repo UNUSED) PARSE_OPT_NOCOMPLETE), OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack, N_("repack all other packs except the largest pack")), + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"), + N_("pack prefix to store a pack containing pruned objects")), OPT_END() }; diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index ee074b99b70..d4b0653a9b7 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' ' test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt ' +test_expect_success '--expire-to sets appropriate repack options' ' + mkdir expired && + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack && + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt +' + run_and_wait_for_gc () { # We read stdout from gc for the side effect of waiting until the # background gc process exits, closing its fd 9. 
Furthermore, the base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f -- gitgitgadget ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 0/2] gc: add --expire-to option
  2024-12-24 11:52 [PATCH] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
@ 2024-12-31 2:18 ` ZheNing Hu via GitGitGadget
  2024-12-31 2:18   ` [PATCH v2 1/2] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
                    ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2024-12-31 2:18 UTC (permalink / raw)
  To: git; +Cc: gitster, me, ZheNing Hu

I want to perform a "safe" garbage collection for the Git repository on
the server, which avoids data corruption issues caused by concurrent
pushes during git GC. To achieve this, I currently need to use
git repack --cruft --expire-to=<dir> and git prune in combination.
However, it would be simpler if we could directly use --expire-to=<dir>
with the git-gc command.

ZheNing Hu (2):
  gc: add `--expire-to` option
  fix(gc): make --prune=now compatible with --expire-to

 Documentation/git-gc.txt | 6 ++++++
 builtin/gc.c             | 9 +++++++--
 t/t6500-gc.sh            | 6 ++++++
 3 files changed, 19 insertions(+), 2 deletions(-)

base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1843%2Fadlternative%2Fzh%2Fgc-expire-to-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1843/adlternative/zh/gc-expire-to-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1843

Range-diff vs v1:

 1:  14e94bf04e5 = 1:  14e94bf04e5 gc: add `--expire-to` option
 -:  ----------- > 2:  579757957d2 fix(gc): make --prune=now compatible with --expire-to

--
gitgitgadget

^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 1/2] gc: add `--expire-to` option 2024-12-31 2:18 ` [PATCH v2 0/2] gc: add --expire-to option ZheNing Hu via GitGitGadget @ 2024-12-31 2:18 ` ZheNing Hu via GitGitGadget 2025-01-03 4:57 ` ZheNing Hu 2025-01-13 7:12 ` ZheNing Hu 2024-12-31 2:18 ` [PATCH v2 2/2] fix(gc): make --prune=now compatible with --expire-to ZheNing Hu via GitGitGadget 2025-01-16 2:35 ` [PATCH v3] gc: add `--expire-to` option ZheNing Hu via GitGitGadget 2 siblings, 2 replies; 13+ messages in thread From: ZheNing Hu via GitGitGadget @ 2024-12-31 2:18 UTC (permalink / raw) To: git; +Cc: gitster, me, ZheNing Hu, ZheNing Hu From: ZheNing Hu <adlternative@gmail.com> This commit extends the functionality of `git gc` by adding a new option, `--expire-to=<dir>`. Previously, this feature was implemented in `git repack` (see 91badeb), allowing users to specify a directory where unreachable and expired cruft packs are stored during garbage collection. However, users had to run `git repack --cruft --expire-to=<dir>` followed by `git prune` to achieve similar results within `git gc`. By introducing `--expire-to=<dir>` directly into `git gc`, we simplify the process for users who wish to manage their repository's cleanup more efficiently. This change involves passing the `--expire-to=<dir>` parameter through to `git repack`, making it easier for users to set up a backup location for cruft packs that will be pruned. Signed-off-by: ZheNing Hu <adlternative@gmail.com> --- Documentation/git-gc.txt | 6 ++++++ builtin/gc.c | 6 +++++- t/t6500-gc.sh | 6 ++++++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt index 370e22faaeb..b4c0cf02972 100644 --- a/Documentation/git-gc.txt +++ b/Documentation/git-gc.txt @@ -69,6 +69,12 @@ be performed as well. the `--max-cruft-size` option of linkgit:git-repack[1] for more. 
+--expire-to=<dir>:: + When packing unreachable objects into a cruft pack, write a cruft + pack containing pruned objects (if any) to the directory `<dir>`. + See the `--expire-to` option of linkgit:git-repack[1] for + more. + --prune=<date>:: Prune loose objects older than date (default is 2 weeks ago, overridable by the config variable `gc.pruneExpire`). diff --git a/builtin/gc.c b/builtin/gc.c index d52735354c9..77904694c9f 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -136,6 +136,7 @@ struct gc_config { char *prune_worktrees_expire; char *repack_filter; char *repack_filter_to; + char *repack_expire_to; unsigned long big_pack_threshold; unsigned long max_delta_cache_size; }; @@ -441,6 +442,8 @@ static void add_repack_all_option(struct gc_config *cfg, if (cfg->max_cruft_size) strvec_pushf(&repack, "--max-cruft-size=%lu", cfg->max_cruft_size); + if (cfg->repack_expire_to) + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to); } else { strvec_push(&repack, "-A"); if (cfg->prune_expire) @@ -675,7 +678,6 @@ struct repository *repo UNUSED) const char *prune_expire_sentinel = "sentinel"; const char *prune_expire_arg = prune_expire_sentinel; int ret; - struct option builtin_gc_options[] = { OPT__QUIET(&quiet, N_("suppress progress reporting")), { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"), @@ -694,6 +696,8 @@ struct repository *repo UNUSED) PARSE_OPT_NOCOMPLETE), OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack, N_("repack all other packs except the largest pack")), + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"), + N_("pack prefix to store a pack containing pruned objects")), OPT_END() }; diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index ee074b99b70..d4b0653a9b7 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' ' test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt ' +test_expect_success '--expire-to sets appropriate 
repack options' ' + mkdir expired && + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack && + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt +' + run_and_wait_for_gc () { # We read stdout from gc for the side effect of waiting until the # background gc process exits, closing its fd 9. Furthermore, the -- gitgitgadget ^ permalink raw reply related [flat|nested] 13+ messages in thread
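As an illustration only, the branch structure that this patch touches in `add_repack_all_option()` can be modeled outside of Git. The sketch below is not the code in `builtin/gc.c`: `struct gc_opts`, `append_opt()` and `repack_all_options()` are hypothetical names, a flat string stands in for the `strvec`, and options that do not appear in the diff context are left out. It shows that `--expire-to` is forwarded only from the `--cruft` branch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the gc_config fields used by this patch. */
struct gc_opts {
	const char *prune_expire;      /* e.g. "now" or "2.weeks.ago"; NULL if unset */
	int cruft_packs;               /* non-zero when cruft packs are enabled */
	unsigned long max_cruft_size;  /* 0 if unset */
	const char *repack_expire_to;  /* value of --expire-to; NULL if unset */
};

/* Append one formatted option to buf, separated from the previous one by a space. */
void append_opt(char *buf, size_t len, const char *fmt, ...)
{
	size_t used = strlen(buf);
	va_list ap;

	if (used + 1 >= len)
		return; /* no room left; this sketch silently stops appending */
	if (used)
		buf[used++] = ' ';
	va_start(ap, fmt);
	vsnprintf(buf + used, len - used, fmt, ap);
	va_end(ap);
}

/* Model of the option selection after patch 1/2 (before the --prune=now fix). */
void repack_all_options(const struct gc_opts *cfg, char *buf, size_t len)
{
	assert(cfg && buf && len > 0);
	buf[0] = '\0';

	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")) {
		append_opt(buf, len, "-a");
	} else if (cfg->cruft_packs) {
		append_opt(buf, len, "--cruft");
		if (cfg->max_cruft_size)
			append_opt(buf, len, "--max-cruft-size=%lu", cfg->max_cruft_size);
		if (cfg->repack_expire_to)
			append_opt(buf, len, "--expire-to=%s", cfg->repack_expire_to);
	} else {
		append_opt(buf, len, "-A");
	}
}
```

With `prune_expire` set to "now", this model yields only `-a` even when an `--expire-to` value is set, which is the interaction the second patch in this series addresses.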
* Re: [PATCH v2 1/2] gc: add `--expire-to` option 2024-12-31 2:18 ` [PATCH v2 1/2] gc: add `--expire-to` option ZheNing Hu via GitGitGadget @ 2025-01-03 4:57 ` ZheNing Hu 2025-01-13 7:12 ` ZheNing Hu 1 sibling, 0 replies; 13+ messages in thread From: ZheNing Hu @ 2025-01-03 4:57 UTC (permalink / raw) To: Jeff King; +Cc: git, gitster, me ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com> wrote on Tue, Dec 31, 2024 at 10:18: > > From: ZheNing Hu <adlternative@gmail.com> > > This commit extends the functionality of `git gc` > by adding a new option, `--expire-to=<dir>`. Previously, > this feature was implemented in `git repack` (see 91badeb), > allowing users to specify a directory where unreachable and > expired cruft packs are stored during garbage collection. > However, users had to run `git repack --cruft --expire-to=<dir>` > followed by `git prune` to achieve similar results within `git gc`. > > By introducing `--expire-to=<dir>` directly into `git gc`, > we simplify the process for users who wish to manage their > repository's cleanup more efficiently. This change involves > passing the `--expire-to=<dir>` parameter through to `git repack`, > making it easier for users to set up a backup location for cruft > packs that will be pruned. > > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > --- > Documentation/git-gc.txt | 6 ++++++ > builtin/gc.c | 6 +++++- > t/t6500-gc.sh | 6 ++++++ > 3 files changed, 17 insertions(+), 1 deletion(-) > > diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt > index 370e22faaeb..b4c0cf02972 100644 > --- a/Documentation/git-gc.txt > +++ b/Documentation/git-gc.txt > @@ -69,6 +69,12 @@ be performed as well. > the `--max-cruft-size` option of linkgit:git-repack[1] for > more. > > +--expire-to=<dir>:: > + When packing unreachable objects into a cruft pack, write a cruft > + pack containing pruned objects (if any) to the directory `<dir>`. > + See the `--expire-to` option of linkgit:git-repack[1] for > + more.
> + > --prune=<date>:: > Prune loose objects older than date (default is 2 weeks ago, > overridable by the config variable `gc.pruneExpire`). > diff --git a/builtin/gc.c b/builtin/gc.c > index d52735354c9..77904694c9f 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -136,6 +136,7 @@ struct gc_config { > char *prune_worktrees_expire; > char *repack_filter; > char *repack_filter_to; > + char *repack_expire_to; > unsigned long big_pack_threshold; > unsigned long max_delta_cache_size; > }; > @@ -441,6 +442,8 @@ static void add_repack_all_option(struct gc_config *cfg, > if (cfg->max_cruft_size) > strvec_pushf(&repack, "--max-cruft-size=%lu", > cfg->max_cruft_size); > + if (cfg->repack_expire_to) > + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to); > } else { > strvec_push(&repack, "-A"); > if (cfg->prune_expire) > @@ -675,7 +678,6 @@ struct repository *repo UNUSED) > const char *prune_expire_sentinel = "sentinel"; > const char *prune_expire_arg = prune_expire_sentinel; > int ret; > - > struct option builtin_gc_options[] = { > OPT__QUIET(&quiet, N_("suppress progress reporting")), > { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"), > @@ -694,6 +696,8 @@ struct repository *repo UNUSED) > PARSE_OPT_NOCOMPLETE), > OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack, > N_("repack all other packs except the largest pack")), > + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"), > + N_("pack prefix to store a pack containing pruned objects")), > OPT_END() > }; > > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index ee074b99b70..d4b0653a9b7 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' ' > test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt > ' > > +test_expect_success '--expire-to sets appropriate repack options' ' > + mkdir expired && > + GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft 
--expire-to=./expired/pack && > + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt > +' > + > run_and_wait_for_gc () { > # We read stdout from gc for the side effect of waiting until the > # background gc process exits, closing its fd 9. Furthermore, the > -- > gitgitgadget > Hi, Jeff King, could you come and help take a look at this patch? I would be very grateful if you have time! ZheNing Hu ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/2] gc: add `--expire-to` option 2024-12-31 2:18 ` [PATCH v2 1/2] gc: add `--expire-to` option ZheNing Hu via GitGitGadget 2025-01-03 4:57 ` ZheNing Hu @ 2025-01-13 7:12 ` ZheNing Hu 1 sibling, 0 replies; 13+ messages in thread From: ZheNing Hu @ 2025-01-13 7:12 UTC (permalink / raw) To: Git List; +Cc: gitster, me, Jeff King This patch has been sitting for weeks with no review. Does anyone want to help take a look? ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com> wrote on Tue, Dec 31, 2024 at 10:18: > > From: ZheNing Hu <adlternative@gmail.com> > > This commit extends the functionality of `git gc` > by adding a new option, `--expire-to=<dir>`. Previously, > this feature was implemented in `git repack` (see 91badeb), > allowing users to specify a directory where unreachable and > expired cruft packs are stored during garbage collection. > However, users had to run `git repack --cruft --expire-to=<dir>` > followed by `git prune` to achieve similar results within `git gc`. > > By introducing `--expire-to=<dir>` directly into `git gc`, > we simplify the process for users who wish to manage their > repository's cleanup more efficiently. This change involves > passing the `--expire-to=<dir>` parameter through to `git repack`, > making it easier for users to set up a backup location for cruft > packs that will be pruned. > > Signed-off-by: ZheNing Hu <adlternative@gmail.com> > --- > Documentation/git-gc.txt | 6 ++++++ > builtin/gc.c | 6 +++++- > t/t6500-gc.sh | 6 ++++++ > 3 files changed, 17 insertions(+), 1 deletion(-) > > diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt > index 370e22faaeb..b4c0cf02972 100644 > --- a/Documentation/git-gc.txt > +++ b/Documentation/git-gc.txt > @@ -69,6 +69,12 @@ be performed as well. > the `--max-cruft-size` option of linkgit:git-repack[1] for > more. > > +--expire-to=<dir>:: > + When packing unreachable objects into a cruft pack, write a cruft > + pack containing pruned objects (if any) to the directory `<dir>`.
> + See the `--expire-to` option of linkgit:git-repack[1] for > + more. > + > --prune=<date>:: > Prune loose objects older than date (default is 2 weeks ago, > overridable by the config variable `gc.pruneExpire`). > diff --git a/builtin/gc.c b/builtin/gc.c > index d52735354c9..77904694c9f 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -136,6 +136,7 @@ struct gc_config { > char *prune_worktrees_expire; > char *repack_filter; > char *repack_filter_to; > + char *repack_expire_to; > unsigned long big_pack_threshold; > unsigned long max_delta_cache_size; > }; > @@ -441,6 +442,8 @@ static void add_repack_all_option(struct gc_config *cfg, > if (cfg->max_cruft_size) > strvec_pushf(&repack, "--max-cruft-size=%lu", > cfg->max_cruft_size); > + if (cfg->repack_expire_to) > + strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to); > } else { > strvec_push(&repack, "-A"); > if (cfg->prune_expire) > @@ -675,7 +678,6 @@ struct repository *repo UNUSED) > const char *prune_expire_sentinel = "sentinel"; > const char *prune_expire_arg = prune_expire_sentinel; > int ret; > - > struct option builtin_gc_options[] = { > OPT__QUIET(&quiet, N_("suppress progress reporting")), > { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"), > @@ -694,6 +696,8 @@ struct repository *repo UNUSED) > PARSE_OPT_NOCOMPLETE), > OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack, > N_("repack all other packs except the largest pack")), > + OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"), > + N_("pack prefix to store a pack containing pruned objects")), > OPT_END() > }; > > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh > index ee074b99b70..d4b0653a9b7 100755 > --- a/t/t6500-gc.sh > +++ b/t/t6500-gc.sh > @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' ' > test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt > ' > > +test_expect_success '--expire-to sets appropriate repack options' ' > + mkdir expired && > + 
GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack && > + test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt > +' > + > run_and_wait_for_gc () { > # We read stdout from gc for the side effect of waiting until the > # background gc process exits, closing its fd 9. Furthermore, the > -- > gitgitgadget > ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 2/2] fix(gc): make --prune=now compatible with --expire-to
  2024-12-31 2:18 ` [PATCH v2 0/2] gc: add --expire-to option ZheNing Hu via GitGitGadget
  2024-12-31 2:18   ` [PATCH v2 1/2] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
@ 2024-12-31 2:18   ` ZheNing Hu via GitGitGadget
  2025-01-13 9:17     ` Jeff King
  2025-01-16 2:35   ` [PATCH v3] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 13+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2024-12-31 2:18 UTC (permalink / raw)
  To: git; +Cc: gitster, me, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

The original `git gc --prune=now` attempted to delete
all unreachable objects. However, after the introduction
of `--cruft` and `--expire-to=<dir>` in git gc, `--prune=now`
can now compress unreachable objects into a cruft pack and
store them in the specified <dir> instead of deleting them
directly. This is beneficial for recovery in case of data
corruption during repository GC.

Therefore, update the handling logic of `--prune=now` in gc
so that `-a` parameter is only passed to the repack command
when neither `--cruft` nor `--expire-to` are used.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/gc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index 77904694c9f..8656e1caff0 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
 static void add_repack_all_option(struct gc_config *cfg,
 				  struct string_list *keep_pack)
 {
-	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
+	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
+		&& !(cfg->cruft_packs && cfg->repack_expire_to))
 		strvec_push(&repack, "-a");
 	else if (cfg->cruft_packs) {
 		strvec_push(&repack, "--cruft");
--
gitgitgadget

^ permalink raw reply related [flat|nested] 13+ messages in thread
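The condition introduced by this patch can be read as a small predicate. The following is a sketch, not the code in `builtin/gc.c`: `gc_passes_dash_a()` and `check_dash_a_cases()` are hypothetical names, and the three parameters stand in for `cfg->prune_expire`, `cfg->cruft_packs` and `cfg->repack_expire_to`. It shows that, under the condition as written in the diff, `-a` is skipped only when `--prune=now` is combined with both cruft packs and `--expire-to`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 when the patched condition would push "-a" to repack,
 * 0 when control falls through to the --cruft / -A branches.
 */
int gc_passes_dash_a(const char *prune_expire, int cruft_packs,
		     const char *repack_expire_to)
{
	return prune_expire && !strcmp(prune_expire, "now") &&
	       !(cruft_packs && repack_expire_to);
}

/* Walk through the combinations that matter for --prune=now. */
void check_dash_a_cases(void)
{
	/* --prune=now --cruft --expire-to=<dir>: "-a" is no longer passed */
	assert(gc_passes_dash_a("now", 1, "./expired/pack") == 0);
	/* --prune=now --cruft without --expire-to: unchanged, "-a" is passed */
	assert(gc_passes_dash_a("now", 1, NULL) == 1);
	/* --prune=now --no-cruft --expire-to=<dir>: "-a" is still passed */
	assert(gc_passes_dash_a("now", 0, "./expired/pack") == 1);
	/* any other expiry value never selected "-a" in the first place */
	assert(gc_passes_dash_a("2.weeks.ago", 1, "./expired/pack") == 0);
}
```

Because the two settings are AND-ed inside the negation, either one on its own leaves the previous `--prune=now` behavior untouched.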
* Re: [PATCH v2 2/2] fix(gc): make --prune=now compatible with --expire-to
  2024-12-31 2:18 ` [PATCH v2 2/2] fix(gc): make --prune=now compatible with --expire-to ZheNing Hu via GitGitGadget
@ 2025-01-13 9:17   ` Jeff King
  2025-01-15 7:56     ` ZheNing Hu
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2025-01-13 9:17 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget; +Cc: git, gitster, me, ZheNing Hu

On Tue, Dec 31, 2024 at 02:18:33AM +0000, ZheNing Hu via GitGitGadget wrote:

> diff --git a/builtin/gc.c b/builtin/gc.c
> index 77904694c9f..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
>  static void add_repack_all_option(struct gc_config *cfg,
>  				  struct string_list *keep_pack)
>  {
> -	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> +	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> +		&& !(cfg->cruft_packs && cfg->repack_expire_to))
>  		strvec_push(&repack, "-a");

I expected to see a mention of repack_expire_to here, but not
cfg->cruft_packs. These two are AND-ed together so we are only disabling
"repack -a" when both options ("--expire-to" and "--cruft") are passed.
Can we --expire-to without cruft? I.e., what should happen with:

  git gc --expire-to=some-path --prune=now --no-cruft

Looking at the underlying git-repack, it seems that we only respect
--expire-to at all when used with "--cruft", and don't otherwise
consider it. Which is what the manpage says ("Only useful with --cruft
-d").

But if we look at this proposed patch for example:

  https://lore.kernel.org/git/48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com/

then it is expanding how --expire-to is used during the pruning step.
OTOH, I think the way your patch 1 is structured means that we'd always
pass --expire-to to git-repack anyway, and I _think_ even with the patch
linked above that "repack -a -d --expire-to=whatever" would do the right
thing.

In which case the problem really is the combination of cruft packs and
expire-to. Just cruft packs by themselves do not need to override using
"-a" for "--prune=now" because we know that any such cruft pack would be
empty.

So I think this logic is correct. Taylor might have more thoughts,
though (and ideas on whether he intends to revisit that earlier patch).

I do think this change should probably be done as part of patch 1,
rather than introducing a buggy state and then fixing it in patch 2.

-Peff

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] fix(gc): make --prune=now compatible with --expire-to
  2025-01-13 9:17 ` Jeff King
@ 2025-01-15 7:56   ` ZheNing Hu
  0 siblings, 0 replies; 13+ messages in thread
From: ZheNing Hu @ 2025-01-15 7:56 UTC (permalink / raw)
  To: Jeff King; +Cc: ZheNing Hu via GitGitGadget, git, gitster, me

Jeff King <peff@peff.net> wrote on Mon, Jan 13, 2025 at 17:17:
>
> On Tue, Dec 31, 2024 at 02:18:33AM +0000, ZheNing Hu via GitGitGadget wrote:
>
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index 77904694c9f..8656e1caff0 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> >  static void add_repack_all_option(struct gc_config *cfg,
> >  				  struct string_list *keep_pack)
> >  {
> > -	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> > +	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> > +		&& !(cfg->cruft_packs && cfg->repack_expire_to))
> >  		strvec_push(&repack, "-a");
>
> I expected to see a mention of repack_expire_to here, but not
> cfg->cruft_packs. These two are AND-ed together so we are only disabling
> "repack -a" when both options ("--expire-to" and "--cruft") are passed.
> Can we --expire-to without cruft? I.e., what should happen with:
>
>   git gc --expire-to=some-path --prune=now --no-cruft
>
> Looking at the underlying git-repack, it seems that we only respect
> --expire-to at all when used with "--cruft", and don't otherwise
> consider it. Which is what the manpage says ("Only useful with --cruft
> -d").
>

Yes, this is the current state of git-repack. The --expire-to option
can only be used with --cruft, which is why I use
cruft_packs && repack_expire_to as a double safeguard.

When using --no-cruft, the --expire-to option becomes irrelevant, so it
seems reasonable to leave `git gc --prune=now` as is in that case and
keep passing -a to repack.

> But if we look at this proposed patch for example:
>
>   https://lore.kernel.org/git/48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com/
>
> then it is expanding how --expire-to is used during the pruning step.
> OTOH, I think the way your patch 1 is structured means that we'd always
> pass --expire-to to git-repack anyway, and I _think_ even with the patch
> linked above that "repack -a -d --expire-to=whatever" would do the right
> thing.
>

I've taken a look at the patch, and I believe Taylor's changes are
primarily aimed at extending the --expire-to functionality within the
--cruft feature, rather than expecting --expire-to to be used on its own.

> In which case the problem really is the combination of cruft packs and
> expire-to. Just cruft packs by themselves do not need to override using
> "-a" for "--prune=now" because we know that any such cruft pack would be
> empty.
>
> So I think this logic is correct. Taylor might have more thoughts,
> though (and ideas on whether he intends to revisit that earlier patch).
>
> I do think this change should probably be done as part of patch 1,
> rather than introducing a buggy state and then fixing it in patch 2.
>

Yes, I agree with that, and perhaps a single patch will suffice.

> -Peff

- ZheNing Hu

^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v3] gc: add `--expire-to` option 2024-12-31 2:18 ` [PATCH v2 0/2] gc: add --expire-to option ZheNing Hu via GitGitGadget 2024-12-31 2:18 ` [PATCH v2 1/2] gc: add `--expire-to` option ZheNing Hu via GitGitGadget 2024-12-31 2:18 ` [PATCH v2 2/2] fix(gc): make --prune=now compatible with --expire-to ZheNing Hu via GitGitGadget @ 2025-01-16 2:35 ` ZheNing Hu via GitGitGadget 2025-01-16 18:23 ` Junio C Hamano 2025-01-24 7:49 ` [PATCH v4] " ZheNing Hu via GitGitGadget 2 siblings, 2 replies; 13+ messages in thread From: ZheNing Hu via GitGitGadget @ 2025-01-16 2:35 UTC (permalink / raw) To: git; +Cc: gitster, me, peff, ZheNing Hu, ZheNing Hu From: ZheNing Hu <adlternative@gmail.com> This commit extends the functionality of `git gc` by adding a new option, `--expire-to=<dir>`. Previously, this feature was implemented in `git repack` (see 91badeb), allowing users to specify a directory where unreachable and expired cruft packs are stored during garbage collection. However, users had to run `git repack --cruft --expire-to=<dir>` followed by `git prune` to achieve similar results within `git gc`. By introducing `--expire-to=<dir>` directly into `git gc`, we simplify the process for users who wish to manage their repository's cleanup more efficiently. This change involves passing the `--expire-to=<dir>` parameter through to `git repack`, making it easier for users to set up a backup location for cruft packs that will be pruned. Note: When git-gc is used with both `--cruft` and `--expire-to`, it does not pass `-a` to git-repack to delete all unreachable objects as `git gc --prune=now` originally did. Instead, it generates a cruft pack in the directory specified by expire-to. Signed-off-by: ZheNing Hu <adlternative@gmail.com> --- gc: add --expire-to option I want to perform a "safe" garbage collection for the Git repository on the server, which avoids data corruption issues caused by concurrent pushes during git GC. 
To achieve this, I currently need to use git repack --cruft --expire-to=<dir> and git prune in combination. However, it would be simpler if we could directly use --expire-to=<dir> with the git-gc command.

v1: add --expire-to option to gc
v1 -> v2: fix git gc --prune=now with --expire-to
v2 -> v3: squash the two patches into one

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1843%2Fadlternative%2Fzh%2Fgc-expire-to-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1843/adlternative/zh/gc-expire-to-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1843

Range-diff vs v2:

 1:  14e94bf04e5 ! 1:  0842ec34948 gc: add `--expire-to` option
     @@ Commit message
          making it easier for users to set up a backup location for cruft
          packs that will be pruned.

     +    Note: When git-gc is used with both `--cruft` and `--expire-to`,
     +    it does not pass `-a` to git-repack to delete all unreachable
     +    objects as `git gc --prune=now` originally did. Instead, it
     +    generates a cruft pack in the directory specified by expire-to.
    +    Signed-off-by: ZheNing Hu <adlternative@gmail.com>

      ## Documentation/git-gc.txt ##
     @@ builtin/gc.c: struct gc_config {
      	unsigned long big_pack_threshold;
      	unsigned long max_delta_cache_size;
      };
    +@@ builtin/gc.c: static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
    + static void add_repack_all_option(struct gc_config *cfg,
    + 				  struct string_list *keep_pack)
    + {
    +-	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
    ++	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
    ++		&& !(cfg->cruft_packs && cfg->repack_expire_to))
    + 		strvec_push(&repack, "-a");
    + 	else if (cfg->cruft_packs) {
    + 		strvec_push(&repack, "--cruft");
     @@ builtin/gc.c: static void add_repack_all_option(struct gc_config *cfg,
      		if (cfg->max_cruft_size)
      			strvec_pushf(&repack, "--max-cruft-size=%lu",
 2:  579757957d2 < -:  ----------- fix(gc): make --prune=now compatible with --expire-to

 Documentation/git-gc.txt | 6 ++++++
 builtin/gc.c             | 9 +++++++--
 t/t6500-gc.sh            | 6 ++++++
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 370e22faaeb..b4c0cf02972 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -69,6 +69,12 @@ be performed as well.
 	the `--max-cruft-size` option of linkgit:git-repack[1] for
 	more.
 
+--expire-to=<dir>::
+	When packing unreachable objects into a cruft pack, write a cruft
+	pack containing pruned objects (if any) to the directory `<dir>`.
+	See the `--expire-to` option of linkgit:git-repack[1] for
+	more.
+
 --prune=<date>::
 	Prune loose objects older than date (default is 2 weeks ago,
 	overridable by the config variable `gc.pruneExpire`).
diff --git a/builtin/gc.c b/builtin/gc.c
index d52735354c9..8656e1caff0 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -136,6 +136,7 @@ struct gc_config {
 	char *prune_worktrees_expire;
 	char *repack_filter;
 	char *repack_filter_to;
+	char *repack_expire_to;
 	unsigned long big_pack_threshold;
 	unsigned long max_delta_cache_size;
 };
@@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
 static void add_repack_all_option(struct gc_config *cfg,
 				  struct string_list *keep_pack)
 {
-	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
+	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
+		&& !(cfg->cruft_packs && cfg->repack_expire_to))
 		strvec_push(&repack, "-a");
 	else if (cfg->cruft_packs) {
 		strvec_push(&repack, "--cruft");
@@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
 		if (cfg->max_cruft_size)
 			strvec_pushf(&repack, "--max-cruft-size=%lu",
 				     cfg->max_cruft_size);
+		if (cfg->repack_expire_to)
+			strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
 	} else {
 		strvec_push(&repack, "-A");
 		if (cfg->prune_expire)
@@ -675,7 +679,6 @@ struct repository *repo UNUSED)
 	const char *prune_expire_sentinel = "sentinel";
 	const char *prune_expire_arg = prune_expire_sentinel;
 	int ret;
-
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
 		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
@@ -694,6 +697,8 @@ struct repository *repo UNUSED)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
 			 N_("repack all other packs except the largest pack")),
+		OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
+			   N_("pack prefix to store a pack containing pruned objects")),
 		OPT_END()
 	};
 
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index ee074b99b70..d4b0653a9b7 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
 	test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
 '
 
+test_expect_success '--expire-to sets appropriate repack options' '
+	mkdir expired &&
+	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
+	test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
+'
+
 run_and_wait_for_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the

base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f
-- 
gitgitgadget

* Re: [PATCH v3] gc: add `--expire-to` option
  2025-01-16  2:35 ` [PATCH v3] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
@ 2025-01-16 18:23   ` Junio C Hamano
  2025-01-23  3:42     ` ZheNing Hu
  2025-01-24  7:49   ` [PATCH v4] " ZheNing Hu via GitGitGadget
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2025-01-16 18:23 UTC
To: ZheNing Hu via GitGitGadget; +Cc: git, me, peff, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> This commit extends the functionality of `git gc`
> by adding a new option, `--expire-to=<dir>`. Previously,
> this feature was implemented in `git repack` (see 91badeb),
> allowing users to specify a directory where unreachable and
> expired cruft packs are stored during garbage collection.
> However, users had to run `git repack --cruft --expire-to=<dir>`
> followed by `git prune` to achieve similar results within `git gc`.
>
> By introducing `--expire-to=<dir>` directly into `git gc`,
> we simplify the process for users who wish to manage their
> repository's cleanup more efficiently. This change involves
> passing the `--expire-to=<dir>` parameter through to `git repack`,
> making it easier for users to set up a backup location for cruft
> packs that will be pruned.

Today I do not have enough time to do my usual commit log message
critique.  Please use "git show -s --format=reference" when
referring to an earlier commit.

> Note: When git-gc is used with both `--cruft` and `--expire-to`,
> it does not pass `-a` to git-repack to delete all unreachable
> objects as `git gc --prune=now` originally did. Instead, it
> generates a cruft pack in the directory specified by expire-to.

Is this less important than "we added --expire-to to gc that is
passed down to underlying repack" in the previous paragraph?

Not removing the unreachables too early with "repack -a" is an
essential part of the design of this new feature to allow us not to
lose the cruft objects, so I was a bit surprised that this was
described as a "Note:".

> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 370e22faaeb..b4c0cf02972 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -69,6 +69,12 @@ be performed as well.
>  	the `--max-cruft-size` option of linkgit:git-repack[1] for
>  	more.
>
> +--expire-to=<dir>::
> +	When packing unreachable objects into a cruft pack, write a cruft
> +	pack containing pruned objects (if any) to the directory `<dir>`.
> +	See the `--expire-to` option of linkgit:git-repack[1] for
> +	more.

Does "When packing unreachable objects into a cruft pack" mean that
this option is only meaningful with "--cruft"?  As "--cruft" is on
by default, is it an error to pass "--no-cruft" when you use this
option?

"for more" -> "for more information" or something?

> diff --git a/builtin/gc.c b/builtin/gc.c
> index d52735354c9..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -136,6 +136,7 @@ struct gc_config {
>  	char *prune_worktrees_expire;
>  	char *repack_filter;
>  	char *repack_filter_to;
> +	char *repack_expire_to;
>  	unsigned long big_pack_threshold;
>  	unsigned long max_delta_cache_size;
>  };
> @@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
>  static void add_repack_all_option(struct gc_config *cfg,
>  				  struct string_list *keep_pack)
>  {
> -	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> +	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> +		&& !(cfg->cruft_packs && cfg->repack_expire_to))
>  		strvec_push(&repack, "-a");

Hmph.  When "--expire-to=<there>" is given, we are dropping these
unreachable objects right away, but we said "--no-cruft", then we
say "repack -a".  If we have both "--cruft" and "--expire-to=<there>",
then ...

>  	else if (cfg->cruft_packs) {
>  		strvec_push(&repack, "--cruft");
> @@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
>  		if (cfg->max_cruft_size)
>  			strvec_pushf(&repack, "--max-cruft-size=%lu",
>  				     cfg->max_cruft_size);
> +		if (cfg->repack_expire_to)
> +			strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);

... we do the usual "repack --cruft --expire-to=<there>" in the next
block.

> @@ -675,7 +679,6 @@ struct repository *repo UNUSED)
>  	const char *prune_expire_sentinel = "sentinel";
>  	const char *prune_expire_arg = prune_expire_sentinel;
>  	int ret;
> -
>  	struct option builtin_gc_options[] = {
>  		OPT__QUIET(&quiet, N_("suppress progress reporting")),
>  		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),

OK.

> @@ -694,6 +697,8 @@ struct repository *repo UNUSED)
>  			   PARSE_OPT_NOCOMPLETE),
>  		OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
>  			 N_("repack all other packs except the largest pack")),
> +		OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> +			   N_("pack prefix to store a pack containing pruned objects")),
>  		OPT_END()
>  	};

OK.

> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index ee074b99b70..d4b0653a9b7 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
>  	test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
>  '
>
> +test_expect_success '--expire-to sets appropriate repack options' '
> +	mkdir expired &&
> +	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> +	test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> +'

As "--cruft" is on by default, the command line does not have to
have it, but being explicit is good.

Should we also see what happens when "--no-cruft" is given?

Thanks.

* Re: [PATCH v3] gc: add `--expire-to` option
  2025-01-16 18:23 ` Junio C Hamano
@ 2025-01-23  3:42   ` ZheNing Hu
  0 siblings, 0 replies; 13+ messages in thread
From: ZheNing Hu @ 2025-01-23  3:42 UTC
To: Junio C Hamano; +Cc: ZheNing Hu via GitGitGadget, git, me, peff

On Fri, Jan 17, 2025 at 02:23, Junio C Hamano <gitster@pobox.com> wrote:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > This commit extends the functionality of `git gc`
> > by adding a new option, `--expire-to=<dir>`. Previously,
> > this feature was implemented in `git repack` (see 91badeb),
> > allowing users to specify a directory where unreachable and
> > expired cruft packs are stored during garbage collection.
> > However, users had to run `git repack --cruft --expire-to=<dir>`
> > followed by `git prune` to achieve similar results within `git gc`.
> >
> > By introducing `--expire-to=<dir>` directly into `git gc`,
> > we simplify the process for users who wish to manage their
> > repository's cleanup more efficiently. This change involves
> > passing the `--expire-to=<dir>` parameter through to `git repack`,
> > making it easier for users to set up a backup location for cruft
> > packs that will be pruned.
>
> Today I do not have enough time to do my usual commit log message
> critique.  Please use "git show -s --format=reference" when
> referring to an earlier commit.
>

Okay, I will change to using this format.

> > Note: When git-gc is used with both `--cruft` and `--expire-to`,
> > it does not pass `-a` to git-repack to delete all unreachable
> > objects as `git gc --prune=now` originally did. Instead, it
> > generates a cruft pack in the directory specified by expire-to.
>
> Is this less important than "we added --expire-to to gc that is
> passed down to underlying repack" in the previous paragraph?
>

I had thought that adding --expire-to to gc was key in this patch,
but the change to the implementation of --prune=now should indeed
be mentioned more.

> Not removing the unreachables too early with "repack -a" is an
> essential part of the design of this new feature to allow us not to
> lose the cruft objects, so I was a bit surprised that this was
> described as a "Note:".
>

You're right. This section shouldn't use a note; it should provide
a more detailed explanation instead.

> > diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> > index 370e22faaeb..b4c0cf02972 100644
> > --- a/Documentation/git-gc.txt
> > +++ b/Documentation/git-gc.txt
> > @@ -69,6 +69,12 @@ be performed as well.
> >  	the `--max-cruft-size` option of linkgit:git-repack[1] for
> >  	more.
> >
> > +--expire-to=<dir>::
> > +	When packing unreachable objects into a cruft pack, write a cruft
> > +	pack containing pruned objects (if any) to the directory `<dir>`.
> > +	See the `--expire-to` option of linkgit:git-repack[1] for
> > +	more.
>
> Does "When packing unreachable objects into a cruft pack" mean that
> this option is only meaningful with "--cruft"?  As "--cruft" is on
> by default, is it an error to pass "--no-cruft" when you use this
> option?
>

It (--expire-to) can currently only be used together with --cruft.
Using --no-cruft together with --expire-to will not result in an error,
but --expire-to will not take effect either.

I should mention in the document that --expire-to and --cruft need to
be used together, otherwise --expire-to will not have any effect.

> "for more" -> "for more information" or something?
>

OK, "for more information".

> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index d52735354c9..8656e1caff0 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -136,6 +136,7 @@ struct gc_config {
> >  	char *prune_worktrees_expire;
> >  	char *repack_filter;
> >  	char *repack_filter_to;
> > +	char *repack_expire_to;
> >  	unsigned long big_pack_threshold;
> >  	unsigned long max_delta_cache_size;
> >  };
> > @@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> >  static void add_repack_all_option(struct gc_config *cfg,
> >  				  struct string_list *keep_pack)
> >  {
> > -	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> > +	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> > +		&& !(cfg->cruft_packs && cfg->repack_expire_to))
> >  		strvec_push(&repack, "-a");
>
> Hmph.  When "--expire-to=<there>" is given, we are dropping these
> unreachable objects right away, but we said "--no-cruft", then we
> say "repack -a".  If we have both "--cruft" and "--expire-to=<there>",
> then ...
>
> >  	else if (cfg->cruft_packs) {
> >  		strvec_push(&repack, "--cruft");
> > @@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
> >  		if (cfg->max_cruft_size)
> >  			strvec_pushf(&repack, "--max-cruft-size=%lu",
> >  				     cfg->max_cruft_size);
> > +		if (cfg->repack_expire_to)
> > +			strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
>
> ... we do the usual "repack --cruft --expire-to=<there>" in the next
> block.
>
> > @@ -675,7 +679,6 @@ struct repository *repo UNUSED)
> >  	const char *prune_expire_sentinel = "sentinel";
> >  	const char *prune_expire_arg = prune_expire_sentinel;
> >  	int ret;
> > -
> >  	struct option builtin_gc_options[] = {
> >  		OPT__QUIET(&quiet, N_("suppress progress reporting")),
> >  		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
>
> OK.
>
> > @@ -694,6 +697,8 @@ struct repository *repo UNUSED)
> >  			   PARSE_OPT_NOCOMPLETE),
> >  		OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
> >  			 N_("repack all other packs except the largest pack")),
> > +		OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> > +			   N_("pack prefix to store a pack containing pruned objects")),
> >  		OPT_END()
> >  	};
>
> OK.
>
> > diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> > index ee074b99b70..d4b0653a9b7 100755
> > --- a/t/t6500-gc.sh
> > +++ b/t/t6500-gc.sh
> > @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
> >  	test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
> >  '
> >
> > +test_expect_success '--expire-to sets appropriate repack options' '
> > +	mkdir expired &&
> > +	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> > +	test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> > +'
>
> As "--cruft" is on by default, the command line does not have to
> have it, but being explicit is good.
>
> Should we also see what happens when "--no-cruft" is given?
>

--expire-to with --no-cruft will still run repack -a; I will add
corresponding tests.

> Thanks.

Thanks.

* [PATCH v4] gc: add `--expire-to` option
  2025-01-16  2:35 ` [PATCH v3] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
  2025-01-16 18:23   ` Junio C Hamano
@ 2025-01-24  7:49   ` ZheNing Hu via GitGitGadget
  2025-02-04 18:15     ` Junio C Hamano
  1 sibling, 1 reply; 13+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2025-01-24  7:49 UTC
To: git; +Cc: gitster, me, peff, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

This commit extends the functionality of `git gc`
by adding a new option, `--expire-to=<dir>`. Previously,
this feature was implemented in 91badeba32 (builtin/repack.c:
implement `--expire-to` for storing pruned objects, 2022-10-24),
which allowed users to specify a directory where unreachable
and expired cruft packs are stored during garbage collection.
However, users had to run `git repack --cruft --expire-to=<dir>`
followed by `git prune` to achieve similar results within `git gc`.

By introducing `--expire-to=<dir>` directly into `git gc`,
we simplify the process for users who wish to manage their
repository's cleanup more efficiently. This change involves
passing the `--expire-to=<dir>` parameter through to `git repack`,
making it easier for users to set up a backup location for cruft
packs that will be pruned.

The original `git gc --prune=now` deletes all unreachable
objects by passing the `-a` parameter to git repack. With the
addition of the `--cruft` and `--expire-to` options, it is necessary
to modify this default behavior: instead of deleting these
unreachable objects, they should be merged into a cruft pack and
collected in a specified directory. Therefore, we do not pass `-a`
to the repack command but instead pass `--cruft`, `--expire-to`,
and `--cruft-expiration=now` to repack.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    gc: add --expire-to option

    I want to perform a "safe" garbage collection for the Git repository on
    the server, which avoids data corruption issues caused by concurrent
    pushes during git GC. To achieve this, I currently need to use git
    repack --cruft --expire-to=<dir> and git prune in combination. However,
    it would be simpler if we could directly use --expire-to=<dir> with the
    git-gc command.

    v1: add --expire-to option to gc
    v1 -> v2: fix git gc --prune=now with --expire-to
    v2 -> v3: squash two patches into one patch
    v3 -> v4: modify docs and commit message, and add more tests

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1843%2Fadlternative%2Fzh%2Fgc-expire-to-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1843/adlternative/zh/gc-expire-to-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1843

Range-diff vs v3:

 1:  0842ec34948 ! 1:  6946ccde275 gc: add `--expire-to` option
    @@ Commit message

         This commit extends the functionality of `git gc`
         by adding a new option, `--expire-to=<dir>`. Previously,
    -    this feature was implemented in `git repack` (see 91badeb),
    -    allowing users to specify a directory where unreachable and
    -    expired cruft packs are stored during garbage collection.
    +    this feature was implemented in 91badeba32 (builtin/repack.c:
    +    implement `--expire-to` for storing pruned objects, 2022-10-24),
    +    which allowed users to specify a directory where unreachable
    +    and expired cruft packs are stored during garbage collection.
         However, users had to run `git repack --cruft --expire-to=<dir>`
         followed by `git prune` to achieve similar results within `git gc`.

    @@ Commit message
         making it easier for users to set up a backup location for cruft
         packs that will be pruned.

    -    Note: When git-gc is used with both `--cruft` and `--expire-to`,
    -    it does not pass `-a` to git-repack to delete all unreachable
    -    objects as `git gc --prune=now` originally did. Instead, it
    -    generates a cruft pack in the directory specified by expire-to.
    +    The original `git gc --prune=now` deletes all unreachable
    +    objects by passing the `-a` parameter to git repack. With the
    +    addition of the `--cruft` and `--expire-to` options, it is necessary
    +    to modify this default behavior: instead of deleting these
    +    unreachable objects, they should be merged into a cruft pack and
    +    collected in a specified directory. Therefore, we do not pass `-a`
    +    to the repack command but instead pass `--cruft`, `--expire-to`,
    +    and `--cruft-expiration=now` to repack.

         Signed-off-by: ZheNing Hu <adlternative@gmail.com>

    @@ Documentation/git-gc.txt: be performed as well.
     +--expire-to=<dir>::
     +	When packing unreachable objects into a cruft pack, write a cruft
     +	pack containing pruned objects (if any) to the directory `<dir>`.
    ++	This option only has an effect when used together with `--cruft`.
     +	See the `--expire-to` option of linkgit:git-repack[1] for
    -+	more.
    ++	more information.
     +
      --prune=<date>::
      	Prune loose objects older than date (default is 2 weeks ago,
    @@ t/t6500-gc.sh: test_expect_success 'gc.maxCruftSize sets appropriate repack opti
      	test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
      '

    -+test_expect_success '--expire-to sets appropriate repack options' '
    ++test_expect_success '--expire-to sets repack --expire-to' '
    ++	rm -rf expired &&
     +	mkdir expired &&
    -+	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
    -+	test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
    ++	expire_to="$(pwd)/expired/pack" &&
    ++	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to="$expire_to" &&
    ++	test_subcommand $cruft_max_size_opts --expire-to="$expire_to" <trace2.txt
    ++'
    ++
    ++test_expect_success '--expire-to with --prune=now sets repack --expire-to' '
    ++	rm -rf expired &&
    ++	mkdir expired &&
    ++	expire_to="$(pwd)/expired/pack" &&
    ++	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --prune=now --expire-to="$expire_to" &&
    ++	test_subcommand git repack -d -l --cruft --cruft-expiration=now --expire-to="$expire_to" <trace2.txt
    ++'
    ++
    ++
    ++test_expect_success '--expire-to with --no-cruft sets repack -A' '
    ++	rm -rf expired &&
    ++	mkdir expired &&
    ++	expire_to="$(pwd)/expired/pack" &&
    ++	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --no-cruft --expire-to="$expire_to" &&
    ++	test_subcommand git repack -d -l -A --unpack-unreachable=2.weeks.ago <trace2.txt
    ++'
    ++
    ++test_expect_success '--expire-to with --no-cruft sets repack -a' '
    ++	rm -rf expired &&
    ++	mkdir expired &&
    ++	expire_to="$(pwd)/expired/pack" &&
    ++	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --no-cruft --prune=now --expire-to="$expire_to" &&
    ++	test_subcommand git repack -d -l -a <trace2.txt
     +'
     +
      run_and_wait_for_gc () {

 Documentation/git-gc.txt |  7 +++++++
 builtin/gc.c             |  9 +++++++--
 t/t6500-gc.sh            | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 370e22faaeb..0eac8e85f08 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -69,6 +69,13 @@ be performed as well.
 	the `--max-cruft-size` option of linkgit:git-repack[1] for
 	more.
 
+--expire-to=<dir>::
+	When packing unreachable objects into a cruft pack, write a cruft
+	pack containing pruned objects (if any) to the directory `<dir>`.
+	This option only has an effect when used together with `--cruft`.
+	See the `--expire-to` option of linkgit:git-repack[1] for
+	more information.
+
 --prune=<date>::
 	Prune loose objects older than date (default is 2 weeks ago,
 	overridable by the config variable `gc.pruneExpire`).
diff --git a/builtin/gc.c b/builtin/gc.c
index d52735354c9..8656e1caff0 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -136,6 +136,7 @@ struct gc_config {
 	char *prune_worktrees_expire;
 	char *repack_filter;
 	char *repack_filter_to;
+	char *repack_expire_to;
 	unsigned long big_pack_threshold;
 	unsigned long max_delta_cache_size;
 };
@@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
 static void add_repack_all_option(struct gc_config *cfg,
 				  struct string_list *keep_pack)
 {
-	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
+	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
+		&& !(cfg->cruft_packs && cfg->repack_expire_to))
 		strvec_push(&repack, "-a");
 	else if (cfg->cruft_packs) {
 		strvec_push(&repack, "--cruft");
@@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
 		if (cfg->max_cruft_size)
 			strvec_pushf(&repack, "--max-cruft-size=%lu",
 				     cfg->max_cruft_size);
+		if (cfg->repack_expire_to)
+			strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
 	} else {
 		strvec_push(&repack, "-A");
 		if (cfg->prune_expire)
@@ -675,7 +679,6 @@ struct repository *repo UNUSED)
 	const char *prune_expire_sentinel = "sentinel";
 	const char *prune_expire_arg = prune_expire_sentinel;
 	int ret;
-
 	struct option builtin_gc_options[] = {
 		OPT__QUIET(&quiet, N_("suppress progress reporting")),
 		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
@@ -694,6 +697,8 @@ struct repository *repo UNUSED)
 			   PARSE_OPT_NOCOMPLETE),
 		OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
 			 N_("repack all other packs except the largest pack")),
+		OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
+			   N_("pack prefix to store a pack containing pruned objects")),
 		OPT_END()
 	};
 
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index ee074b99b70..74f7bd09046 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -339,6 +339,39 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
 	test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
 '
 
+test_expect_success '--expire-to sets repack --expire-to' '
+	rm -rf expired &&
+	mkdir expired &&
+	expire_to="$(pwd)/expired/pack" &&
+	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to="$expire_to" &&
+	test_subcommand $cruft_max_size_opts --expire-to="$expire_to" <trace2.txt
+'
+
+test_expect_success '--expire-to with --prune=now sets repack --expire-to' '
+	rm -rf expired &&
+	mkdir expired &&
+	expire_to="$(pwd)/expired/pack" &&
+	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --prune=now --expire-to="$expire_to" &&
+	test_subcommand git repack -d -l --cruft --cruft-expiration=now --expire-to="$expire_to" <trace2.txt
+'
+
+
+test_expect_success '--expire-to with --no-cruft sets repack -A' '
+	rm -rf expired &&
+	mkdir expired &&
+	expire_to="$(pwd)/expired/pack" &&
+	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --no-cruft --expire-to="$expire_to" &&
+	test_subcommand git repack -d -l -A --unpack-unreachable=2.weeks.ago <trace2.txt
+'
+
+test_expect_success '--expire-to with --no-cruft sets repack -a' '
+	rm -rf expired &&
+	mkdir expired &&
+	expire_to="$(pwd)/expired/pack" &&
+	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --no-cruft --prune=now --expire-to="$expire_to" &&
+	test_subcommand git repack -d -l -a <trace2.txt
+'
+
 run_and_wait_for_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the

base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f
-- 
gitgitgadget

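For readers following the branch logic in add_repack_all_option(), the selection can be sketched as a small standalone model. This is only an illustration under stated assumptions, not the code in builtin/gc.c: `struct gc_model`, `push_arg()` and `model_repack_args()` are hypothetical names, a flat string stands in for git's strvec, and the `--cruft-expiration=` and `--unpack-unreachable=` branches are inferred from the command lines the v4 tests expect rather than from the hunks shown. The `-d -l` prefix, `--max-cruft-size` and keep-pack handling are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the gc_config fields used here. */
struct gc_model {
	int cruft_packs;              /* 1 for --cruft, 0 for --no-cruft */
	const char *prune_expire;     /* value of --prune=<date>, or NULL */
	const char *repack_expire_to; /* value of --expire-to=<dir>, or NULL */
};

/* Append one space-separated option to buf (illustrative helper, not git API). */
static void push_arg(char *buf, size_t len, const char *opt, const char *val)
{
	size_t used = strlen(buf);

	assert(used < len);
	snprintf(buf + used, len - used, "%s%s%s",
		 used ? " " : "", opt, val ? val : "");
}

/*
 * Model of the v4 condition: "-a" is chosen for --prune=now unless both
 * --cruft and --expire-to are in effect, in which case the cruft branch
 * runs so that pruned objects can be written to the expire-to location.
 */
static void model_repack_args(const struct gc_model *cfg, char *buf, size_t len)
{
	assert(len > 0);
	buf[0] = '\0';

	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now") &&
	    !(cfg->cruft_packs && cfg->repack_expire_to)) {
		push_arg(buf, len, "-a", NULL);
	} else if (cfg->cruft_packs) {
		push_arg(buf, len, "--cruft", NULL);
		if (cfg->prune_expire) /* inferred from the test expectations */
			push_arg(buf, len, "--cruft-expiration=", cfg->prune_expire);
		if (cfg->repack_expire_to)
			push_arg(buf, len, "--expire-to=", cfg->repack_expire_to);
	} else {
		push_arg(buf, len, "-A", NULL);
		if (cfg->prune_expire) /* inferred from the test expectations */
			push_arg(buf, len, "--unpack-unreachable=", cfg->prune_expire);
	}
}
```

With `cruft_packs = 1`, `prune_expire = "now"` and `repack_expire_to = "/tmp/expired/pack"` (an arbitrary example path), the model yields `--cruft --cruft-expiration=now --expire-to=/tmp/expired/pack`; the same inputs with `cruft_packs = 0` yield `-a`, which matches the point made in the review that `--expire-to` has no effect under `--no-cruft`.
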
* Re: [PATCH v4] gc: add `--expire-to` option
  2025-01-24  7:49 ` [PATCH v4] " ZheNing Hu via GitGitGadget
@ 2025-02-04 18:15   ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2025-02-04 18:15 UTC
To: ZheNing Hu via GitGitGadget; +Cc: git, me, peff, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> This commit extends the functionality of `git gc`
> by adding a new option, `--expire-to=<dir>`. Previously,
> this feature was implemented in 91badeba32 (builtin/repack.c:
> implement `--expire-to` for storing pruned objects, 2022-10-24),
> which allowed users to specify a directory where unreachable
> and expired cruft packs are stored during garbage collection.
> However, users had to run `git repack --cruft --expire-to=<dir>`
> followed by `git prune` to achieve similar results within `git gc`.
>
> By introducing `--expire-to=<dir>` directly into `git gc`,
> we simplify the process for users who wish to manage their
> repository's cleanup more efficiently. This change involves
> passing the `--expire-to=<dir>` parameter through to `git repack`,
> making it easier for users to set up a backup location for cruft
> packs that will be pruned.
>
> The original `git gc --prune=now` deletes all unreachable
> objects by passing the `-a` parameter to git repack. With the
> addition of the `--cruft` and `--expire-to` options, it is necessary
> to modify this default behavior: instead of deleting these
> unreachable objects, they should be merged into a cruft pack and
> collected in a specified directory. Therefore, we do not pass `-a`
> to the repack command but instead pass `--cruft`, `--expire-to`,
> and `--cruft-expiration=now` to repack.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---

This hasn't seen any reaction for a while.  Does anybody have
further comments?  Otherwise let's mark it for 'next'.

Thanks.

>  Documentation/git-gc.txt |  7 +++++++
>  builtin/gc.c             |  9 +++++++--
>  t/t6500-gc.sh            | 33 +++++++++++++++++++++++++++++++++
>  3 files changed, 47 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 370e22faaeb..0eac8e85f08 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -69,6 +69,13 @@ be performed as well.
>  	the `--max-cruft-size` option of linkgit:git-repack[1] for
>  	more.
>
> +--expire-to=<dir>::
> +	When packing unreachable objects into a cruft pack, write a cruft
> +	pack containing pruned objects (if any) to the directory `<dir>`.
> +	This option only has an effect when used together with `--cruft`.
> +	See the `--expire-to` option of linkgit:git-repack[1] for
> +	more information.
> +
>  --prune=<date>::
>  	Prune loose objects older than date (default is 2 weeks ago,
>  	overridable by the config variable `gc.pruneExpire`).
> diff --git a/builtin/gc.c b/builtin/gc.c
> index d52735354c9..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -136,6 +136,7 @@ struct gc_config {
>  	char *prune_worktrees_expire;
>  	char *repack_filter;
>  	char *repack_filter_to;
> +	char *repack_expire_to;
>  	unsigned long big_pack_threshold;
>  	unsigned long max_delta_cache_size;
>  };
> @@ -432,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
>  static void add_repack_all_option(struct gc_config *cfg,
>  				  struct string_list *keep_pack)
>  {
> -	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> +	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> +		&& !(cfg->cruft_packs && cfg->repack_expire_to))
>  		strvec_push(&repack, "-a");
>  	else if (cfg->cruft_packs) {
>  		strvec_push(&repack, "--cruft");
> @@ -441,6 +443,8 @@ static void add_repack_all_option(struct gc_config *cfg,
>  		if (cfg->max_cruft_size)
>  			strvec_pushf(&repack, "--max-cruft-size=%lu",
>  				     cfg->max_cruft_size);
> +		if (cfg->repack_expire_to)
> +			strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
>  	} else {
>  		strvec_push(&repack, "-A");
>  		if (cfg->prune_expire)
> @@ -675,7 +679,6 @@ struct repository *repo UNUSED)
>  	const char *prune_expire_sentinel = "sentinel";
>  	const char *prune_expire_arg = prune_expire_sentinel;
>  	int ret;
> -
>  	struct option builtin_gc_options[] = {
>  		OPT__QUIET(&quiet, N_("suppress progress reporting")),
>  		{ OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
> @@ -694,6 +697,8 @@ struct repository *repo UNUSED)
>  			   PARSE_OPT_NOCOMPLETE),
>  		OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
>  			 N_("repack all other packs except the largest pack")),
> +		OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> +			   N_("pack prefix to store a pack containing pruned objects")),
>  		OPT_END()
>  	};
>
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index ee074b99b70..74f7bd09046 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -339,6 +339,39 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
>  	test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
>  '
>
> +test_expect_success '--expire-to sets repack --expire-to' '
> +	rm -rf expired &&
> +	mkdir expired &&
> +	expire_to="$(pwd)/expired/pack" &&
> +	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to="$expire_to" &&
> +	test_subcommand $cruft_max_size_opts --expire-to="$expire_to" <trace2.txt
> +'
> +
> +test_expect_success '--expire-to with --prune=now sets repack --expire-to' '
> +	rm -rf expired &&
> +	mkdir expired &&
> +	expire_to="$(pwd)/expired/pack" &&
> +	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --prune=now --expire-to="$expire_to" &&
> +	test_subcommand git repack -d -l --cruft --cruft-expiration=now --expire-to="$expire_to" <trace2.txt
> +'
> +
> +
> +test_expect_success '--expire-to with --no-cruft sets repack -A' '
> +	rm -rf expired &&
> +	mkdir expired &&
> +	expire_to="$(pwd)/expired/pack" &&
> +	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --no-cruft --expire-to="$expire_to" &&
> +	test_subcommand git repack -d -l -A --unpack-unreachable=2.weeks.ago <trace2.txt
> +'
> +
> +test_expect_success '--expire-to with --no-cruft sets repack -a' '
> +	rm -rf expired &&
> +	mkdir expired &&
> +	expire_to="$(pwd)/expired/pack" &&
> +	GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --no-cruft --prune=now --expire-to="$expire_to" &&
> +	test_subcommand git repack -d -l -a <trace2.txt
> +'
> +
>  run_and_wait_for_gc () {
>  	# We read stdout from gc for the side effect of waiting until the
>  	# background gc process exits, closing its fd 9.  Furthermore, the
>
> base-commit: 92999a42db1c5f43f330e4f2bca4026b5b81576f

end of thread, other threads:[~2025-02-04 18:15 UTC | newest]

Thread overview: 13+ messages
2024-12-24 11:52 [PATCH] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
2024-12-31  2:18 ` [PATCH v2 0/2] gc: add --expire-to option ZheNing Hu via GitGitGadget
2024-12-31  2:18 ` [PATCH v2 1/2] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
2025-01-03  4:57 ` ZheNing Hu
2025-01-13  7:12 ` ZheNing Hu
2024-12-31  2:18 ` [PATCH v2 2/2] fix(gc): make --prune=now compatible with --expire-to ZheNing Hu via GitGitGadget
2025-01-13  9:17 ` Jeff King
2025-01-15  7:56 ` ZheNing Hu
2025-01-16  2:35 ` [PATCH v3] gc: add `--expire-to` option ZheNing Hu via GitGitGadget
2025-01-16 18:23 ` Junio C Hamano
2025-01-23  3:42 ` ZheNing Hu
2025-01-24  7:49 ` [PATCH v4] " ZheNing Hu via GitGitGadget
2025-02-04 18:15 ` Junio C Hamano