* 2.46 submodule breakage @ 2024-08-05 11:31 Jeppe Øland 2024-08-06 13:18 ` Jeppe Øland ` (2 more replies) 0 siblings, 3 replies; 35+ messages in thread From: Jeppe Øland @ 2024-08-05 11:31 UTC (permalink / raw) To: git Hi there, I just upgraded from Git 2.45.2 to Git 2.46 (Git for Windows, but I don't think this has anything to do with that), and it seems to have broken some submodule stuff for me. Specifically, when I do a recursive pull, I get the following error from one of the submodules: fatal: Unable to find current revision in submodule path 'path/to/sub' There are other submodules in the repo that work fine. The only difference with this one is that it was added using the --name option, so internally it's just known as "sub". Looks like the problem happens in the final pull stage when it executes: git submodule--helper update --recursive --rebase -- A recursive clone with 2.46 works fine, but any subsequent recursive pull gives the error. So far I haven't been able to fix it, except by downgrading to 2.45.2. Any ideas? Regards, -Jeppe ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: 2.46 submodule breakage 2024-08-05 11:31 2.46 submodule breakage Jeppe Øland @ 2024-08-06 13:18 ` Jeppe Øland 2024-08-06 18:26 ` Eric Sunshine 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt 2 siblings, 1 reply; 35+ messages in thread From: Jeppe Øland @ 2024-08-06 13:18 UTC (permalink / raw) To: git Hi all, I did a bunch of testing to narrow down the cause: It is not related to the Windows port - all the testing was done in WSL. It only happens when the clone containing submodules is done with --ref-format=reftable. The commit breaking it is: 129cb1b99d hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions Regards, -Jeppe On Mon, Aug 5, 2024 at 1:31 PM Jeppe Øland <joland@gmail.com> wrote: > > Hi there, > > I just upgraded from Git 2.45.2 to Git 2.46 (Git for Windows, but I > don't think this has anything to do with that), and it seems to have > broken some submodule stuff for me. > > Specifically, when I do a recursive pull, I get the following error > from one of the submodules: > fatal: Unable to find current revision in submodule path 'path/to/sub' > > There are other submodules in the repo that work fine. > The only difference with this one is that it was added using the > --name option, so internally it's just known as "sub". > > Looks like the problem happens in the final pull stage when it executes: > git submodule--helper update --recursive --rebase -- > > A recursive clone with 2.46 works fine, but any subsequent recursive > pull gives the error. > So far I haven't been able to fix it, except by downgrading to 2.45.2. > > Any ideas? > > Regards, > -Jeppe ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: 2.46 submodule breakage 2024-08-06 13:18 ` Jeppe Øland @ 2024-08-06 18:26 ` Eric Sunshine 2024-08-07 6:40 ` Patrick Steinhardt 0 siblings, 1 reply; 35+ messages in thread From: Eric Sunshine @ 2024-08-06 18:26 UTC (permalink / raw) To: Jeppe Øland; +Cc: git, Patrick Steinhardt On Tue, Aug 6, 2024 at 9:21 AM Jeppe Øland <joland@gmail.com> wrote: > I did a bunch of testing to narrow down the cause: > > It is not related to the Windows port - all the testing was done in WSL. > It only happens when the clone containing submodules is done with > --ref-format=reftable. > The commit breaking it is: 129cb1b99d hash: drop (mostly) unused > `is_empty_{blob,tree}_sha1()` functions Cc:'ing Patrick, the author of that change. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: 2.46 submodule breakage 2024-08-06 18:26 ` Eric Sunshine @ 2024-08-07 6:40 ` Patrick Steinhardt 2024-08-07 7:38 ` Patrick Steinhardt 0 siblings, 1 reply; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 6:40 UTC (permalink / raw) To: Eric Sunshine; +Cc: Jeppe Øland, git [-- Attachment #1: Type: text/plain, Size: 2366 bytes --] On Tue, Aug 06, 2024 at 02:26:35PM -0400, Eric Sunshine wrote: > On Tue, Aug 6, 2024 at 9:21 AM Jeppe Øland <joland@gmail.com> wrote: > > I did a bunch of testing to narrow down the cause: > > > > It is not related to the Windows port - all the testing was done in WSL. > > It only happens when the clone containing submodules is done with > > --ref-format=reftable. > > The commit breaking it is: 129cb1b99d hash: drop (mostly) unused > > `is_empty_{blob,tree}_sha1()` functions > > Cc:'ing Patrick, the author of that change. Thanks for Cc'ing me, and thanks for the report. I just wanted to say that I couldn't reproduce, but then spotted that you said that it only happens when using `--ref-format=reftable` during the git-clone(1) step. And indeed, that causes the following test to fail: test_expect_success 'recursive pull with different non-default ref format' ' # Set up the initial structure. This is an upstream repository that has # a submodule, as well as a downstream clone of the upstream # repository. git init submodule && test_commit -C submodule submodule-base && git init upstream && test_commit -C upstream upstream-base && git -C upstream submodule add "file://$(pwd)/submodule" && git -C upstream commit -m "upstream submodule" && git clone --ref-format=reftable --recurse-submodules "file://$(pwd)/upstream" downstream && # Update the submodule. test_commit -C submodule submodule-update && git -C upstream/submodule pull && git -C upstream commit -am "update the submodule" && git -C downstream pull --recurse-submodules ' The issue here is that the recursive clone causes a mixture of ref formats: the parent repository uses the "reftable" backend, whereas the child repository uses "files". In theory, I think Git should be able to handle a mixture of ref formats like this, but I'm not surprised that it actually fails in practice. The question is whether this is sensible, or whether submodules should use the same ref format as their parent. So it feels like the commit you have bisected this to only unearths this issue, but isn't the actual root cause. I'll investigate further and will try to come up with a patch later this week. Patrick [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: 2.46 submodule breakage 2024-08-07 6:40 ` Patrick Steinhardt @ 2024-08-07 7:38 ` Patrick Steinhardt 2024-08-07 16:09 ` Junio C Hamano 0 siblings, 1 reply; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 7:38 UTC (permalink / raw) To: Eric Sunshine; +Cc: Jeppe Øland, git [-- Attachment #1: Type: text/plain, Size: 3370 bytes --] On Wed, Aug 07, 2024 at 08:40:10AM +0200, Patrick Steinhardt wrote: > On Tue, Aug 06, 2024 at 02:26:35PM -0400, Eric Sunshine wrote: > > On Tue, Aug 6, 2024 at 9:21 AM Jeppe Øland <joland@gmail.com> wrote: > > > I did a bunch of testing to narrow down the cause: > > > > > > It is not related to the Windows port - all the testing was done in WSL. > > > It only happens when the clone containing submodules is done with > > > --ref-format=reftable. > > > The commit breaking it is: 129cb1b99d hash: drop (mostly) unused > > > `is_empty_{blob,tree}_sha1()` functions > > > > Cc:'ing Patrick, the author of that change. > > Thanks for Cc'ing me, and thanks for the report. > > I just wanted to say that I couldn't reproduce, but then spotted that > you said that it only happens when using `--ref-format=reftable` during > the git-clone(1) step. And indeed, that causes the following test to > fail: > > test_expect_success 'recursive pull with different non-default ref format' ' > # Set up the initial structure. This is an upstream repository that has > # a submodule, as well as a downstream clone of the upstream > # repository. > git init submodule && > test_commit -C submodule submodule-base && > git init upstream && > test_commit -C upstream upstream-base && > git -C upstream submodule add "file://$(pwd)/submodule" && > git -C upstream commit -m "upstream submodule" && > git clone --ref-format=reftable --recurse-submodules "file://$(pwd)/upstream" downstream && > > # Update the submodule. > test_commit -C submodule submodule-update && > git -C upstream/submodule pull && > git -C upstream commit -am "update the submodule" && > > git -C downstream pull --recurse-submodules > ' > > The issue here is that the recursive clone causes a mixture of ref > formats: the parent repository uses the "reftable" backend, whereas the > child repository uses "files". In theory, I think Git should be able to > handle a mixture of ref formats like this, but I'm not surprised that it > actually fails in practice. The question is whether this is sensible, or > whether submodules should use the same ref format as their parent. > > So it feels like the commit you have bisected this to only unearths this > issue, but isn't the actual root cause. > > I'll investigate further and will try to come up with a patch later this > week. For the record, the fix is as simple as the below diff. We indeed end up initializing submodule ref stores with the parent ref storage format, not with the one of the subrepo. I'll spend some more time though to check whether other commands are impacted, as well, and write some more tests. Patrick diff --git a/refs.c b/refs.c index 915aeb4d1d..e4b1f4f8b1 100644 --- a/refs.c +++ b/refs.c @@ -2011,7 +2011,7 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo, free(subrepo); goto done; } - refs = ref_store_init(subrepo, the_repository->ref_storage_format, + refs = ref_store_init(subrepo, subrepo->ref_storage_format, submodule_sb.buf, REF_STORE_READ | REF_STORE_ODB); register_ref_store_map(&repo->submodule_ref_stores, "submodule", [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: 2.46 submodule breakage 2024-08-07 7:38 ` Patrick Steinhardt @ 2024-08-07 16:09 ` Junio C Hamano 0 siblings, 0 replies; 35+ messages in thread From: Junio C Hamano @ 2024-08-07 16:09 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: Eric Sunshine, Jeppe Øland, git Patrick Steinhardt <ps@pks.im> writes: > For the record, the fix is as simple as the below diff. We indeed end up > initializing submodule ref stores with the parent ref storage format, > not with the one of the subrepo. I'll spend some more time though to > check whether other commands are impacted, as well, and write some more > tests. I think allowing separate ref format in submodules is a very sensible thing to do. The containing superproject should not even have to know what ref backend is in use in the submodule, unlike the object format used there, which it has to know in order to store the name of commit objects in its tree entries as gitlinks. And the preimage of this particular change does look wrong to blindly reuse the format used in the superproject. Thanks. > diff --git a/refs.c b/refs.c > index 915aeb4d1d..e4b1f4f8b1 100644 > --- a/refs.c > +++ b/refs.c > @@ -2011,7 +2011,7 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo, > free(subrepo); > goto done; > } > - refs = ref_store_init(subrepo, the_repository->ref_storage_format, > + refs = ref_store_init(subrepo, subrepo->ref_storage_format, > submodule_sb.buf, > REF_STORE_READ | REF_STORE_ODB); > register_ref_store_map(&repo->submodule_ref_stores, "submodule", ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 0/6] Improvements for ref storage formats with submodules 2024-08-05 11:31 2.46 submodule breakage Jeppe Øland 2024-08-06 13:18 ` Jeppe Øland @ 2024-08-07 12:43 ` Patrick Steinhardt 2024-08-07 12:43 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt ` (7 more replies) 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt 2 siblings, 8 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 12:43 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 1795 bytes --] Hi, this small patch series contains some improvements for ref storage formats and their interaction with submodules. Notably: - Use the correct format for submodules in situations where the parent repository uses a different ref storage format than the submodule. - Wire up `--ref-format=` for git-submodule(1), such that users can explicitly use a different ref format for their submodules. - Propagate the `--ref-format=` flag of git-clone(1) into submodules when using `--recursive`. The first three patches implement improvements for the above three issues and introduce tests. The test did hit some memory leaks, which get fixed by patches 3 to 6 such that the new test can be marked as leak free. Thanks! Patrick Patrick Steinhardt (6): builtin/submodule: allow cloning with different ref storage format builtin/clone: propagate ref storage format to submodules refs: fix ref storage format for submodule ref stores submodule: fix leaking fetch tasks submodule: fix leaking seen submodule names object: fix leaking packfiles when closing object store Documentation/git-submodule.txt | 5 +- builtin/clone.c | 10 ++- builtin/submodule--helper.c | 30 +++++++ git-submodule.sh | 9 ++ object.c | 9 ++ refs.c | 2 +- submodule.c | 18 ++-- t/t5572-pull-submodule.sh | 1 + t/t7418-submodule-sparse-gitmodules.sh | 1 + t/t7424-submodule-mixed-ref-formats.sh | 120 +++++++++++++++++++++++++ 10 files changed, 191 insertions(+), 14 deletions(-) create mode 100755 t/t7424-submodule-mixed-ref-formats.sh -- 2.46.0.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt @ 2024-08-07 12:43 ` Patrick Steinhardt 2024-08-07 14:45 ` [PATCH 0/6] Improvements for ref storage formats with submodules Jeppe Øland 2024-08-07 22:55 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Junio C Hamano 2024-08-07 12:43 ` [PATCH 2/6] builtin/clone: propagate ref storage format to submodules Patrick Steinhardt ` (6 subsequent siblings) 7 siblings, 2 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 12:43 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 10349 bytes --] As submodules are proper self-contained repositories, it is perfectly valid for them to have a different ref storage format than their parent repository. There is no obvious way for users to ask for the ref storage format when initializing submodules though. Whether the setup of such mixed-ref-storage-format constellations is all that useful remains to be seen. But there is no good reason to not expose such an option, and we will require it in a subsequent patch. Introduce a new `--ref-format=` option for git-submodule(1) that allows the user to pick the ref storage format. This option will also be used in a subsequent commit, where we start to propagate the same flag from git-clone(1) to cloning submodules with the `--recursive` switch. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- Documentation/git-submodule.txt | 5 +++- builtin/submodule--helper.c | 30 +++++++++++++++++++ git-submodule.sh | 9 ++++++ t/t7424-submodule-mixed-ref-formats.sh | 41 ++++++++++++++++++++++++++ 4 files changed, 84 insertions(+), 1 deletion(-) create mode 100755 t/t7424-submodule-mixed-ref-formats.sh diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index ca0347a37b..73ef8b9696 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -136,7 +136,7 @@ If you really want to remove a submodule from the repository and commit that use linkgit:git-rm[1] instead. See linkgit:gitsubmodules[7] for removal options. -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]:: +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--ref-format <format>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]:: + -- Update the registered submodules to match what the superproject @@ -185,6 +185,9 @@ submodule with the `--init` option. If `--recursive` is specified, this command will recurse into the registered submodules, and update any nested submodules within. +If `--ref-format <format>` is specified, the ref storage format of newly +cloned submodules will be set accordingly. + If `--filter <filter-spec>` is specified, the given partial clone filter will be applied to the submodule. See linkgit:git-rev-list[1] for details on filter specifications. diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index f1218a1995..42a36bc2f7 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -1532,6 +1532,7 @@ struct module_clone_data { const char *url; const char *depth; struct list_objects_filter_options *filter_options; + enum ref_storage_format ref_storage_format; unsigned int quiet: 1; unsigned int progress: 1; unsigned int dissociate: 1; @@ -1540,6 +1541,7 @@ struct module_clone_data { }; #define MODULE_CLONE_DATA_INIT { \ .single_branch = -1, \ + .ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN, \ } struct submodule_alternate_setup { @@ -1738,6 +1740,9 @@ static int clone_submodule(const struct module_clone_data *clone_data, strvec_pushl(&cp.args, "--reference", item->string, NULL); } + if (clone_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN) + strvec_pushf(&cp.args, "--ref-format=%s", + ref_storage_format_to_name(clone_data->ref_storage_format)); if (clone_data->dissociate) strvec_push(&cp.args, "--dissociate"); if (sm_gitdir && *sm_gitdir) @@ -1832,6 +1837,7 @@ static int module_clone(int argc, const char **argv, const char *prefix) struct string_list reference = STRING_LIST_INIT_NODUP; struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT; + const char *ref_storage_format = NULL; struct option module_clone_options[] = { OPT_STRING(0, "prefix", &clone_data.prefix, @@ -1849,6 +1855,8 @@ static int module_clone(int argc, const char **argv, const char *prefix) OPT_STRING_LIST(0, "reference", &reference, N_("repo"), N_("reference repository")), + OPT_STRING(0, "ref-format", &ref_storage_format, N_("format"), + N_("specify the reference format to use")), OPT_BOOL(0, "dissociate", &dissociate, N_("use --reference only while cloning")), OPT_STRING(0, "depth", &clone_data.depth, @@ -1875,6 +1883,11 @@ static int module_clone(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, module_clone_options, git_submodule_helper_usage, 0); + if (ref_storage_format) { + clone_data.ref_storage_format = ref_storage_format_by_name(ref_storage_format); + if (clone_data.ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN) + die(_("unknown ref storage format '%s'"), ref_storage_format); + } clone_data.dissociate = !!dissociate; clone_data.quiet = !!quiet; clone_data.progress = !!progress; @@ -1974,6 +1987,7 @@ struct update_data { struct submodule_update_strategy update_strategy; struct list_objects_filter_options *filter_options; struct module_list list; + enum ref_storage_format ref_storage_format; int depth; int max_jobs; int single_branch; @@ -1997,6 +2011,7 @@ struct update_data { #define UPDATE_DATA_INIT { \ .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT, \ .list = MODULE_LIST_INIT, \ + .ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN, \ .recommend_shallow = -1, \ .references = STRING_LIST_INIT_DUP, \ .single_branch = -1, \ @@ -2132,6 +2147,9 @@ static int prepare_to_clone_next_submodule(const struct cache_entry *ce, expand_list_objects_filter_spec(suc->update_data->filter_options)); if (suc->update_data->require_init) strvec_push(&child->args, "--require-init"); + if (suc->update_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN) + strvec_pushf(&child->args, "--ref-format=%s", + ref_storage_format_to_name(suc->update_data->ref_storage_format)); strvec_pushl(&child->args, "--path", sub->path, NULL); strvec_pushl(&child->args, "--name", sub->name, NULL); strvec_pushl(&child->args, "--url", url, NULL); @@ -2562,6 +2580,9 @@ static void update_data_to_args(const struct update_data *update_data, for_each_string_list_item(item, &update_data->references) strvec_pushl(args, "--reference", item->string, NULL); } + if (update_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN) + strvec_pushf(args, "--ref-format=%s", + ref_storage_format_to_name(update_data->ref_storage_format)); if (update_data->filter_options && update_data->filter_options->choice) strvec_pushf(args, "--filter=%s", expand_list_objects_filter_spec( @@ -2737,6 +2758,7 @@ static int module_update(int argc, const char **argv, const char *prefix) struct update_data opt = UPDATE_DATA_INIT; struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT; + const char *ref_storage_format = NULL; int ret; struct option module_update_options[] = { OPT__SUPER_PREFIX(&opt.super_prefix), @@ -2760,6 +2782,8 @@ static int module_update(int argc, const char **argv, const char *prefix) SM_UPDATE_REBASE), OPT_STRING_LIST(0, "reference", &opt.references, N_("repo"), N_("reference repository")), + OPT_STRING(0, "ref-format", &ref_storage_format, N_("format"), + N_("specify the reference format to use")), OPT_BOOL(0, "dissociate", &opt.dissociate, N_("use --reference only while cloning")), OPT_INTEGER(0, "depth", &opt.depth, @@ -2803,6 +2827,12 @@ static int module_update(int argc, const char **argv, const char *prefix) module_update_options); } + if (ref_storage_format) { + opt.ref_storage_format = ref_storage_format_by_name(ref_storage_format); + if (opt.ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN) + die(_("unknown ref storage format '%s'"), ref_storage_format); + } + opt.filter_options = &filter_options; opt.prefix = prefix; diff --git a/git-submodule.sh b/git-submodule.sh index 7f9582d923..dcbe943071 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -268,6 +268,14 @@ cmd_update() -r|--rebase) rebase=1 ;; + --ref-format) + case "$2" in '') usage ;; esac + ref_format="--ref-format=$2" + shift + ;; + --ref-format=*) + ref_format="$1" + ;; --reference) case "$2" in '') usage ;; esac reference="--reference=$2" @@ -349,6 +357,7 @@ cmd_update() ${rebase:+--rebase} \ ${merge:+--merge} \ ${checkout:+--checkout} \ + ${ref_format:+"$ref_format"} \ ${reference:+"$reference"} \ ${dissociate:+"--dissociate"} \ ${depth:+"$depth"} \ diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh new file mode 100755 index 0000000000..de84007554 --- /dev/null +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -0,0 +1,41 @@ +#!/bin/sh + +test_description='submodules handle mixed ref storage formats' + +. ./test-lib.sh + +test_ref_format () { + echo "$2" >expect && + git -C "$1" rev-parse --show-ref-format >actual && + test_cmp expect actual +} + +for OTHER_FORMAT in files reftable +do + if test "$OTHER_FORMAT" = "$GIT_DEFAULT_REF_FORMAT" + then + continue + fi + +test_expect_success 'setup' ' + git config set --global protocol.file.allow always +' + +test_expect_success 'clone submodules with different ref storage format' ' + test_when_finished "rm -rf submodule upstream downstream" && + + git init submodule && + test_commit -C submodule submodule-initial && + git init upstream && + git -C upstream submodule add "file://$(pwd)/submodule" && + git -C upstream commit -m "upstream submodule" && + + git clone --no-recurse-submodules "file://$(pwd)/upstream" downstream && + test_ref_format downstream "$GIT_DEFAULT_REF_FORMAT" && + git -C downstream submodule update --init --ref-format=$OTHER_FORMAT && + test_ref_format downstream/submodule "$OTHER_FORMAT" +' + +done + +test_done -- 2.46.0.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 0/6] Improvements for ref storage formats with submodules 2024-08-07 12:43 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt @ 2024-08-07 14:45 ` Jeppe Øland 2024-08-07 22:55 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Junio C Hamano 1 sibling, 0 replies; 35+ messages in thread From: Jeppe Øland @ 2024-08-07 14:45 UTC (permalink / raw) To: git; +Cc: Patrick Steinhardt, Eric Sunshine > Just got a shower thought that this isn't quite there yet. Most > importantly, I think we should default to the ref storage format of the > parent submodule both when adding submodules and when cloning a > submodule into a preexisting repository. Yep, I was going to suggest exactly the same! Thanks for the quick response. Regards -Jeppe ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format 2024-08-07 12:43 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt 2024-08-07 14:45 ` [PATCH 0/6] Improvements for ref storage formats with submodules Jeppe Øland @ 2024-08-07 22:55 ` Junio C Hamano 2024-08-08 7:00 ` Patrick Steinhardt 1 sibling, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2024-08-07 22:55 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Jeppe Øland Patrick Steinhardt <ps@pks.im> writes: > -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]:: > +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--ref-format <format>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]:: > + > -- > Update the registered submodules to match what the superproject > @@ -185,6 +185,9 @@ submodule with the `--init` option. > If `--recursive` is specified, this command will recurse into the > registered submodules, and update any nested submodules within. > > +If `--ref-format <format>` is specified, the ref storage format of newly > +cloned submodules will be set accordingly. > + > If `--filter <filter-spec>` is specified, the given partial clone filter will be > applied to the submodule. See linkgit:git-rev-list[1] for details on filter > specifications. Presumably, if the named submodule has already been initialized, we are not converting its ref backend with --ref-format=<format> option when "git submodule update --ref-format=<format>" is run. Would it make sense to say it is an error to give it without "--init", I wonder. If so, we probably would need to see if other existing options like "--filter" also need a similar sanity check, if not already done. Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format 2024-08-07 22:55 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Junio C Hamano @ 2024-08-08 7:00 ` Patrick Steinhardt 2024-08-08 16:08 ` Junio C Hamano 0 siblings, 1 reply; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 2348 bytes --] On Wed, Aug 07, 2024 at 03:55:57PM -0700, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]:: > > +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--ref-format <format>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]:: > > + > > -- > > Update the registered submodules to match what the superproject > > @@ -185,6 +185,9 @@ submodule with the `--init` option. > > If `--recursive` is specified, this command will recurse into the > > registered submodules, and update any nested submodules within. > > > > +If `--ref-format <format>` is specified, the ref storage format of newly > > +cloned submodules will be set accordingly. > > + > > If `--filter <filter-spec>` is specified, the given partial clone filter will be > > applied to the submodule. See linkgit:git-rev-list[1] for details on filter > > specifications. > > Presumably, if the named submodule has already been initialized, we > are not converting its ref backend with --ref-format=<format> option > when "git submodule update --ref-format=<format>" is run. Would it > make sense to say it is an error to give it without "--init", I > wonder. If so, we probably would need to see if other existing > options like "--filter" also need a similar sanity check, if not > already done. Well, even when "--init" was given it is not sure whether the ref storage format will actually end up being used, because that option only tells us to initialize uninitialized submodules. So if the submodule was initialized already, then the ref storage format won't be used. We probably could add such a sanity check. But as you say, other options like "--filter", "--depth", "--reference" and "--recommend-shallow" don't have that check, either, so it would feel a bit like opening a can of worms. So personally, I'd rather defer this to another day where we then implement the check for all of these options. Patrick [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format 2024-08-08 7:00 ` Patrick Steinhardt @ 2024-08-08 16:08 ` Junio C Hamano 2024-08-08 16:19 ` Patrick Steinhardt 0 siblings, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2024-08-08 16:08 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Jeppe Øland Patrick Steinhardt <ps@pks.im> writes: >> Presumably, if the named submodule has already been initialized, we >> are not converting its ref backend with --ref-format=<format> option >> when "git submodule update --ref-format=<format>" is run. Would it >> make sense to say it is an error to give it without "--init", I >> wonder. If so, we probably would need to see if other existing >> options like "--filter" also need a similar sanity check, if not >> already done. > > Well, even when "--init" was given it is not sure whether the ref > storage format will actually end up being used, because that option only > tells us to initialize uninitialized submodules. So if the submodule was > initialized already, then the ref storage format won't be used. > > We probably could add such a sanity check. But as you say, other options > like "--filter", "--depth", "--reference" and "--recommend-shallow" > don't have that check, either, so it would feel a bit like opening a can > of worms. So personally, I'd rather defer this to another day where we > then implement the check for all of these options. I am perfectly fine if we stopped by clearly documenting that these options can be no-op when the submodule repository already exists. Failing the operation in the name of "sanity check" at this point, especially for existing options, does not sound like a sensible change. Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format 2024-08-08 16:08 ` Junio C Hamano @ 2024-08-08 16:19 ` Patrick Steinhardt 2024-08-08 17:26 ` Junio C Hamano 0 siblings, 1 reply; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 16:19 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Eric Sunshine, Jeppe Øland On August 8, 2024 6:08:01 PM GMT+02:00, Junio C Hamano <gitster@pobox.com> wrote: >Patrick Steinhardt <ps@pks.im> writes: > >>> Presumably, if the named submodule has already been initialized, we >>> are not converting its ref backend with --ref-format=<format> option >>> when "git submodule update --ref-format=<format>" is run. Would it >>> make sense to say it is an error to give it without "--init", I >>> wonder. If so, we probably would need to see if other existing >>> options like "--filter" also need a similar sanity check, if not >>> already done. >> >> Well, even when "--init" was given it is not sure whether the ref >> storage format will actually end up being used, because that option only >> tells us to initialize uninitialized submodules. So if the submodule was >> initialized already, then the ref storage format won't be used. >> >> We probably could add such a sanity check. But as you say, other options >> like "--filter", "--depth", "--reference" and "--recommend-shallow" >> don't have that check, either, so it would feel a bit like opening a can >> of worms. So personally, I'd rather defer this to another day where we >> then implement the check for all of these options. > >I am perfectly fine if we stopped by clearly documenting that these >options can be no-op when the submodule repository already exists. >Failing the operation in the name of "sanity check" at this point, >especially for existing options, does not sound like a sensible >change. > >Thanks. The documentation already points this out, as we explicitly say that it only has an impact on newly cloned submodules: > If `--ref-format <format>` is specified, the ref storage format of newly cloned submodules will be set accordingly. I'm happy to make this more explicit though, in case you don't think this is sufficient. Patrick ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format 2024-08-08 16:19 ` Patrick Steinhardt @ 2024-08-08 17:26 ` Junio C Hamano 0 siblings, 0 replies; 35+ messages in thread From: Junio C Hamano @ 2024-08-08 17:26 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Jeppe Øland Patrick Steinhardt <ps@pks.im> writes: > The documentation already points this out, as we explicitly say that it > only has an impact on newly cloned submodules: Perfect. Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 2/6] builtin/clone: propagate ref storage format to submodules 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt 2024-08-07 12:43 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt @ 2024-08-07 12:43 ` Patrick Steinhardt 2024-08-07 23:07 ` Junio C Hamano 2024-08-07 12:43 ` [PATCH 3/6] refs: fix ref storage format for submodule ref stores Patrick Steinhardt ` (5 subsequent siblings) 7 siblings, 1 reply; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 12:43 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 3461 bytes --] When recursively cloning a repository with a non-default ref storage format, e.g. by passing the `--ref-format=` option, then only the top-level repository will end up will end up using that ref storage format. All recursively cloned submodules will instead use the default format. While mixed-format constellations are expected to work alright, the outcome still is somewhat surprising as we have essentially ignored the user's request. Fix this by propagating the requested ref format to cloned submodules. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- builtin/clone.c | 10 ++++++++-- t/t7424-submodule-mixed-ref-formats.sh | 23 +++++++++++++++++++++++ 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index af6017d41a..75b15b5773 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -729,7 +729,8 @@ static int git_sparse_checkout_init(const char *repo) return result; } -static int checkout(int submodule_progress, int filter_submodules) +static int checkout(int submodule_progress, int filter_submodules, + enum ref_storage_format ref_storage_format) { struct object_id oid; char *head; @@ -813,6 +814,10 @@ static int checkout(int submodule_progress, int filter_submodules) strvec_push(&cmd.args, "--no-fetch"); } + if (ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN) + strvec_pushf(&cmd.args, "--ref-format=%s", + ref_storage_format_to_name(ref_storage_format)); + if (filter_submodules && filter_options.choice) strvec_pushf(&cmd.args, "--filter=%s", expand_list_objects_filter_spec(&filter_options)); @@ -1536,7 +1541,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) return 1; junk_mode = JUNK_LEAVE_REPO; - err = checkout(submodule_progress, filter_submodules); + err = checkout(submodule_progress, filter_submodules, + ref_storage_format); free(remote_name); strbuf_release(&reflog_msg); diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh index de84007554..4e4991d04c 100755 --- a/t/t7424-submodule-mixed-ref-formats.sh +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -21,6 +21,29 @@ test_expect_success 'setup' ' git config set --global protocol.file.allow always ' +test_expect_success 'recursive clone propagates ref storage format' ' + test_when_finished "rm -rf submodule upstream downstream" && + + git init submodule && + test_commit -C submodule submodule-initial && + git init upstream && + git -C upstream submodule add "file://$(pwd)/submodule" && + git -C upstream commit -am "add submodule" && + + # The upstream repository and its submodule should be using the default + # ref format. + test_ref_format upstream "$GIT_DEFAULT_REF_FORMAT" && + test_ref_format upstream/submodule "$GIT_DEFAULT_REF_FORMAT" && + + # The cloned repositories should use the other ref format that we have + # specified via `--ref-format`. The option should propagate to cloned + # submodules. + git clone --ref-format=$OTHER_FORMAT --recurse-submodules \ + upstream downstream && + test_ref_format downstream "$OTHER_FORMAT" && + test_ref_format downstream/submodule "$OTHER_FORMAT" +' + test_expect_success 'clone submodules with different ref storage format' ' test_when_finished "rm -rf submodule upstream downstream" && -- 2.46.0.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 2/6] builtin/clone: propagate ref storage format to submodules 2024-08-07 12:43 ` [PATCH 2/6] builtin/clone: propagate ref storage format to submodules Patrick Steinhardt @ 2024-08-07 23:07 ` Junio C Hamano 0 siblings, 0 replies; 35+ messages in thread From: Junio C Hamano @ 2024-08-07 23:07 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Jeppe Øland Patrick Steinhardt <ps@pks.im> writes: > When recursively cloning a repository with a non-default ref storage > format, e.g. by passing the `--ref-format=` option, then only the > top-level repository will end up will end up using that ref storage > format. All recursively cloned submodules will instead use the default > format. While mixed-format constellations are expected to work alright, > the outcome still is somewhat surprising as we have essentially ignored > the user's request. > > Fix this by propagating the requested ref format to cloned submodules. Makes sense. ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 3/6] refs: fix ref storage format for submodule ref stores 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt 2024-08-07 12:43 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt 2024-08-07 12:43 ` [PATCH 2/6] builtin/clone: propagate ref storage format to submodules Patrick Steinhardt @ 2024-08-07 12:43 ` Patrick Steinhardt 2024-08-07 12:44 ` [PATCH 4/6] submodule: fix leaking fetch tasks Patrick Steinhardt ` (4 subsequent siblings) 7 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 12:43 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 4540 bytes --] When opening a submodule ref storage we accidentally use the ref storage format of the owning repository, not of the submodule repository. As submodules may have a different storage format than their parent repo this can lead to bugs when trying to access the submodule ref storage from the parent repository. One such bug was reported when performing a recursive pull with mixed ref stores, which fails with: $ git pull --recursive fatal: Unable to find current revision in submodule path 'path/to/sub' Fix the bug by using the submodule repository's ref storage format instead. Note that only the second added test fails without this fix. The other one is included regardless as it exercises other parts where we might end up accessing submodule ref stores. Reported-by: Jeppe Øland <joland@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> --- refs.c | 2 +- t/t7424-submodule-mixed-ref-formats.sh | 57 +++++++++++++++++++++++++- 2 files changed, 57 insertions(+), 2 deletions(-) diff --git a/refs.c b/refs.c index 915aeb4d1d..e4b1f4f8b1 100644 --- a/refs.c +++ b/refs.c @@ -2011,7 +2011,7 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo, free(subrepo); goto done; } - refs = ref_store_init(subrepo, the_repository->ref_storage_format, + refs = ref_store_init(subrepo, subrepo->ref_storage_format, submodule_sb.buf, REF_STORE_READ | REF_STORE_ODB); register_ref_store_map(&repo->submodule_ref_stores, "submodule", diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh index 4e4991d04c..998a5d82e9 100755 --- a/t/t7424-submodule-mixed-ref-formats.sh +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -18,7 +18,10 @@ do fi test_expect_success 'setup' ' - git config set --global protocol.file.allow always + git config set --global protocol.file.allow always && + # Some tests migrate the ref storage format, which does not work with + # reflogs at the time of writing these tests. + git config set --global core.logAllRefUpdates false ' test_expect_success 'recursive clone propagates ref storage format' ' @@ -59,6 +62,58 @@ test_expect_success 'clone submodules with different ref storage format' ' test_ref_format downstream/submodule "$OTHER_FORMAT" ' +test_expect_success 'status with mixed submodule ref storages' ' + test_when_finished "rm -rf submodule main" && + + git init submodule && + test_commit -C submodule submodule-initial && + git init main && + git -C main submodule add "file://$(pwd)/submodule" && + git -C main commit -m "add submodule" && + git -C main/submodule refs migrate --ref-format=$OTHER_FORMAT && + + # The main repository should use the default ref format now, whereas + # the submodule should use the other format. + test_ref_format main "$GIT_DEFAULT_REF_FORMAT" && + test_ref_format main/submodule "$OTHER_FORMAT" && + + cat >expect <<-EOF && + $(git -C main/submodule rev-parse HEAD) submodule (submodule-initial) + EOF + git -C main submodule status >actual && + test_cmp expect actual +' + +test_expect_success 'recursive pull with mixed formats' ' + test_when_finished "rm -rf submodule upstream downstream" && + + # Set up the initial structure with an upstream repository that has a + # submodule, as well as a downstream clone of the upstream repository. + git init submodule && + test_commit -C submodule submodule-initial && + git init upstream && + git -C upstream submodule add "file://$(pwd)/submodule" && + git -C upstream commit -m "upstream submodule" && + + # Clone the upstream repository such that the main repo and its + # submodules have different formats. + git clone --no-recurse-submodules "file://$(pwd)/upstream" downstream && + git -C downstream submodule update --init --ref-format=$OTHER_FORMAT && + test_ref_format downstream "$GIT_DEFAULT_REF_FORMAT" && + test_ref_format downstream/submodule "$OTHER_FORMAT" && + + # Update the upstream submodule as well as the owning repository such + # that we can do a recursive pull. + test_commit -C submodule submodule-update && + git -C upstream/submodule pull && + git -C upstream commit -am "update the submodule" && + + git -C downstream pull --recurse-submodules && + git -C upstream/submodule rev-parse HEAD >expect && + git -C downstream/submodule rev-parse HEAD >actual && + test_cmp expect actual +' + done test_done -- 2.46.0.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 4/6] submodule: fix leaking fetch tasks 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt ` (2 preceding siblings ...) 2024-08-07 12:43 ` [PATCH 3/6] refs: fix ref storage format for submodule ref stores Patrick Steinhardt @ 2024-08-07 12:44 ` Patrick Steinhardt 2024-08-07 12:44 ` [PATCH 5/6] submodule: fix leaking seen submodule names Patrick Steinhardt ` (3 subsequent siblings) 7 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 12:44 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 2557 bytes --] When done with a fetch task used for parallel fetches of submodules, we need to both call `fetch_task_release()` to release the task's contents and `free()` to release the task itself. Most sites do this already, but some only call `fetch_task_release()` and thus leak memory. While we could trivially fix this by adding the two missing calls to free(3P), the result would be that we always call both functions. Let's thus refactor the code such that `fetch_task_release()` also frees the structure itself. Rename it to `fetch_task_free()` accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- submodule.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/submodule.c b/submodule.c index ab99a30253..f027a6455e 100644 --- a/submodule.c +++ b/submodule.c @@ -1496,7 +1496,7 @@ static const struct submodule *get_non_gitmodules_submodule(const char *path) return (const struct submodule *) ret; } -static void fetch_task_release(struct fetch_task *p) +static void fetch_task_free(struct fetch_task *p) { if (p->free_sub) free((void*)p->sub); @@ -1508,6 +1508,7 @@ static void fetch_task_release(struct fetch_task *p) FREE_AND_NULL(p->repo); strvec_clear(&p->git_args); + free(p); } static struct repository *get_submodule_repo_for(struct repository *r, @@ -1576,8 +1577,7 @@ static struct fetch_task *fetch_task_create(struct submodule_parallel_fetch *spf return task; cleanup: - fetch_task_release(task); - free(task); + fetch_task_free(task); return NULL; } @@ -1607,8 +1607,7 @@ get_fetch_task_from_index(struct submodule_parallel_fetch *spf, } else { struct strbuf empty_submodule_path = STRBUF_INIT; - fetch_task_release(task); - free(task); + fetch_task_free(task); /* * An empty directory is normal, @@ -1654,8 +1653,7 @@ get_fetch_task_from_changed(struct submodule_parallel_fetch *spf, cs_data->path, repo_find_unique_abbrev(the_repository, cs_data->super_oid, DEFAULT_ABBREV)); - fetch_task_release(task); - free(task); + fetch_task_free(task); continue; } @@ -1763,7 +1761,7 @@ static int fetch_start_failure(struct strbuf *err UNUSED, spf->result = 1; - fetch_task_release(task); + fetch_task_free(task); return 0; } @@ -1828,8 +1826,7 @@ static int fetch_finish(int retvalue, struct strbuf *err UNUSED, } out: - fetch_task_release(task); - + fetch_task_free(task); return 0; } -- 2.46.0.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 5/6] submodule: fix leaking seen submodule names 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt ` (3 preceding siblings ...) 2024-08-07 12:44 ` [PATCH 4/6] submodule: fix leaking fetch tasks Patrick Steinhardt @ 2024-08-07 12:44 ` Patrick Steinhardt 2024-08-07 12:44 ` [PATCH 6/6] object: fix leaking packfiles when closing object store Patrick Steinhardt ` (2 subsequent siblings) 7 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 12:44 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 1743 bytes --] We keep track of submodules we have already seen via a string map such that we don't process the same submodule twice. We never free that map though, causing a memory leak. Fix this leak by clearing the map. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- submodule.c | 1 + t/t5572-pull-submodule.sh | 1 + t/t7418-submodule-sparse-gitmodules.sh | 1 + 3 files changed, 3 insertions(+) diff --git a/submodule.c b/submodule.c index f027a6455e..13b8f8c19c 100644 --- a/submodule.c +++ b/submodule.c @@ -1880,6 +1880,7 @@ int fetch_submodules(struct repository *r, strvec_clear(&spf.args); out: free_submodules_data(&spf.changed_submodule_names); + string_list_clear(&spf.seen_submodule_names, 0); return spf.result; } diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh index 51744521f7..916e58c166 100755 --- a/t/t5572-pull-submodule.sh +++ b/t/t5572-pull-submodule.sh @@ -5,6 +5,7 @@ test_description='pull can handle submodules' GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1 export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh . "$TEST_DIRECTORY"/lib-submodule-update.sh diff --git a/t/t7418-submodule-sparse-gitmodules.sh b/t/t7418-submodule-sparse-gitmodules.sh index dde11ecce8..e1d9bf2ee3 100755 --- a/t/t7418-submodule-sparse-gitmodules.sh +++ b/t/t7418-submodule-sparse-gitmodules.sh @@ -15,6 +15,7 @@ also by committing .gitmodules and then just removing it from the filesystem. GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1 export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_expect_success 'setup' ' -- 2.46.0.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 6/6] object: fix leaking packfiles when closing object store 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt ` (4 preceding siblings ...) 2024-08-07 12:44 ` [PATCH 5/6] submodule: fix leaking seen submodule names Patrick Steinhardt @ 2024-08-07 12:44 ` Patrick Steinhardt 2024-08-07 13:18 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt 2024-08-08 1:09 ` Junio C Hamano 7 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 12:44 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 1968 bytes --] When calling `raw_object_store_clear()`, we close and free several resources associated with the object store. Part of that is to close and free all the packfiles, which is handled by `close_object_store()`. That function really only ends up closing the packfiles though, but it doesn't free them. And in fact it can't, as that function is being called via `run_command()` when `close_object_store = 1`, which is done e.g. when we execute git-maintenance(1). At that point, other structures may still have references on those packfiles, and thus we cannot free them here. So while it is in fact intentional that we really only close them, the result is a memory leak because `raw_object_store_clear()` does not free them, either. Fix the leak by freeing the packfiles in `raw_object_store_clear()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- object.c | 9 +++++++++ t/t7424-submodule-mixed-ref-formats.sh | 1 + 2 files changed, 10 insertions(+) diff --git a/object.c b/object.c index 0c0fcb76c0..d756c7f2ea 100644 --- a/object.c +++ b/object.c @@ -614,6 +614,15 @@ void raw_object_store_clear(struct raw_object_store *o) INIT_LIST_HEAD(&o->packed_git_mru); close_object_store(o); + + /* + * `close_object_store()` only closes the packfiles, but doesn't free + * them. We thus have to do this manually. + */ + for (struct packed_git *p = o->packed_git, *next; p; p = next) { + next = p->next; + free(p); + } o->packed_git = NULL; hashmap_clear(&o->pack_map); diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh index 998a5d82e9..812496dc3b 100755 --- a/t/t7424-submodule-mixed-ref-formats.sh +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -2,6 +2,7 @@ test_description='submodules handle mixed ref storage formats' +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_ref_format () { -- 2.46.0.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 0/6] Improvements for ref storage formats with submodules 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt ` (5 preceding siblings ...) 2024-08-07 12:44 ` [PATCH 6/6] object: fix leaking packfiles when closing object store Patrick Steinhardt @ 2024-08-07 13:18 ` Patrick Steinhardt 2024-08-08 1:09 ` Junio C Hamano 7 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-07 13:18 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 1208 bytes --] On Wed, Aug 07, 2024 at 02:43:44PM +0200, Patrick Steinhardt wrote: > Hi, > > this small patch series contains some improvements for ref storage > formats and their interaction with submodules. Notably: > > - Use the correct format for submodules in situations where the parent > repository uses a different ref storage format than the submodule. > > - Wire up `--ref-format=` for git-submodule(1), such that users can > explicitly use a different ref format for their submodules. > > - Propagate the `--ref-format=` flag of git-clone(1) into submodules > when using `--recursive`. > > The first three patches implement improvements for the above three > issues and introduce tests. The test did hit some memory leaks, which > get fixed by patches 3 to 6 such that the new test can be marked as leak > free. Just got a shower thought that this isn't quite there yet. Most importantly, I think we should default to the ref storage format of the parent submodule both when adding submodules and when cloning a submodule into a preexisting repository. I'll wait until tomorrow before addressing this though to ideally get some feedback until then. Patrick [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 0/6] Improvements for ref storage formats with submodules 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt ` (6 preceding siblings ...) 2024-08-07 13:18 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt @ 2024-08-08 1:09 ` Junio C Hamano 2024-08-08 7:00 ` Patrick Steinhardt 7 siblings, 1 reply; 35+ messages in thread From: Junio C Hamano @ 2024-08-08 1:09 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Jeppe Øland Patrick Steinhardt <ps@pks.im> writes: > this small patch series contains some improvements for ref storage > formats and their interaction with submodules. Notably: > > - Use the correct format for submodules in situations where the parent > repository uses a different ref storage format than the submodule. > > - Wire up `--ref-format=` for git-submodule(1), such that users can > explicitly use a different ref format for their submodules. > > - Propagate the `--ref-format=` flag of git-clone(1) into submodules > when using `--recursive`. > > The first three patches implement improvements for the above three > issues and introduce tests. The test did hit some memory leaks, which > get fixed by patches 3 to 6 such that the new test can be marked as leak > free. Nicely done. I've read through the series and did not find any design decisions in the series questionable. As to propagating the choice of ref backend down from the superproject to the submodule, I am not sure if it matters all that much, so I view it as a relative low priority. If somebody wants to use a specific ref backend for their repositories, then they want all their "git init" (or init_db()) to use that ref backend, and would arrange configuration to make it so. When "git submodule init" internally calls "git init" (or init_db()), as long as we make sure such a choice would propagate to the new repository that happens to the one used for that submodule, we do not necessarily need to have a custom logic that says "ah, the user did not say anything about the ref backend, so let me peek the one used in the superproject and propagate it down". Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 0/6] Improvements for ref storage formats with submodules 2024-08-08 1:09 ` Junio C Hamano @ 2024-08-08 7:00 ` Patrick Steinhardt 0 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Eric Sunshine, Jeppe Øland [-- Attachment #1: Type: text/plain, Size: 996 bytes --] On Wed, Aug 07, 2024 at 06:09:04PM -0700, Junio C Hamano wrote: > As to propagating the choice of ref backend down from the > superproject to the submodule, I am not sure if it matters all that > much, so I view it as a relative low priority. If somebody wants to > use a specific ref backend for their repositories, then they want > all their "git init" (or init_db()) to use that ref backend, and > would arrange configuration to make it so. When "git submodule > init" internally calls "git init" (or init_db()), as long as we make > sure such a choice would propagate to the new repository that > happens to the one used for that submodule, we do not necessarily > need to have a custom logic that says "ah, the user did not say > anything about the ref backend, so let me peek the one used in the > superproject and propagate it down". You know, I'll just skip this for now. It feels somewhat orthogonal to the changes in this series, and we're not closing the door on anything. Patrick [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v2 0/8] Improvements for ref storage formats with submodules 2024-08-05 11:31 2.46 submodule breakage Jeppe Øland 2024-08-06 13:18 ` Jeppe Øland 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt @ 2024-08-08 7:35 ` Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 1/8] git-submodule.sh: break overly long command lines Patrick Steinhardt ` (8 more replies) 2 siblings, 9 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:35 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 4472 bytes --] Hi, this is the second version of my patch series that aims to improve interaction of ref storage formats with submodules. Changes compared to v1: - I noticed that `git submodule add` also used to fail when used with a preexisting repository that uses a different ref format. - Add a preparatory refactoring that wraps overly long command lines in "git-submodule.sh". - Wire up support for `--ref-format=` in `git submodule add`. Notably not included is a change to make `git submodule add/clone` honor the ref storage format of the parent repository. I think it is somewhat orthogonal to the changes I do in this series, and it is not entirely obvious whether it is a good idea or not. We can still do this in a future patch series, if we ever hear good arguments for it. Thanks! Patrick Steinhardt (8): git-submodule.sh: break overly long command lines builtin/submodule: allow cloning with different ref storage format builtin/clone: propagate ref storage format to submodules refs: fix ref storage format for submodule ref stores builtin/submodule: allow "add" to use different ref storage format submodule: fix leaking fetch tasks submodule: fix leaking seen submodule names object: fix leaking packfiles when closing object store Documentation/git-submodule.txt | 10 +- builtin/clone.c | 10 +- builtin/submodule--helper.c | 46 +++++++- git-submodule.sh | 82 ++++++++++++-- object.c | 9 ++ refs.c | 2 +- submodule.c | 18 ++-- t/t5572-pull-submodule.sh | 1 + t/t7418-submodule-sparse-gitmodules.sh | 1 + t/t7424-submodule-mixed-ref-formats.sh | 144 +++++++++++++++++++++++++ 10 files changed, 298 insertions(+), 25 deletions(-) create mode 100755 t/t7424-submodule-mixed-ref-formats.sh Range-diff against v1: -: ---------- > 1: 6513c6b17d git-submodule.sh: break overly long command lines 1: a450759bd1 = 2: e6cda43878 builtin/submodule: allow cloning with different ref storage format 2: e5923c0b33 = 3: ed314f5333 builtin/clone: propagate ref storage format to submodules 3: aaff9134ed ! 4: f13356581e refs: fix ref storage format for submodule ref stores @@ Commit message $ git pull --recursive fatal: Unable to find current revision in submodule path 'path/to/sub' - Fix the bug by using the submodule repository's ref storage format - instead. + The same issue occurs when adding a repository contained in the working + tree with a different ref storage format via `git submodule add`. - Note that only the second added test fails without this fix. The other - one is included regardless as it exercises other parts where we might - end up accessing submodule ref stores. + Fix the bug by using the submodule repository's ref storage format + instead and add some tests. Note that the test for `git submodule + status` was included as a precaution, only. The command worked alright + even without the bugfix. Reported-by: Jeppe Øland <joland@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> @@ t/t7424-submodule-mixed-ref-formats.sh: do + # Some tests migrate the ref storage format, which does not work with + # reflogs at the time of writing these tests. + git config set --global core.logAllRefUpdates false ++' ++ ++test_expect_success 'add existing repository with different ref storage format' ' ++ test_when_finished "rm -rf parent" && ++ ++ git init parent && ++ ( ++ cd parent && ++ test_commit parent && ++ git init --ref-format=$OTHER_FORMAT submodule && ++ test_commit -C submodule submodule && ++ git submodule add ./submodule ++ ) ' test_expect_success 'recursive clone propagates ref storage format' ' -: ---------- > 5: 4ce17e44a1 builtin/submodule: allow "add" to use different ref storage format 4: 8f8371c18a = 6: d92770290f submodule: fix leaking fetch tasks 5: 732142aaa6 = 7: e9421189ca submodule: fix leaking seen submodule names 6: 8dc7cc76d5 = 8: d05737c75f object: fix leaking packfiles when closing object store -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v2 1/8] git-submodule.sh: break overly long command lines 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt @ 2024-08-08 7:35 ` Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 2/8] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt ` (7 subsequent siblings) 8 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:35 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 4281 bytes --] For most of the subcommands of git-submodule(1), we end up passing a bunch of arguments to the submodule helper. This quickly leads to overly long lines, where it becomes hard to spot what has changed when one needs to modify them. Break up these lines into one argument per line, similarly to how it is done for the "clone" subcommand already. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- git-submodule.sh | 64 +++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 55 insertions(+), 9 deletions(-) diff --git a/git-submodule.sh b/git-submodule.sh index 7f9582d923..fd588b1864 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -129,7 +129,17 @@ cmd_add() usage fi - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper add ${quiet:+--quiet} ${force:+--force} ${progress:+"--progress"} ${branch:+--branch "$branch"} ${reference_path:+--reference "$reference_path"} ${dissociate:+--dissociate} ${custom_name:+--name "$custom_name"} ${depth:+"$depth"} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper add \ + ${quiet:+--quiet} \ + ${force:+--force} \ + ${progress:+"--progress"} \ + ${branch:+--branch "$branch"} \ + ${reference_path:+--reference "$reference_path"} \ + ${dissociate:+--dissociate} \ + ${custom_name:+--name "$custom_name"} \ + ${depth:+"$depth"} \ + -- \ + "$@" } # @@ -160,7 +170,11 @@ cmd_foreach() shift done - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper foreach ${quiet:+--quiet} ${recursive:+--recursive} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper foreach \ + ${quiet:+--quiet} \ + ${recursive:+--recursive} \ + -- \ + "$@" } # @@ -191,7 +205,10 @@ cmd_init() shift done - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper init ${quiet:+--quiet} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper init \ + ${quiet:+--quiet} \ + -- \ + "$@" } # @@ -227,7 +244,12 @@ cmd_deinit() shift done - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper deinit ${quiet:+--quiet} ${force:+--force} ${deinit_all:+--all} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper deinit \ + ${quiet:+--quiet} \ + ${force:+--force} \ + ${deinit_all:+--all} \ + -- \ + "$@" } # @@ -399,7 +421,12 @@ cmd_set_branch() { shift done - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper set-branch ${quiet:+--quiet} ${branch:+--branch "$branch"} ${default:+--default} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper set-branch \ + ${quiet:+--quiet} \ + ${branch:+--branch "$branch"} \ + ${default:+--default} \ + -- \ + "$@" } # @@ -428,7 +455,10 @@ cmd_set_url() { shift done - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper set-url ${quiet:+--quiet} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper set-url \ + ${quiet:+--quiet} \ + -- \ + "$@" } # @@ -480,7 +510,13 @@ cmd_summary() { shift done - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper summary ${files:+--files} ${cached:+--cached} ${for_status:+--for-status} ${summary_limit:+-n $summary_limit} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper summary \ + ${files:+--files} \ + ${cached:+--cached} \ + ${for_status:+--for-status} \ + ${summary_limit:+-n $summary_limit} \ + -- \ + "$@" } # # List all submodules, prefixed with: @@ -521,8 +557,14 @@ cmd_status() shift done - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper status ${quiet:+--quiet} ${cached:+--cached} ${recursive:+--recursive} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper status \ + ${quiet:+--quiet} \ + ${cached:+--cached} \ + ${recursive:+--recursive} \ + -- \ + "$@" } + # # Sync remote urls for submodules # This makes the value for remote.$remote.url match the value @@ -554,7 +596,11 @@ cmd_sync() esac done - git ${wt_prefix:+-C "$wt_prefix"} submodule--helper sync ${quiet:+--quiet} ${recursive:+--recursive} -- "$@" + git ${wt_prefix:+-C "$wt_prefix"} submodule--helper sync \ + ${quiet:+--quiet} \ + ${recursive:+--recursive} \ + -- \ + "$@" } cmd_absorbgitdirs() -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v2 2/8] builtin/submodule: allow cloning with different ref storage format 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 1/8] git-submodule.sh: break overly long command lines Patrick Steinhardt @ 2024-08-08 7:35 ` Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 3/8] builtin/clone: propagate ref storage format to submodules Patrick Steinhardt ` (6 subsequent siblings) 8 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:35 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 10364 bytes --] As submodules are proper self-contained repositories, it is perfectly valid for them to have a different ref storage format than their parent repository. There is no obvious way for users to ask for the ref storage format when initializing submodules though. Whether the setup of such mixed-ref-storage-format constellations is all that useful remains to be seen. But there is no good reason to not expose such an option, and we will require it in a subsequent patch. Introduce a new `--ref-format=` option for git-submodule(1) that allows the user to pick the ref storage format. This option will also be used in a subsequent commit, where we start to propagate the same flag from git-clone(1) to cloning submodules with the `--recursive` switch. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- Documentation/git-submodule.txt | 5 +++- builtin/submodule--helper.c | 30 +++++++++++++++++++ git-submodule.sh | 9 ++++++ t/t7424-submodule-mixed-ref-formats.sh | 41 ++++++++++++++++++++++++++ 4 files changed, 84 insertions(+), 1 deletion(-) create mode 100755 t/t7424-submodule-mixed-ref-formats.sh diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index ca0347a37b..73ef8b9696 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -136,7 +136,7 @@ If you really want to remove a submodule from the repository and commit that use linkgit:git-rm[1] instead. See linkgit:gitsubmodules[7] for removal options. -update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]:: +update [--init] [--remote] [-N|--no-fetch] [--[no-]recommend-shallow] [-f|--force] [--checkout|--rebase|--merge] [--reference <repository>] [--ref-format <format>] [--depth <depth>] [--recursive] [--jobs <n>] [--[no-]single-branch] [--filter <filter-spec>] [--] [<path>...]:: + -- Update the registered submodules to match what the superproject @@ -185,6 +185,9 @@ submodule with the `--init` option. If `--recursive` is specified, this command will recurse into the registered submodules, and update any nested submodules within. +If `--ref-format <format>` is specified, the ref storage format of newly +cloned submodules will be set accordingly. + If `--filter <filter-spec>` is specified, the given partial clone filter will be applied to the submodule. See linkgit:git-rev-list[1] for details on filter specifications. diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index f1218a1995..42a36bc2f7 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -1532,6 +1532,7 @@ struct module_clone_data { const char *url; const char *depth; struct list_objects_filter_options *filter_options; + enum ref_storage_format ref_storage_format; unsigned int quiet: 1; unsigned int progress: 1; unsigned int dissociate: 1; @@ -1540,6 +1541,7 @@ struct module_clone_data { }; #define MODULE_CLONE_DATA_INIT { \ .single_branch = -1, \ + .ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN, \ } struct submodule_alternate_setup { @@ -1738,6 +1740,9 @@ static int clone_submodule(const struct module_clone_data *clone_data, strvec_pushl(&cp.args, "--reference", item->string, NULL); } + if (clone_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN) + strvec_pushf(&cp.args, "--ref-format=%s", + ref_storage_format_to_name(clone_data->ref_storage_format)); if (clone_data->dissociate) strvec_push(&cp.args, "--dissociate"); if (sm_gitdir && *sm_gitdir) @@ -1832,6 +1837,7 @@ static int module_clone(int argc, const char **argv, const char *prefix) struct string_list reference = STRING_LIST_INIT_NODUP; struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT; + const char *ref_storage_format = NULL; struct option module_clone_options[] = { OPT_STRING(0, "prefix", &clone_data.prefix, @@ -1849,6 +1855,8 @@ static int module_clone(int argc, const char **argv, const char *prefix) OPT_STRING_LIST(0, "reference", &reference, N_("repo"), N_("reference repository")), + OPT_STRING(0, "ref-format", &ref_storage_format, N_("format"), + N_("specify the reference format to use")), OPT_BOOL(0, "dissociate", &dissociate, N_("use --reference only while cloning")), OPT_STRING(0, "depth", &clone_data.depth, @@ -1875,6 +1883,11 @@ static int module_clone(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, module_clone_options, git_submodule_helper_usage, 0); + if (ref_storage_format) { + clone_data.ref_storage_format = ref_storage_format_by_name(ref_storage_format); + if (clone_data.ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN) + die(_("unknown ref storage format '%s'"), ref_storage_format); + } clone_data.dissociate = !!dissociate; clone_data.quiet = !!quiet; clone_data.progress = !!progress; @@ -1974,6 +1987,7 @@ struct update_data { struct submodule_update_strategy update_strategy; struct list_objects_filter_options *filter_options; struct module_list list; + enum ref_storage_format ref_storage_format; int depth; int max_jobs; int single_branch; @@ -1997,6 +2011,7 @@ struct update_data { #define UPDATE_DATA_INIT { \ .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT, \ .list = MODULE_LIST_INIT, \ + .ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN, \ .recommend_shallow = -1, \ .references = STRING_LIST_INIT_DUP, \ .single_branch = -1, \ @@ -2132,6 +2147,9 @@ static int prepare_to_clone_next_submodule(const struct cache_entry *ce, expand_list_objects_filter_spec(suc->update_data->filter_options)); if (suc->update_data->require_init) strvec_push(&child->args, "--require-init"); + if (suc->update_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN) + strvec_pushf(&child->args, "--ref-format=%s", + ref_storage_format_to_name(suc->update_data->ref_storage_format)); strvec_pushl(&child->args, "--path", sub->path, NULL); strvec_pushl(&child->args, "--name", sub->name, NULL); strvec_pushl(&child->args, "--url", url, NULL); @@ -2562,6 +2580,9 @@ static void update_data_to_args(const struct update_data *update_data, for_each_string_list_item(item, &update_data->references) strvec_pushl(args, "--reference", item->string, NULL); } + if (update_data->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN) + strvec_pushf(args, "--ref-format=%s", + ref_storage_format_to_name(update_data->ref_storage_format)); if (update_data->filter_options && update_data->filter_options->choice) strvec_pushf(args, "--filter=%s", expand_list_objects_filter_spec( @@ -2737,6 +2758,7 @@ static int module_update(int argc, const char **argv, const char *prefix) struct update_data opt = UPDATE_DATA_INIT; struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT; + const char *ref_storage_format = NULL; int ret; struct option module_update_options[] = { OPT__SUPER_PREFIX(&opt.super_prefix), @@ -2760,6 +2782,8 @@ static int module_update(int argc, const char **argv, const char *prefix) SM_UPDATE_REBASE), OPT_STRING_LIST(0, "reference", &opt.references, N_("repo"), N_("reference repository")), + OPT_STRING(0, "ref-format", &ref_storage_format, N_("format"), + N_("specify the reference format to use")), OPT_BOOL(0, "dissociate", &opt.dissociate, N_("use --reference only while cloning")), OPT_INTEGER(0, "depth", &opt.depth, @@ -2803,6 +2827,12 @@ static int module_update(int argc, const char **argv, const char *prefix) module_update_options); } + if (ref_storage_format) { + opt.ref_storage_format = ref_storage_format_by_name(ref_storage_format); + if (opt.ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN) + die(_("unknown ref storage format '%s'"), ref_storage_format); + } + opt.filter_options = &filter_options; opt.prefix = prefix; diff --git a/git-submodule.sh b/git-submodule.sh index fd588b1864..448d58b18b 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -290,6 +290,14 @@ cmd_update() -r|--rebase) rebase=1 ;; + --ref-format) + case "$2" in '') usage ;; esac + ref_format="--ref-format=$2" + shift + ;; + --ref-format=*) + ref_format="$1" + ;; --reference) case "$2" in '') usage ;; esac reference="--reference=$2" @@ -371,6 +379,7 @@ cmd_update() ${rebase:+--rebase} \ ${merge:+--merge} \ ${checkout:+--checkout} \ + ${ref_format:+"$ref_format"} \ ${reference:+"$reference"} \ ${dissociate:+"--dissociate"} \ ${depth:+"$depth"} \ diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh new file mode 100755 index 0000000000..de84007554 --- /dev/null +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -0,0 +1,41 @@ +#!/bin/sh + +test_description='submodules handle mixed ref storage formats' + +. ./test-lib.sh + +test_ref_format () { + echo "$2" >expect && + git -C "$1" rev-parse --show-ref-format >actual && + test_cmp expect actual +} + +for OTHER_FORMAT in files reftable +do + if test "$OTHER_FORMAT" = "$GIT_DEFAULT_REF_FORMAT" + then + continue + fi + +test_expect_success 'setup' ' + git config set --global protocol.file.allow always +' + +test_expect_success 'clone submodules with different ref storage format' ' + test_when_finished "rm -rf submodule upstream downstream" && + + git init submodule && + test_commit -C submodule submodule-initial && + git init upstream && + git -C upstream submodule add "file://$(pwd)/submodule" && + git -C upstream commit -m "upstream submodule" && + + git clone --no-recurse-submodules "file://$(pwd)/upstream" downstream && + test_ref_format downstream "$GIT_DEFAULT_REF_FORMAT" && + git -C downstream submodule update --init --ref-format=$OTHER_FORMAT && + test_ref_format downstream/submodule "$OTHER_FORMAT" +' + +done + +test_done -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v2 3/8] builtin/clone: propagate ref storage format to submodules 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 1/8] git-submodule.sh: break overly long command lines Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 2/8] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt @ 2024-08-08 7:35 ` Patrick Steinhardt 2024-08-08 8:03 ` Kristoffer Haugsbakk 2024-08-08 7:35 ` [PATCH v2 4/8] refs: fix ref storage format for submodule ref stores Patrick Steinhardt ` (5 subsequent siblings) 8 siblings, 1 reply; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:35 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 3476 bytes --] When recursively cloning a repository with a non-default ref storage format, e.g. by passing the `--ref-format=` option, then only the top-level repository will end up will end up using that ref storage format. All recursively cloned submodules will instead use the default format. While mixed-format constellations are expected to work alright, the outcome still is somewhat surprising as we have essentially ignored the user's request. Fix this by propagating the requested ref format to cloned submodules. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- builtin/clone.c | 10 ++++++++-- t/t7424-submodule-mixed-ref-formats.sh | 23 +++++++++++++++++++++++ 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index af6017d41a..75b15b5773 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -729,7 +729,8 @@ static int git_sparse_checkout_init(const char *repo) return result; } -static int checkout(int submodule_progress, int filter_submodules) +static int checkout(int submodule_progress, int filter_submodules, + enum ref_storage_format ref_storage_format) { struct object_id oid; char *head; @@ -813,6 +814,10 @@ static int checkout(int submodule_progress, int filter_submodules) strvec_push(&cmd.args, "--no-fetch"); } + if (ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN) + strvec_pushf(&cmd.args, "--ref-format=%s", + ref_storage_format_to_name(ref_storage_format)); + if (filter_submodules && filter_options.choice) strvec_pushf(&cmd.args, "--filter=%s", expand_list_objects_filter_spec(&filter_options)); @@ -1536,7 +1541,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) return 1; junk_mode = JUNK_LEAVE_REPO; - err = checkout(submodule_progress, filter_submodules); + err = checkout(submodule_progress, filter_submodules, + ref_storage_format); free(remote_name); strbuf_release(&reflog_msg); diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh index de84007554..4e4991d04c 100755 --- a/t/t7424-submodule-mixed-ref-formats.sh +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -21,6 +21,29 @@ test_expect_success 'setup' ' git config set --global protocol.file.allow always ' +test_expect_success 'recursive clone propagates ref storage format' ' + test_when_finished "rm -rf submodule upstream downstream" && + + git init submodule && + test_commit -C submodule submodule-initial && + git init upstream && + git -C upstream submodule add "file://$(pwd)/submodule" && + git -C upstream commit -am "add submodule" && + + # The upstream repository and its submodule should be using the default + # ref format. + test_ref_format upstream "$GIT_DEFAULT_REF_FORMAT" && + test_ref_format upstream/submodule "$GIT_DEFAULT_REF_FORMAT" && + + # The cloned repositories should use the other ref format that we have + # specified via `--ref-format`. The option should propagate to cloned + # submodules. + git clone --ref-format=$OTHER_FORMAT --recurse-submodules \ + upstream downstream && + test_ref_format downstream "$OTHER_FORMAT" && + test_ref_format downstream/submodule "$OTHER_FORMAT" +' + test_expect_success 'clone submodules with different ref storage format' ' test_when_finished "rm -rf submodule upstream downstream" && -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v2 3/8] builtin/clone: propagate ref storage format to submodules 2024-08-08 7:35 ` [PATCH v2 3/8] builtin/clone: propagate ref storage format to submodules Patrick Steinhardt @ 2024-08-08 8:03 ` Kristoffer Haugsbakk 2024-08-08 13:29 ` Patrick Steinhardt 0 siblings, 1 reply; 35+ messages in thread From: Kristoffer Haugsbakk @ 2024-08-08 8:03 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano, git Morning On Thu, Aug 8, 2024, at 09:35, Patrick Steinhardt wrote: > When recursively cloning a repository with a non-default ref storage > format, e.g. by passing the `--ref-format=` option, then only the > top-level repository will end up will end up Double “will end up”. -- Kristoffer Haugsbakk ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v2 3/8] builtin/clone: propagate ref storage format to submodules 2024-08-08 8:03 ` Kristoffer Haugsbakk @ 2024-08-08 13:29 ` Patrick Steinhardt 0 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 13:29 UTC (permalink / raw) To: Kristoffer Haugsbakk; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano, git [-- Attachment #1: Type: text/plain, Size: 478 bytes --] On Thu, Aug 08, 2024 at 10:03:57AM +0200, Kristoffer Haugsbakk wrote: > Morning > > On Thu, Aug 8, 2024, at 09:35, Patrick Steinhardt wrote: > > When recursively cloning a repository with a non-default ref storage > > format, e.g. by passing the `--ref-format=` option, then only the > > top-level repository will end up will end up > > Double “will end up”. Thanks. I've fixed this locally, but will wait for additional comments to come in first. Patrick [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v2 4/8] refs: fix ref storage format for submodule ref stores 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt ` (2 preceding siblings ...) 2024-08-08 7:35 ` [PATCH v2 3/8] builtin/clone: propagate ref storage format to submodules Patrick Steinhardt @ 2024-08-08 7:35 ` Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 5/8] builtin/submodule: allow "add" to use different ref storage format Patrick Steinhardt ` (4 subsequent siblings) 8 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:35 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 5006 bytes --] When opening a submodule ref storage we accidentally use the ref storage format of the owning repository, not of the submodule repository. As submodules may have a different storage format than their parent repo this can lead to bugs when trying to access the submodule ref storage from the parent repository. One such bug was reported when performing a recursive pull with mixed ref stores, which fails with: $ git pull --recursive fatal: Unable to find current revision in submodule path 'path/to/sub' The same issue occurs when adding a repository contained in the working tree with a different ref storage format via `git submodule add`. Fix the bug by using the submodule repository's ref storage format instead and add some tests. Note that the test for `git submodule status` was included as a precaution, only. The command worked alright even without the bugfix. Reported-by: Jeppe Øland <joland@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> --- refs.c | 2 +- t/t7424-submodule-mixed-ref-formats.sh | 70 +++++++++++++++++++++++++- 2 files changed, 70 insertions(+), 2 deletions(-) diff --git a/refs.c b/refs.c index 915aeb4d1d..e4b1f4f8b1 100644 --- a/refs.c +++ b/refs.c @@ -2011,7 +2011,7 @@ struct ref_store *repo_get_submodule_ref_store(struct repository *repo, free(subrepo); goto done; } - refs = ref_store_init(subrepo, the_repository->ref_storage_format, + refs = ref_store_init(subrepo, subrepo->ref_storage_format, submodule_sb.buf, REF_STORE_READ | REF_STORE_ODB); register_ref_store_map(&repo->submodule_ref_stores, "submodule", diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh index 4e4991d04c..d4e184970a 100755 --- a/t/t7424-submodule-mixed-ref-formats.sh +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -18,7 +18,23 @@ do fi test_expect_success 'setup' ' - git config set --global protocol.file.allow always + git config set --global protocol.file.allow always && + # Some tests migrate the ref storage format, which does not work with + # reflogs at the time of writing these tests. + git config set --global core.logAllRefUpdates false +' + +test_expect_success 'add existing repository with different ref storage format' ' + test_when_finished "rm -rf parent" && + + git init parent && + ( + cd parent && + test_commit parent && + git init --ref-format=$OTHER_FORMAT submodule && + test_commit -C submodule submodule && + git submodule add ./submodule + ) ' test_expect_success 'recursive clone propagates ref storage format' ' @@ -59,6 +75,58 @@ test_expect_success 'clone submodules with different ref storage format' ' test_ref_format downstream/submodule "$OTHER_FORMAT" ' +test_expect_success 'status with mixed submodule ref storages' ' + test_when_finished "rm -rf submodule main" && + + git init submodule && + test_commit -C submodule submodule-initial && + git init main && + git -C main submodule add "file://$(pwd)/submodule" && + git -C main commit -m "add submodule" && + git -C main/submodule refs migrate --ref-format=$OTHER_FORMAT && + + # The main repository should use the default ref format now, whereas + # the submodule should use the other format. + test_ref_format main "$GIT_DEFAULT_REF_FORMAT" && + test_ref_format main/submodule "$OTHER_FORMAT" && + + cat >expect <<-EOF && + $(git -C main/submodule rev-parse HEAD) submodule (submodule-initial) + EOF + git -C main submodule status >actual && + test_cmp expect actual +' + +test_expect_success 'recursive pull with mixed formats' ' + test_when_finished "rm -rf submodule upstream downstream" && + + # Set up the initial structure with an upstream repository that has a + # submodule, as well as a downstream clone of the upstream repository. + git init submodule && + test_commit -C submodule submodule-initial && + git init upstream && + git -C upstream submodule add "file://$(pwd)/submodule" && + git -C upstream commit -m "upstream submodule" && + + # Clone the upstream repository such that the main repo and its + # submodules have different formats. + git clone --no-recurse-submodules "file://$(pwd)/upstream" downstream && + git -C downstream submodule update --init --ref-format=$OTHER_FORMAT && + test_ref_format downstream "$GIT_DEFAULT_REF_FORMAT" && + test_ref_format downstream/submodule "$OTHER_FORMAT" && + + # Update the upstream submodule as well as the owning repository such + # that we can do a recursive pull. + test_commit -C submodule submodule-update && + git -C upstream/submodule pull && + git -C upstream commit -am "update the submodule" && + + git -C downstream pull --recurse-submodules && + git -C upstream/submodule rev-parse HEAD >expect && + git -C downstream/submodule rev-parse HEAD >actual && + test_cmp expect actual +' + done test_done -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v2 5/8] builtin/submodule: allow "add" to use different ref storage format 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt ` (3 preceding siblings ...) 2024-08-08 7:35 ` [PATCH v2 4/8] refs: fix ref storage format for submodule ref stores Patrick Steinhardt @ 2024-08-08 7:35 ` Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 6/8] submodule: fix leaking fetch tasks Patrick Steinhardt ` (3 subsequent siblings) 8 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:35 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 6070 bytes --] Same as with "clone", users may want to add a submodule to a repository with a non-default ref storage format. Wire up a new `--ref-format=` option that works the same as for `git submodule clone`. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- Documentation/git-submodule.txt | 5 ++++- builtin/submodule--helper.c | 16 +++++++++++++++- git-submodule.sh | 9 +++++++++ t/t7424-submodule-mixed-ref-formats.sh | 11 +++++++++++ 4 files changed, 39 insertions(+), 2 deletions(-) diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index 73ef8b9696..87d8e0f0c5 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -34,7 +34,7 @@ COMMANDS With no arguments, shows the status of existing submodules. Several subcommands are available to perform operations on the submodules. -add [-b <branch>] [-f|--force] [--name <name>] [--reference <repository>] [--depth <depth>] [--] <repository> [<path>]:: +add [-b <branch>] [-f|--force] [--name <name>] [--reference <repository>] [--ref-format <format>] [--depth <depth>] [--] <repository> [<path>]:: Add the given repository as a submodule at the given path to the changeset to be committed next to the current project: the current project is termed the "superproject". @@ -71,6 +71,9 @@ submodule repositories will be kept together in the same relative location, and only the superproject's URL needs to be provided. git-submodule will correctly locate the submodule using the relative URL in `.gitmodules`. ++ +If `--ref-format <format>` is specified, the ref storage format of newly +cloned submodules will be set accordingly. status [--cached] [--recursive] [--] [<path>...]:: Show the status of the submodules. This will print the SHA-1 of the diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 42a36bc2f7..48f4577b53 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -3128,13 +3128,17 @@ struct add_data { const char *sm_name; const char *repo; const char *realrepo; + enum ref_storage_format ref_storage_format; int depth; unsigned int force: 1; unsigned int quiet: 1; unsigned int progress: 1; unsigned int dissociate: 1; }; -#define ADD_DATA_INIT { .depth = -1 } +#define ADD_DATA_INIT { \ + .depth = -1, \ + .ref_storage_format = REF_STORAGE_FORMAT_UNKNOWN, \ +} static void append_fetch_remotes(struct strbuf *msg, const char *git_dir_path) { @@ -3228,6 +3232,7 @@ static int add_submodule(const struct add_data *add_data) string_list_append(&reference, p)->util = p; } + clone_data.ref_storage_format = add_data->ref_storage_format; clone_data.dissociate = add_data->dissociate; if (add_data->depth >= 0) clone_data.depth = xstrfmt("%d", add_data->depth); @@ -3392,6 +3397,7 @@ static int module_add(int argc, const char **argv, const char *prefix) { int force = 0, quiet = 0, progress = 0, dissociate = 0; struct add_data add_data = ADD_DATA_INIT; + const char *ref_storage_format = NULL; char *to_free = NULL; struct option options[] = { OPT_STRING('b', "branch", &add_data.branch, N_("branch"), @@ -3402,6 +3408,8 @@ static int module_add(int argc, const char **argv, const char *prefix) OPT_BOOL(0, "progress", &progress, N_("force cloning progress")), OPT_STRING(0, "reference", &add_data.reference_path, N_("repository"), N_("reference repository")), + OPT_STRING(0, "ref-format", &ref_storage_format, N_("format"), + N_("specify the reference format to use")), OPT_BOOL(0, "dissociate", &dissociate, N_("borrow the objects from reference repositories")), OPT_STRING(0, "name", &add_data.sm_name, N_("name"), N_("sets the submodule's name to the given string " @@ -3428,6 +3436,12 @@ static int module_add(int argc, const char **argv, const char *prefix) if (argc == 0 || argc > 2) usage_with_options(usage, options); + if (ref_storage_format) { + add_data.ref_storage_format = ref_storage_format_by_name(ref_storage_format); + if (add_data.ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN) + die(_("unknown ref storage format '%s'"), ref_storage_format); + } + add_data.repo = argv[0]; if (argc == 1) add_data.sm_path = git_url_basename(add_data.repo, 0, 0); diff --git a/git-submodule.sh b/git-submodule.sh index 448d58b18b..03c5a220a2 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -94,6 +94,14 @@ cmd_add() --reference=*) reference_path="${1#--reference=}" ;; + --ref-format) + case "$2" in '') usage ;; esac + ref_format="--ref-format=$2" + shift + ;; + --ref-format=*) + ref_format="$1" + ;; --dissociate) dissociate=1 ;; @@ -135,6 +143,7 @@ cmd_add() ${progress:+"--progress"} \ ${branch:+--branch "$branch"} \ ${reference_path:+--reference "$reference_path"} \ + ${ref_format:+"$ref_format"} \ ${dissociate:+--dissociate} \ ${custom_name:+--name "$custom_name"} \ ${depth:+"$depth"} \ diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh index d4e184970a..559713b607 100755 --- a/t/t7424-submodule-mixed-ref-formats.sh +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -37,6 +37,17 @@ test_expect_success 'add existing repository with different ref storage format' ) ' +test_expect_success 'add submodules with different ref storage format' ' + test_when_finished "rm -rf submodule upstream" && + + git init submodule && + test_commit -C submodule submodule-initial && + git init upstream && + test_ref_format upstream "$GIT_DEFAULT_REF_FORMAT" && + git -C upstream submodule add --ref-format="$OTHER_FORMAT" "file://$(pwd)/submodule" && + test_ref_format upstream/submodule "$OTHER_FORMAT" +' + test_expect_success 'recursive clone propagates ref storage format' ' test_when_finished "rm -rf submodule upstream downstream" && -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v2 6/8] submodule: fix leaking fetch tasks 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt ` (4 preceding siblings ...) 2024-08-08 7:35 ` [PATCH v2 5/8] builtin/submodule: allow "add" to use different ref storage format Patrick Steinhardt @ 2024-08-08 7:35 ` Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 7/8] submodule: fix leaking seen submodule names Patrick Steinhardt ` (2 subsequent siblings) 8 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:35 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 2572 bytes --] When done with a fetch task used for parallel fetches of submodules, we need to both call `fetch_task_release()` to release the task's contents and `free()` to release the task itself. Most sites do this already, but some only call `fetch_task_release()` and thus leak memory. While we could trivially fix this by adding the two missing calls to free(3P), the result would be that we always call both functions. Let's thus refactor the code such that `fetch_task_release()` also frees the structure itself. Rename it to `fetch_task_free()` accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- submodule.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/submodule.c b/submodule.c index ab99a30253..f027a6455e 100644 --- a/submodule.c +++ b/submodule.c @@ -1496,7 +1496,7 @@ static const struct submodule *get_non_gitmodules_submodule(const char *path) return (const struct submodule *) ret; } -static void fetch_task_release(struct fetch_task *p) +static void fetch_task_free(struct fetch_task *p) { if (p->free_sub) free((void*)p->sub); @@ -1508,6 +1508,7 @@ static void fetch_task_release(struct fetch_task *p) FREE_AND_NULL(p->repo); strvec_clear(&p->git_args); + free(p); } static struct repository *get_submodule_repo_for(struct repository *r, @@ -1576,8 +1577,7 @@ static struct fetch_task *fetch_task_create(struct submodule_parallel_fetch *spf return task; cleanup: - fetch_task_release(task); - free(task); + fetch_task_free(task); return NULL; } @@ -1607,8 +1607,7 @@ get_fetch_task_from_index(struct submodule_parallel_fetch *spf, } else { struct strbuf empty_submodule_path = STRBUF_INIT; - fetch_task_release(task); - free(task); + fetch_task_free(task); /* * An empty directory is normal, @@ -1654,8 +1653,7 @@ get_fetch_task_from_changed(struct submodule_parallel_fetch *spf, cs_data->path, repo_find_unique_abbrev(the_repository, cs_data->super_oid, DEFAULT_ABBREV)); - fetch_task_release(task); - free(task); + fetch_task_free(task); continue; } @@ -1763,7 +1761,7 @@ static int fetch_start_failure(struct strbuf *err UNUSED, spf->result = 1; - fetch_task_release(task); + fetch_task_free(task); return 0; } @@ -1828,8 +1826,7 @@ static int fetch_finish(int retvalue, struct strbuf *err UNUSED, } out: - fetch_task_release(task); - + fetch_task_free(task); return 0; } -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v2 7/8] submodule: fix leaking seen submodule names 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt ` (5 preceding siblings ...) 2024-08-08 7:35 ` [PATCH v2 6/8] submodule: fix leaking fetch tasks Patrick Steinhardt @ 2024-08-08 7:35 ` Patrick Steinhardt 2024-08-08 7:36 ` [PATCH v2 8/8] object: fix leaking packfiles when closing object store Patrick Steinhardt 2024-08-08 16:24 ` [PATCH v2 0/8] Improvements for ref storage formats with submodules Junio C Hamano 8 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:35 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 1758 bytes --] We keep track of submodules we have already seen via a string map such that we don't process the same submodule twice. We never free that map though, causing a memory leak. Fix this leak by clearing the map. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- submodule.c | 1 + t/t5572-pull-submodule.sh | 1 + t/t7418-submodule-sparse-gitmodules.sh | 1 + 3 files changed, 3 insertions(+) diff --git a/submodule.c b/submodule.c index f027a6455e..13b8f8c19c 100644 --- a/submodule.c +++ b/submodule.c @@ -1880,6 +1880,7 @@ int fetch_submodules(struct repository *r, strvec_clear(&spf.args); out: free_submodules_data(&spf.changed_submodule_names); + string_list_clear(&spf.seen_submodule_names, 0); return spf.result; } diff --git a/t/t5572-pull-submodule.sh b/t/t5572-pull-submodule.sh index 51744521f7..916e58c166 100755 --- a/t/t5572-pull-submodule.sh +++ b/t/t5572-pull-submodule.sh @@ -5,6 +5,7 @@ test_description='pull can handle submodules' GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1 export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh . "$TEST_DIRECTORY"/lib-submodule-update.sh diff --git a/t/t7418-submodule-sparse-gitmodules.sh b/t/t7418-submodule-sparse-gitmodules.sh index dde11ecce8..e1d9bf2ee3 100755 --- a/t/t7418-submodule-sparse-gitmodules.sh +++ b/t/t7418-submodule-sparse-gitmodules.sh @@ -15,6 +15,7 @@ also by committing .gitmodules and then just removing it from the filesystem. GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=1 export GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_expect_success 'setup' ' -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v2 8/8] object: fix leaking packfiles when closing object store 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt ` (6 preceding siblings ...) 2024-08-08 7:35 ` [PATCH v2 7/8] submodule: fix leaking seen submodule names Patrick Steinhardt @ 2024-08-08 7:36 ` Patrick Steinhardt 2024-08-08 16:24 ` [PATCH v2 0/8] Improvements for ref storage formats with submodules Junio C Hamano 8 siblings, 0 replies; 35+ messages in thread From: Patrick Steinhardt @ 2024-08-08 7:36 UTC (permalink / raw) To: git; +Cc: Eric Sunshine, Jeppe Øland, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 1983 bytes --] When calling `raw_object_store_clear()`, we close and free several resources associated with the object store. Part of that is to close and free all the packfiles, which is handled by `close_object_store()`. That function really only ends up closing the packfiles though, but it doesn't free them. And in fact it can't, as that function is being called via `run_command()` when `close_object_store = 1`, which is done e.g. when we execute git-maintenance(1). At that point, other structures may still have references on those packfiles, and thus we cannot free them here. So while it is in fact intentional that we really only close them, the result is a memory leak because `raw_object_store_clear()` does not free them, either. Fix the leak by freeing the packfiles in `raw_object_store_clear()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- object.c | 9 +++++++++ t/t7424-submodule-mixed-ref-formats.sh | 1 + 2 files changed, 10 insertions(+) diff --git a/object.c b/object.c index 0c0fcb76c0..d756c7f2ea 100644 --- a/object.c +++ b/object.c @@ -614,6 +614,15 @@ void raw_object_store_clear(struct raw_object_store *o) INIT_LIST_HEAD(&o->packed_git_mru); close_object_store(o); + + /* + * `close_object_store()` only closes the packfiles, but doesn't free + * them. We thus have to do this manually. + */ + for (struct packed_git *p = o->packed_git, *next; p; p = next) { + next = p->next; + free(p); + } o->packed_git = NULL; hashmap_clear(&o->pack_map); diff --git a/t/t7424-submodule-mixed-ref-formats.sh b/t/t7424-submodule-mixed-ref-formats.sh index 559713b607..b43ef2ba67 100755 --- a/t/t7424-submodule-mixed-ref-formats.sh +++ b/t/t7424-submodule-mixed-ref-formats.sh @@ -2,6 +2,7 @@ test_description='submodules handle mixed ref storage formats' +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_ref_format () { -- 2.46.0.46.g406f326d27.dirty [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v2 0/8] Improvements for ref storage formats with submodules 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt ` (7 preceding siblings ...) 2024-08-08 7:36 ` [PATCH v2 8/8] object: fix leaking packfiles when closing object store Patrick Steinhardt @ 2024-08-08 16:24 ` Junio C Hamano 8 siblings, 0 replies; 35+ messages in thread From: Junio C Hamano @ 2024-08-08 16:24 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: git, Eric Sunshine, Jeppe Øland Patrick Steinhardt <ps@pks.im> writes: > - Add a preparatory refactoring that wraps overly long command lines > in "git-submodule.sh". This is very much appreciated. > > - Wire up support for `--ref-format=` in `git submodule add`. Good find. Thanks. ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2024-08-08 17:26 UTC | newest] Thread overview: 35+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-08-05 11:31 2.46 submodule breakage Jeppe Øland 2024-08-06 13:18 ` Jeppe Øland 2024-08-06 18:26 ` Eric Sunshine 2024-08-07 6:40 ` Patrick Steinhardt 2024-08-07 7:38 ` Patrick Steinhardt 2024-08-07 16:09 ` Junio C Hamano 2024-08-07 12:43 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt 2024-08-07 12:43 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt 2024-08-07 14:45 ` [PATCH 0/6] Improvements for ref storage formats with submodules Jeppe Øland 2024-08-07 22:55 ` [PATCH 1/6] builtin/submodule: allow cloning with different ref storage format Junio C Hamano 2024-08-08 7:00 ` Patrick Steinhardt 2024-08-08 16:08 ` Junio C Hamano 2024-08-08 16:19 ` Patrick Steinhardt 2024-08-08 17:26 ` Junio C Hamano 2024-08-07 12:43 ` [PATCH 2/6] builtin/clone: propagate ref storage format to submodules Patrick Steinhardt 2024-08-07 23:07 ` Junio C Hamano 2024-08-07 12:43 ` [PATCH 3/6] refs: fix ref storage format for submodule ref stores Patrick Steinhardt 2024-08-07 12:44 ` [PATCH 4/6] submodule: fix leaking fetch tasks Patrick Steinhardt 2024-08-07 12:44 ` [PATCH 5/6] submodule: fix leaking seen submodule names Patrick Steinhardt 2024-08-07 12:44 ` [PATCH 6/6] object: fix leaking packfiles when closing object store Patrick Steinhardt 2024-08-07 13:18 ` [PATCH 0/6] Improvements for ref storage formats with submodules Patrick Steinhardt 2024-08-08 1:09 ` Junio C Hamano 2024-08-08 7:00 ` Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 0/8] " Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 1/8] git-submodule.sh: break overly long command lines Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 2/8] builtin/submodule: allow cloning with different ref storage format Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 3/8] builtin/clone: propagate ref storage format to submodules Patrick Steinhardt 2024-08-08 8:03 ` Kristoffer Haugsbakk 2024-08-08 13:29 ` Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 4/8] refs: fix ref storage format for submodule ref stores Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 5/8] builtin/submodule: allow "add" to use different ref storage format Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 6/8] submodule: fix leaking fetch tasks Patrick Steinhardt 2024-08-08 7:35 ` [PATCH v2 7/8] submodule: fix leaking seen submodule names Patrick Steinhardt 2024-08-08 7:36 ` [PATCH v2 8/8] object: fix leaking packfiles when closing object store Patrick Steinhardt 2024-08-08 16:24 ` [PATCH v2 0/8] Improvements for ref storage formats with submodules Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).