* [PATCH 0/3] setup: fix reinit of repos with different formats
From: Patrick Steinhardt @ 2025-01-30 16:24 UTC
To: git

Hi,

this issue with the reinitialization of the ref format was recently
discovered in our CI systems at GitLab, where caches contained repos
with one ref format but we tried to reinit them with a different ref
format. But turns out that the same issue also exists for the object
format, so this patch series fixes both issues.

Thanks!

Patrick

---
Patrick Steinhardt (3):
      t0001: remove duplicate test
      setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
      setup: fix reinit of repos with incompatible GIT_DEFAULT_HASH

 setup.c         |  8 ++++++--
 t/t0001-init.sh | 30 +++++++++++++++++++++---------
 2 files changed, 27 insertions(+), 11 deletions(-)

---
base-commit: 3b0d05c4a79d0e441283680a864529b02dca5f08
change-id: 20250130-b4-pks-reinit-default-ref-format-96a5e2104421

* [PATCH 1/3] t0001: remove duplicate test
From: Patrick Steinhardt @ 2025-01-30 16:24 UTC
To: git

The test in question is an exact copy of the testcase preceding it.
Remove it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t0001-init.sh | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 72a0c2e7d4..213d5984b1 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -861,15 +861,6 @@ test_expect_success 're-init with includeIf.onbranch condition' '
 	test_cmp expect actual
 '
 
-test_expect_success 're-init with includeIf.onbranch condition' '
-	test_when_finished "rm -rf repo" &&
-	git init repo &&
-	git -c includeIf.onbranch:nonexistent.path=/does/not/exist init repo &&
-	echo $GIT_DEFAULT_REF_FORMAT >expect &&
-	git -C repo rev-parse --show-ref-format >actual &&
-	test_cmp expect actual
-'
-
 test_expect_success 're-init skips non-matching includeIf.onbranch' '
 	test_when_finished "rm -rf repo config" &&
 	cat >config <<-EOF &&
-- 
2.48.1.468.gbf5f394be8.dirty

* [PATCH 2/3] setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
From: Patrick Steinhardt @ 2025-01-30 16:24 UTC
To: git

The GIT_DEFAULT_REF_FORMAT environment variable can be set to influence
the default ref format that new repositories shall be initialized with.
While this is the expected behaviour when creating a new repository, it
is not when reinitializing a repository: we should retain the ref format
currently used by it in that case.

This doesn't work correctly right now:

    $ git init --ref-format=files repo
    Initialized empty Git repository in /tmp/repo/.git/
    $ GIT_DEFAULT_REF_FORMAT=reftable git init repo
    fatal: could not open '/tmp/repo/.git/refs/heads' for writing: Is a directory

Instead of retaining the current ref format, the reinitialization tries
to reinitialize the repository with the different format. This action
fails when git-init(1) tries to write the ".git/refs/heads" stub, which
in the context of the reftable backend is always written as a file so
that we can detect clients which inadvertently try to access the repo
with the wrong ref format. Seems like the protection mechanism works for
this case, as well.

Fix the issue by ignoring the environment variable in case the repo has
already been initialized with a ref storage format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c         | 4 +++-
 t/t0001-init.sh | 9 +++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 8a488f3e7c..53ffeabc5b 100644
--- a/setup.c
+++ b/setup.c
@@ -2534,7 +2534,9 @@ static void repository_format_configure(struct repository_format *repo_fmt,
 		ref_format = ref_storage_format_by_name(env);
 		if (ref_format == REF_STORAGE_FORMAT_UNKNOWN)
 			die(_("unknown ref storage format '%s'"), env);
-		repo_fmt->ref_storage_format = ref_format;
+		if (repo_fmt->version < 0 ||
+		    repo_fmt->ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
+			repo_fmt->ref_storage_format = ref_format;
 	} else if (cfg.ref_format != REF_STORAGE_FORMAT_UNKNOWN) {
 		repo_fmt->ref_storage_format = cfg.ref_format;
 	}
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 213d5984b1..6dff8b75f1 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -697,6 +697,15 @@ do
 		git -C refformat rev-parse --show-ref-format >actual &&
 		test_cmp expect actual
 	'
+
+	test_expect_success "reinit repository with GIT_DEFAULT_REF_FORMAT=$format does not change format" '
+		test_when_finished "rm -rf refformat" &&
+		git init refformat &&
+		git -C refformat rev-parse --show-ref-format >expect &&
+		GIT_DEFAULT_REF_FORMAT=$format git init refformat &&
+		git -C refformat rev-parse --show-ref-format >actual &&
+		test_cmp expect actual
+	'
 done
 
 test_expect_success "--ref-format= overrides GIT_DEFAULT_REF_FORMAT" '
-- 
2.48.1.468.gbf5f394be8.dirty

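Editorial illustration: the guard added to setup.c in the patch above can be
looked at in isolation. The following is a minimal standalone sketch, not
Git's code. The names repo_format_model and apply_env_ref_default are
hypothetical stand-ins, and the sketch assumes that a negative version means
no existing repository format was read.

```c
/* Hypothetical stand-ins for Git's internal types; not the real definitions. */
enum ref_format {
	REF_FORMAT_UNKNOWN = 0,
	REF_FORMAT_FILES,
	REF_FORMAT_REFTABLE
};

struct repo_format_model {
	int version;                /* assumption: < 0 means nothing was read from disk */
	enum ref_format ref_format; /* format recorded by the existing repository, if any */
};

/*
 * Apply a default ref format that came from the environment. The default
 * only takes effect when the repository does not already record a format,
 * so reinitializing an existing repository leaves its format untouched.
 */
void apply_env_ref_default(struct repo_format_model *fmt,
			   enum ref_format env_default)
{
	if (fmt->version < 0 || fmt->ref_format == REF_FORMAT_UNKNOWN)
		fmt->ref_format = env_default;
}
```

With this shape, a fresh repository picks up the environment default, while
a repository that already uses the "files" backend keeps it even when the
environment asks for "reftable".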
* Re: [PATCH 2/3] setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
From: Junio C Hamano @ 2025-01-30 22:40 UTC
To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> The GIT_DEFAULT_REF_FORMAT environment variable can be set to influence
> the default ref format that new repositories shall be initialized with.
> While this is the expected behaviour when creating a new repository, it
> is not when reinitializing a repository: we should retain the ref format
> currently used by it in that case.
>
> This doesn't work correctly right now:
>
>     $ git init --ref-format=files repo
>     Initialized empty Git repository in /tmp/repo/.git/
>     $ GIT_DEFAULT_REF_FORMAT=reftable git init repo
>     fatal: could not open '/tmp/repo/.git/refs/heads' for writing: Is a directory
>
> Instead of retaining the current ref format, the reinitialization tries
> to reinitialize the repository with the different format. This action
> fails when git-init(1) tries to write the ".git/refs/heads" stub, which
> in the context of the reftable backend is always written as a file so
> that we can detect clients which inadvertently try to access the repo
> with the wrong ref format. Seems like the protection mechanism works for
> this case, as well.

Good finding.

A plausible alternative behaviour could be to do the backend
migration when this is asked, and we might gain consensus to do so
in the (far) future, but I agree that it is a good direction to go
in the short term to match the behaviour of the code to the
documented expectation.

* Re: [PATCH 2/3] setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
From: Junio C Hamano @ 2025-01-31 22:38 UTC
To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> Instead of retaining the current ref format, the reinitialization tries
> to reinitialize the repository with the different format. This action
> fails when git-init(1) tries to write the ".git/refs/heads" stub, which
> in the context of the reftable backend is always written as a file so
> that we can detect clients which inadvertently try to access the repo
> with the wrong ref format. Seems like the protection mechanism works for
> this case, as well.
>
> Fix the issue by ignoring the environment variable in case the repo has
> already been initialized with a ref storage format.

It certainly is better than corrupting the repository, but if we are
to do this change, shouldn't we at least issue a warning to tell
users that (a part of) their request was ignored, instead of
silently ignoring the specified ref-format?

> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  setup.c         | 4 +++-
>  t/t0001-init.sh | 9 +++++++++
>  2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/setup.c b/setup.c
> index 8a488f3e7c..53ffeabc5b 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -2534,7 +2534,9 @@ static void repository_format_configure(struct repository_format *repo_fmt,
>  		ref_format = ref_storage_format_by_name(env);
>  		if (ref_format == REF_STORAGE_FORMAT_UNKNOWN)
>  			die(_("unknown ref storage format '%s'"), env);
> -		repo_fmt->ref_storage_format = ref_format;
> +		if (repo_fmt->version < 0 ||
> +		    repo_fmt->ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
> +			repo_fmt->ref_storage_format = ref_format;

Perhaps something silly like this?

		if (0 <= repo_fmt->version &&
		    repo_fmt->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN)
			warning("ignoring the specified ref-format");
		else
			repo_fmt->ref_storage_format = ref_format;

In the longer term, we might want to consider automatically
migrating the ref backend (by calling into "git refs migrate"),
but it is a good first move to stop damaging the repository.

Thanks.

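Editorial illustration: the warning variant suggested above can be modelled
so that the caller learns whether the request was honoured. This is a hedged
standalone sketch, not Git's code. The names repo_format_model and
apply_env_ref_default_or_warn are hypothetical, a plain fprintf() stands in
for Git's warning() helper, and a negative version is again assumed to mean
that no existing repository format was read.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for Git's internal types; not the real definitions. */
enum ref_format {
	REF_FORMAT_UNKNOWN = 0,
	REF_FORMAT_FILES,
	REF_FORMAT_REFTABLE
};

struct repo_format_model {
	int version;                /* assumption: < 0 means nothing was read from disk */
	enum ref_format ref_format; /* format recorded by the existing repository, if any */
};

/*
 * Same decision as the silent guard, with the branches swapped: when the
 * repository already records a format, keep it and tell the user that the
 * requested default was ignored. Returns true if the default was applied.
 */
bool apply_env_ref_default_or_warn(struct repo_format_model *fmt,
				   enum ref_format env_default)
{
	if (0 <= fmt->version && fmt->ref_format != REF_FORMAT_UNKNOWN) {
		/* stand-in for Git's warning(); prints to stderr */
		fprintf(stderr, "warning: ignoring the specified ref-format\n");
		return false;
	}
	fmt->ref_format = env_default;
	return true;
}
```

The two variants only differ in whether the ignored request is reported; the
resulting ref format is the same in both.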
* Re: [PATCH 2/3] setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
From: Patrick Steinhardt @ 2025-02-03 5:29 UTC
To: Junio C Hamano; +Cc: git

On Fri, Jan 31, 2025 at 02:38:20PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Instead of retaining the current ref format, the reinitialization tries
> > to reinitialize the repository with the different format. This action
> > fails when git-init(1) tries to write the ".git/refs/heads" stub, which
> > in the context of the reftable backend is always written as a file so
> > that we can detect clients which inadvertently try to access the repo
> > with the wrong ref format. Seems like the protection mechanism works for
> > this case, as well.
> >
> > Fix the issue by ignoring the environment variable in case the repo has
> > already been initialized with a ref storage format.
>
> It certainly is better than corrupting the repository, but if we are
> to do this change, shouldn't we at least issue a warning to tell
> users that (a part of) their request was ignored, instead of
> silently ignoring the specified ref-format?

I don't think we should. If this was passed on the command line then
yes, we should flag this and already die indeed. But this is an
environment variable that allows you to set the default format. From my
point of view it is totally expected that this doesn't cause the format
of existing repositories to change.

> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  setup.c         | 4 +++-
> >  t/t0001-init.sh | 9 +++++++++
> >  2 files changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/setup.c b/setup.c
> > index 8a488f3e7c..53ffeabc5b 100644
> > --- a/setup.c
> > +++ b/setup.c
> > @@ -2534,7 +2534,9 @@ static void repository_format_configure(struct repository_format *repo_fmt,
> >  		ref_format = ref_storage_format_by_name(env);
> >  		if (ref_format == REF_STORAGE_FORMAT_UNKNOWN)
> >  			die(_("unknown ref storage format '%s'"), env);
> > -		repo_fmt->ref_storage_format = ref_format;
> > +		if (repo_fmt->version < 0 ||
> > +		    repo_fmt->ref_storage_format == REF_STORAGE_FORMAT_UNKNOWN)
> > +			repo_fmt->ref_storage_format = ref_format;
>
> Perhaps something silly like this?
>
> 		if (0 <= repo_fmt->version &&
> 		    repo_fmt->ref_storage_format != REF_STORAGE_FORMAT_UNKNOWN)
> 			warning("ignoring the specified ref-format");
> 		else
> 			repo_fmt->ref_storage_format = ref_format;
>
> In the longer term, we might want to consider automatically
> migrating the ref backend (by calling into "git refs migrate"),
> but it is a good first move to stop damaging the repository.

I think keeping migrations explicit is worthwhile. Migrations are a
somewhat risky thing, so explicitly making the user ask for them is not
a bad thing. I personally wouldn't expect git-init(1) to migrate data.
After all, it is supposed to initialize stuff, not rewrite it.

This is doubly true for environment variables, where it is so extremely
easy to accidentally still have them defined. I don't think implicitly
converting every git-init(1) to do migrations would be a good idea there
as it would likely do the wrong thing in many cases.

So from my point of view we should treat the environment variables the
same as we treat "init.defaultRefFormat" and "init.defaultObjectFormat".
Those indicate defaults, but do not cause us to change the format of
existing repositories.

Patrick

* Re: [PATCH 2/3] setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
From: Junio C Hamano @ 2025-02-03 14:01 UTC
To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> So from my point of view we should treat the environment variables the
> same as we treat "init.defaultRefFormat" and "init.defaultObjectFormat".
> Those indicate defaults, but do not cause us to change the format of
> existing repositories.

Hmph, as somebody who often does things like

    $ GIT_EDITOR=: git do-something
    $ GIT_AUTHOR_NAME=foo GIT_AUTHOR_EMAIL=bar@baz git commit -a

I do not necessarily see the environment variables as replacement
for configured defaults. They are, at least to me, more like a
single-shot override of the configured defaults, so if we were to
complain and error out command line options (we do do so, don't we?),
I would expect the environment variable that gives a single-shot
setting to be treated the same way.

Thanks.

* Re: [PATCH 2/3] setup: fix reinit of repos with incompatible GIT_DEFAULT_REF_FORMAT
From: Patrick Steinhardt @ 2025-02-03 15:02 UTC
To: Junio C Hamano; +Cc: git

On Mon, Feb 03, 2025 at 06:01:33AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > So from my point of view we should treat the environment variables the
> > same as we treat "init.defaultRefFormat" and "init.defaultObjectFormat".
> > Those indicate defaults, but do not cause us to change the format of
> > existing repositories.
>
> Hmph, as somebody who often does things like
>
>     $ GIT_EDITOR=: git do-something
>     $ GIT_AUTHOR_NAME=foo GIT_AUTHOR_EMAIL=bar@baz git commit -a
>
> I do not necessarily see the environment variables as replacement
> for configured defaults. They are, at least to me, more like a
> single-shot override of the configured defaults, so if we were to
> complain and error out command line options (we do do so, don't we?),
> I would expect the environment variable that gives a single-shot
> setting to be treated the same way.

Especially the second one is a good example though that works mostly as
I propose: GIT_AUTHOR_NAME will impact _new_ commits, but not _existing_
ones when you for example `--amend` the commit. So this is somewhat
equivalent to how both GIT_DEFAULT_REF_FORMAT and GIT_DEFAULT_HASH work
with git-init(1), isn't it?

Patrick

* [PATCH 3/3] setup: fix reinit of repos with incompatible GIT_DEFAULT_HASH
From: Patrick Steinhardt @ 2025-01-30 16:24 UTC
To: git

The exact same issue as described in the preceding commit also exists
for GIT_DEFAULT_HASH. Thus, reinitializing a repository that e.g. uses
SHA1 with `GIT_DEFAULT_HASH=sha256 git init` will cause the object
format of that repository to change to SHA256. This is of course bogus
as any existing objects and refs will not be converted, thus causing
repository corruption:

    $ git init repo
    Initialized empty Git repository in /tmp/repo/.git/
    $ cd repo/
    $ git commit --allow-empty -m message
    [main (root-commit) 35a7344] message
    $ GIT_DEFAULT_HASH=sha256 git init
    Reinitialized existing Git repository in /tmp/repo/.git/
    $ git show
    fatal: your current branch appears to be broken

Fix the issue by ignoring the environment variable in case the repo has
already been initialized with an object hash.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 setup.c         |  4 +++-
 t/t0001-init.sh | 12 ++++++++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 53ffeabc5b..7da7aa8984 100644
--- a/setup.c
+++ b/setup.c
@@ -2517,7 +2517,9 @@ static void repository_format_configure(struct repository_format *repo_fmt,
 		int env_algo = hash_algo_by_name(env);
 		if (env_algo == GIT_HASH_UNKNOWN)
 			die(_("unknown hash algorithm '%s'"), env);
-		repo_fmt->hash_algo = env_algo;
+		if (repo_fmt->version < 0 ||
+		    repo_fmt->hash_algo == GIT_HASH_UNKNOWN)
+			repo_fmt->hash_algo = env_algo;
 	} else if (cfg.hash != GIT_HASH_UNKNOWN) {
 		repo_fmt->hash_algo = cfg.hash;
 	}
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index 6dff8b75f1..c49d9e0d38 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -586,6 +586,18 @@ test_expect_success 'GIT_DEFAULT_HASH overrides init.defaultObjectFormat' '
 	echo sha256 >expected
 '
 
+for hash in sha1 sha256
+do
+	test_expect_success "reinit repository with GIT_DEFAULT_HASH=$hash does not change format" '
+		test_when_finished "rm -rf repo" &&
+		git init repo &&
+		git -C repo rev-parse --show-object-format >expect &&
+		GIT_DEFAULT_HASH=$hash git init repo &&
+		git -C repo rev-parse --show-object-format >actual &&
+		test_cmp expect actual
+	'
+done
+
 test_expect_success 'extensions.objectFormat is not allowed with repo version 0' '
 	test_when_finished "rm -rf explicit-v0" &&
 	git init --object-format=sha256 explicit-v0 &&
-- 
2.48.1.468.gbf5f394be8.dirty

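Editorial illustration: one way to see why flipping the object format breaks
an existing repository is that SHA-1 object IDs are 40 hex digits while
SHA-256 object IDs are 64, so IDs already stored in refs no longer have the
expected shape. The sketch below is a simplified standalone model of that
mismatch, not Git's actual parsing code. The names hash_algo_model,
hex_len_for and oid_matches_algo are hypothetical, and the object ID used in
the usage note is made up.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the two object formats discussed in the patch. */
enum hash_algo_model { HASH_SHA1, HASH_SHA256 };

/* Length of a full hexadecimal object ID for the given algorithm. */
size_t hex_len_for(enum hash_algo_model algo)
{
	return algo == HASH_SHA1 ? 40 : 64;
}

/*
 * Check whether a stored object ID string has the shape the configured
 * algorithm expects: exactly the right number of hex digits.
 */
bool oid_matches_algo(const char *hex, enum hash_algo_model algo)
{
	size_t i, len = strlen(hex);

	if (len != hex_len_for(algo))
		return false;
	for (i = 0; i < len; i++)
		if (!isxdigit((unsigned char)hex[i]))
			return false;
	return true;
}
```

For a made-up 40-digit ID such as
"0123456789abcdef0123456789abcdef01234567", oid_matches_algo() accepts it
under HASH_SHA1 and rejects it under HASH_SHA256, which mirrors a branch
that was written under SHA-1 being read after the format was switched.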
* Re: [PATCH 3/3] setup: fix reinit of repos with incompatible GIT_DEFAULT_HASH
From: Junio C Hamano @ 2025-01-31 22:40 UTC
To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> The exact same issue as described in the preceding commit also exists
> for GIT_DEFAULT_HASH. Thus, reinitializing a repository that e.g. uses
> SHA1 with `GIT_DEFAULT_HASH=sha256 git init` will cause the object
> format of that repository to change to SHA256. This is of course bogus
> as any existing objects and refs will not be converted, thus causing
> repository corruption:

The exact same comment on silently ignoring applies, but unlike the
previous one, converting the entire history is far more expensive,
so it may be much less likely to be worth going the automatic
conversion route than the previous one.

Thanks.

* Re: [PATCH 0/3] setup: fix reinit of repos with different formats
From: brian m. carlson @ 2025-01-30 23:58 UTC
To: Patrick Steinhardt; +Cc: git

On 2025-01-30 at 16:24:16, Patrick Steinhardt wrote:
> Hi,
>
> this issue with the reinitialization of the ref format was recently
> discovered in our CI systems at GitLab, where caches contained repos
> with one ref format but we tried to reinit them with a different ref
> format. But turns out that the same issue also exists for the object
> format, so this patch series fixes both issues.

I looked at this series and it seems reasonable. As you mentioned,
reinitializing the repository in this way cannot possibly work in any
meaningful sense, so rejecting it is the prudent thing to do.
-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA