From: "Eric W. Biederman" <ebiederm@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: <git@vger.kernel.org>, "brian m. carlson" <sandals@crustytoothpaste.net>
Subject: Re: [PATCH] setup: Only allow extenions.objectFormat to be specified once
Date: Wed, 27 Sep 2023 08:11:05 -0500
Message-ID: <87r0mjn4ly.fsf@gmail.froward.int.ebiederm.org>
In-Reply-To: <xmqqr0mkmx9b.fsf@gitster.g> (Junio C. Hamano's message of "Tue, 26 Sep 2023 14:37:36 -0700")
Junio C Hamano <gitster@pobox.com> writes:
> "Eric W. Biederman" <ebiederm@gmail.com> writes:
>
>> Today there is no sanity checking of what happens when
>> extensions.objectFormat is specified multiple times. Catch confused git
>> configurations by only allowing this option to be specified once.
>
> Hmph. I am not sure if this is worth doing, and especially only for
> "objectformat". Do we intend to apply different rules other than
> "you can give it only once" to other extensions, and if so where
> will these rules be catalogued? I do not see particular harm to let
> them follow the usual "last one wins".
>
> If the patch were about trying to make sure that extensions, which
> are inherently per-repository, appear only in $GIT_DIR/config and
> complain if the code gets confused and tried to read them from the
> system or global configuration files, I would understand, and
> strongly support such an effort, though.
Unless I misread something, the code only checks for [extensions]
in $GIT_DIR/config.
> The real sanity we want to enforce is that what is reported by
> running "git config extensions.objectformat" must match the object
> format that is used in refs and object database.
Agreed. Allowing git config extensions.objectformat to change the
existing value is allowing the repository to be corrupted.
> Manually futzing
> the configuration file and adding an entry with a contradictory
> value certainly is one way to break that sanity, and this patch may
> catch such a breakage, but once we start worrying about manually
> futzing the configuration file, the check added here would easily
> miss if the futzing is done by replacing instead of adding, so I am
> not sure if this extra code is worth its bits.
>
> But perhaps I am missing something and not seeing why it is worth
> insisting on "last one is the first one" for this particular one.
I somewhat have blinders on. There are 3 configuration options I am
concerned with:
- extensions.objectFormat
- extensions.compatObjectFormat
- core.historicObjectFormat (or whatever name we settle on).
One key concern I heard expressed in earlier reviews is that however we
handle these options we handle them in such a way as to give ourselves
room to rise to challenges in the future.
Whatever we do with parsing we have the following logical
constraints:
For extensions.objectFormat: There can only be a single storage hash.
For extensions.compatObjectFormat: There can be no compatibility hash,
there can be a single compatibility hash, and depending on how things go
between now and the next hash function transition we might want multiple
compatibility hashes.
For core.historicObjectFormat: There can be no historic hash function, there
can be a single historic hash function, there can be multiple historic
hash functions.
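As an illustration only, the three constraints above can be written down as a small table of occurrence limits. This is a minimal sketch, not code from git: the struct, the table, and format_count_ok() are hypothetical names, and the limits reflect what would be enforced today (a single compatibility hash), not a settled design. The keys are lowercase because git normalizes section and variable names before handing them to config callbacks.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: how many times a key may appear in the config. */
struct format_rule {
	const char *key;
	int max_count; /* -1 means "no upper bound" */
};

/* Assumed limits; the key names follow this email, the values are a guess. */
const struct format_rule format_rules[] = {
	{ "extensions.objectformat", 1 },       /* a single storage hash */
	{ "extensions.compatobjectformat", 1 }, /* none or one today */
	{ "core.historicobjectformat", -1 },    /* none, one, or many */
};

/*
 * Return 1 if `count` occurrences of `key` are acceptable,
 * 0 if there are too many, and -1 if the key is not in the table.
 */
int format_count_ok(const char *key, int count)
{
	size_t i;

	for (i = 0; i < sizeof(format_rules) / sizeof(format_rules[0]); i++) {
		if (strcmp(format_rules[i].key, key))
			continue;
		if (format_rules[i].max_count < 0)
			return 1;
		return count <= format_rules[i].max_count;
	}
	return -1;
}
```

Raising the limit for the compatibility hash later would then be a one-line change to the table, which is the kind of room for the future discussed above.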
For the compatibility hash I think it is unlikely we will want to
support more than one compatibility hash in practice but I can imagine
a scenario where we just get into the transition from SHA-1 to SHA-256
and a serious break is discovered that requires switching to FutureHash
ASAP.
Historic object formats are what SHA-1 will become after the
transition. References to them are embedded in commit comments, email
messages, and bug trackers: all kinds of places that we cannot update.
So there is fundamentally a need to be able to find which current
objects correspond to the historic names. For a project, each hash
function transition will create more such objects.
When I looked I saw two ways within current git to specify a list of
values for a single configuration option.
- Give that option multiple times.
- Parse the option value in such a way as to generate a list.
It is my sense just specifying the compatObjectFormat multiple times to
specify multiple compatibility object formats makes the most sense.
Especially as all that is needed today is to allow only a single value.
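To make the two styles concrete, here is a minimal sketch of how each could feed the same list. It is not taken from git: struct format_list and both functions are hypothetical, the fixed sizes are arbitrary, and the comma-separated syntax is only one possible list format. Whitespace around names is not trimmed.

```c
#include <string.h>

#define MAX_FORMATS 8
#define MAX_NAME 32

/* Hypothetical accumulator for hash algorithm names read from config. */
struct format_list {
	char names[MAX_FORMATS][MAX_NAME];
	int nr;
};

/*
 * Style 1: the option is given multiple times; call this once per
 * occurrence. Returns 0 on success, -1 if the name is empty, too long,
 * or the list is full.
 */
int format_list_add(struct format_list *list, const char *name, size_t len)
{
	if (!len || len >= MAX_NAME || list->nr >= MAX_FORMATS)
		return -1;
	memcpy(list->names[list->nr], name, len);
	list->names[list->nr][len] = '\0';
	list->nr++;
	return 0;
}

/* Style 2: a single value such as "sha1,sha256" is split on commas. */
int format_list_add_csv(struct format_list *list, const char *value)
{
	while (*value) {
		size_t len = strcspn(value, ",");

		if (format_list_add(list, value, len))
			return -1;
		value += len;
		if (*value == ',')
			value++;
	}
	return 0;
}
```

Both styles end up in format_list_add(), so the choice mostly affects what the configuration file looks like and how a "single value only" rule is enforced today.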
After I had implemented the only-allow-once logic for compatObjectFormat,
I saw that objectFormat had nothing similar. Knowing that it is a bug
for objectFormat to be specified multiple times, I wrote a patch to
enforce that objectFormat appears only once as well.
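The difference between "only once" and the usual "last one wins" comes down to a single check in the callback. The following is a simplified, hypothetical sketch rather than the code in the patch quoted below: struct format_state and set_object_format_once() are invented for illustration, and the hash names are matched by plain string comparison instead of git's own lookup.

```c
#include <string.h>

/* Hypothetical parse state; git keeps this in a larger structure. */
struct format_state {
	int seen;      /* non-zero once the option has been accepted */
	char algo[16]; /* name of the storage hash, e.g. "sha1" */
};

/*
 * Accept the option the first time it appears and reject any repeat.
 * Returns 0 on success, -1 on an unknown hash name or a repeat.
 */
int set_object_format_once(struct format_state *st, const char *value)
{
	if (strcmp(value, "sha1") && strcmp(value, "sha256"))
		return -1; /* unknown hash name */
	if (st->seen)
		return -1; /* "last one wins" would overwrite here instead */
	strcpy(st->algo, value); /* safe: value is one of the names above */
	st->seen = 1;
	return 0;
}
```

Note that a check of this shape only catches an added second entry; it cannot detect an existing entry being replaced with a different value.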
For objectFormat I don't care very much. For compatObjectFormat I truly
care, and I care even more for the option that allows finding the
current object from a historic oid (even a truncated one).
For me the fundamental question is: if we allow multiple compatibility
hashes or historical hashes, how do we specify them? Have the option
appear more than once? A comma-separated list?
Whatever we decide, I want to enforce that it doesn't appear in current
configurations, so that we can add support for multiples later.
Eric
>> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
>> ---
>> setup.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/setup.c b/setup.c
>> index 18927a847b86..ef9f79b8885e 100644
>> --- a/setup.c
>> +++ b/setup.c
>> @@ -580,6 +580,7 @@ static enum extension_result handle_extension(const char *var,
>> if (!strcmp(ext, "noop-v1")) {
>> return EXTENSION_OK;
>> } else if (!strcmp(ext, "objectformat")) {
>> + struct string_list_item *item;
>> int format;
>>
>> if (!value)
>> @@ -588,6 +589,13 @@ static enum extension_result handle_extension(const char *var,
>> if (format == GIT_HASH_UNKNOWN)
>> return error(_("invalid value for '%s': '%s'"),
>> "extensions.objectformat", value);
>> + /* Only support objectFormat being specified once. */
>> + for_each_string_list_item(item, &data->v1_only_extensions) {
>> + if (!strcmp(item->string, "objectformat"))
>> + return error(_("'%s' already specified as '%s'"),
>> + "extensions.objectformat",
>> + hash_algos[data->hash_algo].name);
>> + }
>> data->hash_algo = format;
>> return EXTENSION_OK;
>> }
Thread overview: 4+ messages
2023-09-26 16:01 [PATCH] setup: Only allow extenions.objectFormat to be specified once Eric W. Biederman
2023-09-26 21:37 ` Junio C Hamano
2023-09-27 13:11 ` Eric W. Biederman [this message]
2023-09-27 19:56 ` Junio C Hamano