All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: Philip Peterson via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org,
	Philip Peterson <philip.c.peterson@gmail.com>
Subject: Re: [PATCH 2/2] apply: do not use the_repository
Date: Tue, 28 May 2024 19:22:43 +0200	[thread overview]
Message-ID: <ZlYS4zHmjKrzFtF6@ncase> (raw)
In-Reply-To: <xmqqh6eh7v7g.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 3165 bytes --]

On Tue, May 28, 2024 at 09:33:07AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > On Tue, May 28, 2024 at 06:32:01AM +0000, Philip Peterson via GitGitGadget wrote:
> >> diff --git a/apply.c b/apply.c
> >> index 901b67e6255..364c05fbd06 100644
> >> --- a/apply.c
> >> +++ b/apply.c
> >> @@ -3218,13 +3218,13 @@ static int apply_binary(struct apply_state *state,
> >>  		return 0; /* deletion patch */
> >>  	}
> >>  
> >> -	if (has_object(the_repository, &oid, 0)) {
> >> +	if (has_object(state->repo, &oid, 0)) {
> >>  		/* We already have the postimage */
> >>  		enum object_type type;
> >>  		unsigned long size;
> >>  		char *result;
> >>  
> >> -		result = repo_read_object_file(the_repository, &oid, &type,
> >> +		result = repo_read_object_file(state->repo, &oid, &type,
> >>  					       &size);
> >>  		if (!result)
> >>  			return error(_("the necessary postimage %s for "
> >
> > We call `get_oid_hex()` in this function, which ultimately ends up
> > accessing `the_repository` via `the_hash_algo`. We should likely convert
> > those to `repo_get_oid()`.
> >
> > There are also other accesses to `the_hash_algo` in this function, which
> > should be converted to use `state->repo->hash_algo`, as well.
> 
> We as a more experienced developers should come up with a way to
> help developers who are less familiar with the API set, so that they
> can chip in this effort to wean ourselves off of globals.
> 
> Would a bug like the ones you pointed out be easily caught by say
> running "GIT_TEST_DEFAULT_HASH=sha256 make test"?

No, I don't think so.

I was also thinking about different approaches to this problem overall.
Ideally, I would want to catch this on the programmatic level so that we
do not even have to detect this at runtime. And I think this should be
feasible by introducing something similar to the USE_THE_INDEX_VARIABLE
macro, which we have recently removed.

If we had a USE_THE_REPOSITORY_VARIABLE macro that guards declarations
of:

  - `the_repository`

  - `the_hash_algo`

  - Functions that implicitly rely on either of the above.

Then you could prove that a given code unit does not rely on any of the
above anymore by not declaring that macro.

In fact, these patches prompted me to give this a go this morning. And
while the changes are of course trivial by themselves, I quickly
discovered that they are also pointless because we almost always rely on
either of the above.

The most important reason of this is "hash.h", where `struct object_id`
falls back to `the_hash_algo` in case its own hash algorithm wasn't set.
Ideally, this would never be the case. But a quick test shows that there
are many places where we do rely on the fallback value, mostly around
null OIDs. Fixing this would be a necessary prerequisite.

Another important benefit of the macro would be that we stop working
against a moving target. New files shouldn't declare it, and once a file
has been refactored and the corresponding macro was removed we would
know that we don't add new dependencies on either of the above by
accident.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-05-28 17:22 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-28  6:31 [PATCH 0/2] Remove some usages of the_repository Philip Peterson via GitGitGadget
2024-05-28  6:32 ` [PATCH 1/2] add-patch: do not use the_repository Philip Peterson via GitGitGadget
2024-05-28 21:41   ` Junio C Hamano
2024-05-28  6:32 ` [PATCH 2/2] apply: " Philip Peterson via GitGitGadget
2024-05-28  7:28   ` Patrick Steinhardt
2024-05-28 16:33     ` Junio C Hamano
2024-05-28 17:22       ` Patrick Steinhardt [this message]
2024-05-28 17:37         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZlYS4zHmjKrzFtF6@ncase \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=philip.c.peterson@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.