git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: Philip Peterson via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org,
	Philip Peterson <philip.c.peterson@gmail.com>
Subject: Re: [PATCH 2/2] apply: do not use the_repository
Date: Tue, 28 May 2024 19:22:43 +0200	[thread overview]
Message-ID: <ZlYS4zHmjKrzFtF6@ncase> (raw)
In-Reply-To: <xmqqh6eh7v7g.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 3165 bytes --]

On Tue, May 28, 2024 at 09:33:07AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > On Tue, May 28, 2024 at 06:32:01AM +0000, Philip Peterson via GitGitGadget wrote:
> >> diff --git a/apply.c b/apply.c
> >> index 901b67e6255..364c05fbd06 100644
> >> --- a/apply.c
> >> +++ b/apply.c
> >> @@ -3218,13 +3218,13 @@ static int apply_binary(struct apply_state *state,
> >>  		return 0; /* deletion patch */
> >>  	}
> >>  
> >> -	if (has_object(the_repository, &oid, 0)) {
> >> +	if (has_object(state->repo, &oid, 0)) {
> >>  		/* We already have the postimage */
> >>  		enum object_type type;
> >>  		unsigned long size;
> >>  		char *result;
> >>  
> >> -		result = repo_read_object_file(the_repository, &oid, &type,
> >> +		result = repo_read_object_file(state->repo, &oid, &type,
> >>  					       &size);
> >>  		if (!result)
> >>  			return error(_("the necessary postimage %s for "
> >
> > We call `get_oid_hex()` in this function, which ultimately ends up
> > accessing `the_repository` via `the_hash_algo`. We should likely convert
> > those to `repo_get_oid()`.
> >
> > There are also other accesses to `the_hash_algo` in this function, which
> > should be converted to use `state->repo->hash_algo`, as well.
> 
> We as a more experienced developers should come up with a way to
> help developers who are less familiar with the API set, so that they
> can chip in this effort to wean ourselves off of globals.
> 
> Would a bug like the ones you pointed out be easily caught by say
> running "GIT_TEST_DEFAULT_HASH=sha256 make test"?

No, I don't think so.

I was also thinking about different approaches to this problem overall.
Ideally, I would want to catch this on the programmatic level so that we
do not even have to detect this at runtime. And I think this should be
feasible by introducing something similar to the USE_THE_INDEX_VARIABLE
macro, which we have recently removed.

If we had a USE_THE_REPOSITORY_VARIABLE macro that guards declarations
of:

  - `the_repository`

  - `the_hash_algo`

  - Functions that implicitly rely on either of the above.

Then you could prove that a given code unit does not rely on any of the
above anymore by not declaring that macro.

In fact, these patches prompted me to give this a go this morning. And
while the changes are of course trivial by themselves, I quickly
discovered that they are also pointless because we almost always rely on
either of the above.

The most important reason of this is "hash.h", where `struct object_id`
falls back to `the_hash_algo` in case its own hash algorithm wasn't set.
Ideally, this would never be the case. But a quick test shows that there
are many places where we do rely on the fallback value, mostly around
null OIDs. Fixing this would be a necessary prerequisite.

Another important benefit of the macro would be that we stop working
against a moving target. New files shouldn't declare it, and once a file
has been refactored and the corresponding macro was removed we would
know that we don't add new dependencies on either of the above by
accident.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2024-05-28 17:22 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-28  6:31 [PATCH 0/2] Remove some usages of the_repository Philip Peterson via GitGitGadget
2024-05-28  6:32 ` [PATCH 1/2] add-patch: do not use the_repository Philip Peterson via GitGitGadget
2024-05-28 21:41   ` Junio C Hamano
2024-05-28  6:32 ` [PATCH 2/2] apply: " Philip Peterson via GitGitGadget
2024-05-28  7:28   ` Patrick Steinhardt
2024-05-28 16:33     ` Junio C Hamano
2024-05-28 17:22       ` Patrick Steinhardt [this message]
2024-05-28 17:37         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZlYS4zHmjKrzFtF6@ncase \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=philip.c.peterson@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).