public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Cc: git@vger.kernel.org, joliss42@gmail.com, joliss@gmail.com,
	gitster@pobox.com
Subject: Re: [PATCH v2] refspec: safely parse refspecs outside a repository
Date: Sat, 21 Mar 2026 23:53:38 -0400	[thread overview]
Message-ID: <20260322035338.GA819989@coredump.intra.peff.net> (raw)
In-Reply-To: <20260322023557.15907-1-jayatheerthkulkarni2005@gmail.com>

On Sun, Mar 22, 2026 at 08:05:57AM +0530, K Jayatheerth wrote:

> Fix this by ensuring `the_hash_algo` is non-NULL before checking
> `the_hash_algo->hexsz` for both standard and negative refspecs.
> When operating outside a repository, fetching is impossible,
> so bypassing the exact OID check is the cleanest approach.

This argument is glossing over some details. Trying to break down all of
the implications, I think we have:

  - Without knowing the hash algo, we cannot reject negative refspecs
    that look like oids. This is OK in practice for two reasons. One,
    the only commands which apply refspecs are fetch and push, and they
    require a repository. And two, while we miss an opportunity to
    complain about broken config, it is quite unlikely for somebody to
    have such config (a global-level configured negative refspec that
    looks like an oid). And they will be told about it when running an
    actual fetch anyway.

  - Without knowing the hash algo, we cannot mark refspecs with the
    exact_sha1 flag. Again, we are not actually applying any refspecs
    unless we have a repo. The exact_sha1 flag is used to influence the
    set of prefixes we send to a remote v2 upload-pack process, but
    only for fetch (which requires a repository). For ls-remote, which
    can run outside a repo, we don't even look at the refspecs.

And so for those reasons it's probably OK to quietly ignore things.
Still, it rubs me the wrong way a little that we might create a subtle
bug from some other caller.

If we think we don't care about refspecs, it kind of makes me wonder if
we ought to be able to tell the remote API that we are interested in
remotes for their URLs only, and _not_ for their refspecs. But maybe
that leads to madness, as we end up with half-initialized "struct
remote"s floating around our process.


The other thing I wondered is why we are talking about remote-curl here,
and not ls-remote. And that's because ls-remote already hacked around
this!

Check out 9e89dcb66a (builtin/ls-remote: fall back to SHA1 outside of a
repo, 2024-08-02), which adds this:

          /*
           * TODO: This is buggy, but required for transport helpers. When a
           * transport helper advertises a "refspec", then we'd add that to a
           * list of refspecs via `refspec_append()`, which transitively depends
           * on `the_hash_algo`. Thus, when the hash algorithm isn't properly set
           * up, this would lead to a segfault.
           *
           * We really should fix this in the transport helper logic such that we
           * lazily parse refspec capabilities _after_ we have learned about the
           * remote's object format. Otherwise, we may end up misparsing refspecs
           * depending on what object hash the remote uses.
           */
          if (!the_repository->hash_algo)
                  repo_set_hash_algo(the_repository, GIT_HASH_DEFAULT);

Obviously that is kicking the can down the road, but it kind of makes
sense that we would have the same hack in place for remote-curl (which
in practice is only going to be called out-of-repo by ls-remote anyway).
It is only the fact that it happens in a separate process that the
existing fix from 9e89dcb66a is not helping us.

> Additionally, while looking into the remote-curl execution path,
> take the opportunity to remove an unused `#include "git-curl-compat.h"`
> from `remote-curl.c`.

I doubt this is correct.

remote-curl checks GIT_CURL_NEED_TRANSFER_ENCODING_HEADER, which is
defined in git-curl-compat.h. It may work fine without that header if
you have a recent version of curl, but older systems would be subtly
broken.

> +test_expect_success 'ls-remote outside repo does not segfault with fetch refspec' '
> +	GIT_CEILING_DIRECTORIES=$(pwd) &&
> +	export GIT_CEILING_DIRECTORIES &&
> +	mkdir nongit &&
> +	(
> +		cd nongit &&
> +		env GIT_CONFIG_NOSYSTEM=1 \
> +			GIT_CONFIG_GLOBAL=/dev/null \
> +			GIT_CONFIG_COUNT=1 \
> +			GIT_CONFIG_KEY_0=remote.origin.fetch \
> +			GIT_CONFIG_VALUE_0="+refs/tags/*:refs/tags/*" \
> +			git ls-remote "$HTTPD_URL/smart/repo.git"
> +	)
> +'

Some of this is irrelevant to reproducing the bug (like redirecting
system and global config). And it is much easier to use "git -c" to set
temporary config.

We also have a "nongit" helper function already. So I think just:

   nongit git \
          -c remote.origin.fetch=anything \
          ls-remote "$HTTPD_URL/smart/repo.git"

is enough to trigger it. Possibly it is slightly more realistic to
actually use the remote whose refspecs we are configuring:

  nongit git \
         -c remote.origin.url="$HTTPD_URL/smart/repo.git" \
	 -c remote.origin.fetch=anything \
	 ls-remote origin

but as the bug exists now, either is sufficient to trigger it. You could
also add a negative refspec if you want to test that half of the change.

-Peff

  parent reply	other threads:[~2026-03-22  3:53 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-21 19:11 remote-curl: segfault parsing remote.<name>.fetch outside a repository Jo Liss
2026-03-21 19:46 ` [PATCH] remote-curl: set fallback hash algorithm outside repo K Jayatheerth
2026-03-21 23:09   ` brian m. carlson
2026-03-22  2:35   ` [PATCH v2] refspec: safely parse refspecs outside a repository K Jayatheerth
2026-03-22  3:31     ` Junio C Hamano
2026-03-22  3:53     ` Jeff King [this message]
2026-03-22  5:36     ` [PATCH v3 1/2] " K Jayatheerth
2026-03-22  5:36       ` [PATCH v3 2/2] refspec: fix typo in comment K Jayatheerth
2026-03-23 22:27       ` [PATCH v3 1/2] refspec: safely parse refspecs outside a repository Junio C Hamano
2026-03-23 23:10         ` Jeff King
2026-03-23 23:39           ` Junio C Hamano
2026-03-24  1:57     ` [PATCH v4 1/2] remote-curl: fall back to default hash outside repo K Jayatheerth
2026-03-24  1:57       ` [PATCH v4 2/2] refspec: fix typo in comment K Jayatheerth
2026-03-24  4:25       ` [PATCH v4 1/2] remote-curl: fall back to default hash outside repo Junio C Hamano
2026-03-21 21:06 ` remote-curl: segfault parsing remote.<name>.fetch outside a repository Jeff King
2026-03-22  1:20   ` Junio C Hamano
2026-03-22  1:37     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260322035338.GA819989@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jayatheerthkulkarni2005@gmail.com \
    --cc=joliss42@gmail.com \
    --cc=joliss@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox