All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, me@ttaylorr.com,
	Derrick Stolee <derrickstolee@github.com>
Subject: Re: [PATCH] clone: ignore invalid local refs in remote
Date: Mon, 18 Apr 2022 19:30:35 +0200	[thread overview]
Message-ID: <220418.86zgki8ffb.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <pull.1214.git.1650301959803.gitgitgadget@gmail.com>


On Mon, Apr 18 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
>
> When cloning directly from a local repository, we load a list of refs
> based on scanning the $GIT_DIR/refs/ directory of the "server"
> repository. If files exist in that directory that do not parse as
> hexadecimal hashes, then the ref array used by write_remote_refs()
> ends up with some entries with null OIDs. This causes us to hit a BUG()
> statement in ref_transaction_create():
>
>   BUG: create called without valid new_oid
>
> This BUG() call used to be a die() until 033abf97f (Replace all
> die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
> was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
> valid, 2015-02-17).
>
> The original report for this bug [1] mentioned that this problem did not
> exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
> (refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
> GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
> able to successfully compile and test the Git codebase.
>
> [1] https://github.com/git-for-windows/git/issues/3781
>
> There are two approaches to consider here. One would be to remove this
> BUG() statement in favor of returning with an error. There are only two
> callers to ref_transaction_create(), so this would have a limited
> impact.
>
> The other approach would be to add special casing in 'git clone' to
> avoid this faulty input to the method.
>
> While I originally started with changing 'git clone', I decided that
> modifying ref_transaction_create() was a more complete solution. This
> prevents failing with a BUG() statement when we already have a good way
> to report an error (including a reason for that error) within the
> method. Both callers properly check the return value and die() with the
> error message, so this is an appropriate direction.
>
> The added test helps check against a regression, but does check that our
> intended error message is handled correctly.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>     clone: ignore invalid local refs in remote
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1214%2Fderrickstolee%2Frefs-bug-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1214/derrickstolee/refs-bug-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1214
>
>  refs.c                 | 6 ++++--
>  t/t5605-clone-local.sh | 9 +++++++++
>  2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 1a964505f92..f300f83e4d4 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1111,8 +1111,10 @@ int ref_transaction_create(struct ref_transaction *transaction,
>  			   unsigned int flags, const char *msg,
>  			   struct strbuf *err)
>  {
> -	if (!new_oid || is_null_oid(new_oid))
> -		BUG("create called without valid new_oid");
> +	if (!new_oid || is_null_oid(new_oid)) {
> +		strbuf_addf(err, "'%s' has a null OID", refname);
> +		return 1;
> +	}
>  	return ref_transaction_update(transaction, refname, new_oid,
>  				      null_oid(), flags, msg, err);
>  }
> diff --git a/t/t5605-clone-local.sh b/t/t5605-clone-local.sh
> index 7d63365f93a..21ab6192839 100755
> --- a/t/t5605-clone-local.sh
> +++ b/t/t5605-clone-local.sh
> @@ -141,4 +141,13 @@ test_expect_success 'cloning locally respects "-u" for fetching refs' '
>  	test_must_fail git clone --bare -u false a should_not_work.git
>  '
>  
> +test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
> +	git init corrupt &&
> +	test_commit -C corrupt one &&
> +	echo a >corrupt/.git/refs/heads/topic &&
> +
> +	test_must_fail git clone corrupt working 2>err &&
> +	grep "has a null OID" err
> +'
> +
>  test_done
>
> base-commit: 11cfe552610386954886543f5de87dcc49ad5735

If I change this test to clone from "$PWD/corrupt" instead we get as far
as dying in "not our ref" in upload-pack", after the "client" said "want
0000....".

I don't see anything wrong with this narrow fix, but I wonder how close
we are to just doing something useful here for the user. I.e. if we
can't parse_object() what we're about to "copy" to e.g. error() on that,
but carry on cloning the rest if possible.

We should probably always return a non-zero at the end, suggesting the
user omit the bad ref with a refspec. But if we did all that we could
usefully recover partially corrupt-repos in this state...

  reply	other threads:[~2022-04-18 17:33 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-18 17:12 [PATCH] clone: ignore invalid local refs in remote Derrick Stolee via GitGitGadget
2022-04-18 17:30 ` Ævar Arnfjörð Bjarmason [this message]
2022-04-20 20:53 ` Junio C Hamano
2022-04-21 13:29   ` Derrick Stolee
2022-04-21 18:00     ` Junio C Hamano
2022-04-25 13:47 ` [PATCH v2] clone: die() instead of BUG() on bad refs Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=220418.86zgki8ffb.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.