git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "SZEDER Gábor" <szeder.dev@gmail.com>,
	"Jeff King" <peff@peff.net>,
	git@vger.kernel.org, "H . Merijn Brand" <h.m.brand@xs4all.nl>,
	"Harald Nordgren" <haraldnordgren@gmail.com>,
	"Olga Telezhnaia" <olyatelezhnaya@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH] ref-filter: don't look for objects when outside of a repository
Date: Wed, 14 Nov 2018 13:27:25 +0100	[thread overview]
Message-ID: <20181114122725.18659-1-szeder.dev@gmail.com> (raw)
In-Reply-To: <xmqq5zytpa65.fsf@gitster-ct.c.googlers.com>

The command 'git ls-remote --sort=authordate <remote>' segfaults when
run outside of a repository, ever since the introduction of its
'--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
2018-04-09).

While in general the 'git ls-remote' command can be run outside of a
repository just fine, its '--sort=<key>' option with certain keys does
require access to the referenced objects.  This sorting is implemented
using the generic ref-filter sorting facility, which already handles
missing objects gracefully with the appropriate 'missing object
deadbeef for HEAD' message.  However, being generic means that it
checks replace refs while trying to retrieve an object, and while
doing so it accesses the 'git_replace_ref_base' variable, which has
not been initialized and is still a NULL pointer when outside of a
repository, thus causing the segfault.

Make ref-filter more careful upfront while parsing the format string,
and make it error out when encountering a format atom requiring object
access when we are not in a repository.  Also add a test to ensure
that 'git ls-remote --sort' fails gracefully when executed outside of
a repository.

Reported-by: H.Merijn Brand <h.m.brand@xs4all.nl>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---

On Tue, Sep 25, 2018 at 01:57:38PM -0700, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> 
> > However, if we go for a more informative error message, then wouldn't
> > it be better to add this condition in populate_value() before it even
> > calls get_object()?  Then we could also add the problematic format
> > specifier to the error message (I think, but didn't actually check),
> > just in case someone specified multiple sort keys.
> 
> Even though I suspect that verify_ref_format() is the logically the
> right place to do this (after all, it is about seeing if the format
> makes sense, and a format that requires an object access used
> outside a repository should trigger an verification error), doing
> that in populate_value() probably strikes the best balance, I would
> think.

We are dealing with format specifiers used for sorting here, and those
don't go through verify_ref_format().

So how about this patch instead?

I think it will catch all cases where a user would try to use a format
specifier, for any purpose, requiring object access outside of a
repository (though I don't know whether there are any other cases
besides 'git ls-remote --sort=...'; but perhaps in the future
'ls-remote' will get a '--format' option as well), and it does so
before performing a potentially expensive query to the remote.  OTOH,
it won't change the documented "missing object" error message when run
inside a repo but the necessary object is indeed missing.


 ref-filter.c         | 4 ++++
 t/t5512-ls-remote.sh | 6 ++++++
 2 files changed, 10 insertions(+)

diff --git a/ref-filter.c b/ref-filter.c
index 0c45ed9d94..a1290659af 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -534,6 +534,10 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	if (ARRAY_SIZE(valid_atom) <= i)
 		return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
 				       (int)(ep-atom), atom);
+	if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
+		return strbuf_addf_ret(err, -1,
+				       _("not a git repository, but the field '%.*s' requires access to object data"),
+				       (int)(ep-atom), atom);
 
 	/* Add it in, including the deref prefix */
 	at = used_atom_cnt;
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 91ee6841c1..32e722db2e 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -302,6 +302,12 @@ test_expect_success 'ls-remote works outside repository' '
 	nongit git ls-remote dst.git
 '
 
+test_expect_success 'ls-remote --sort fails gracefully outside repository' '
+	# Use a sort key that requires access to the referenced objects.
+	nongit test_must_fail git ls-remote --sort=authordate "$TRASH_DIRECTORY" 2>err &&
+	test_i18ngrep "^fatal: not a git repository, but the field '\''authordate'\'' requires access to object data" err
+'
+
 test_expect_success 'ls-remote patterns work with all protocol versions' '
 	git for-each-ref --format="%(objectname)	%(refname)" \
 		refs/heads/master refs/remotes/origin/master >expect &&
-- 
2.19.1.1182.gbfcc7ed3e6


  reply	other threads:[~2018-11-14 12:28 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-22 10:42 Coredump on ls-remote + --sort H.Merijn Brand
2018-09-22 12:33 ` Ævar Arnfjörð Bjarmason
2018-09-22 14:11 ` [PATCH] ref-filter: don't look for objects when outside of a repository SZEDER Gábor
2018-09-24 16:15   ` Junio C Hamano
2018-09-24 18:17   ` Jeff King
2018-09-24 21:20     ` SZEDER Gábor
2018-09-24 21:30       ` Jeff King
2018-09-25 20:57       ` Junio C Hamano
2018-11-14 12:27         ` SZEDER Gábor [this message]
2018-11-15  9:38           ` Jeff King
2018-11-15  9:43             ` Jeff King
2018-11-16  5:09               ` Junio C Hamano
2018-11-16  8:56                 ` Jeff King
2018-11-16 10:07                   ` Junio C Hamano
2018-11-16 13:16                 ` SZEDER Gábor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181114122725.18659-1-szeder.dev@gmail.com \
    --to=szeder.dev@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=h.m.brand@xs4all.nl \
    --cc=haraldnordgren@gmail.com \
    --cc=olyatelezhnaya@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).