git.vger.kernel.org archive mirror
From: "Wolfgang Müller" <wolf@oriole.systems>
To: git@vger.kernel.org
Cc: "Wolfgang Müller" <wolf@oriole.systems>
Subject: [RFC PATCH v2] builtin/shortlog: explicitly set hash algo when there is no repo
Date: Tue, 15 Oct 2024 13:48:26 +0200
Message-ID: <20241015114826.715158-1-wolf@oriole.systems>
In-Reply-To: <20241011183445.229228-1-wolf@oriole.systems>

Whilst git-shortlog(1) does not explicitly need any repository
information when run without reference to one, it still parses some of
its arguments with parse_revision_opt(), which assumes that the hash
algorithm is set. However, in c8aed5e8da (repository: stop setting SHA1
as the default object hash, 2024-05-07) we stopped setting up a default
hash algorithm and instead require commands to set it up explicitly.

Most other commands were updated accordingly, as in ab274909d4 (builtin/diff:
explicitly set hash algo when there is no repo, 2024-05-07), but
builtin/shortlog was missed, making git-shortlog(1) segfault outside of
a repository when given arguments like --author that trigger a call to
parse_revision_opt().

Fix this for now by explicitly setting the hash algorithm to SHA1. Also
add a regression test for the segfault.

Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
---
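For readers unfamiliar with the failure mode, here is a minimal standalone
sketch of it. This is not Git code: hash_algo_sketch, current_algo,
setup_fallback_algo() and clamp_abbrev() are hypothetical stand-ins for
the_hash_algo, the guard added below, and the hexsz lookup that the comment
in the patch attributes to --abbrev handling. The sketch returns -1 where
the real code would dereference a NULL pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for Git's hash algorithm descriptor. */
struct hash_algo_sketch {
	const char *name;
	size_t hexsz; /* length of a full hex object ID */
};

static const struct hash_algo_sketch sha1_sketch = { "sha1", 40 };

/* Plays the role of the_hash_algo: NULL until something sets it up. */
static const struct hash_algo_sketch *current_algo;

/*
 * Mirrors the shape of the guard in the patch: pick a fallback only
 * when running outside a repository and nothing has been set yet.
 */
static void setup_fallback_algo(int nongit)
{
	if (nongit && !current_algo)
		current_algo = &sha1_sketch;
}

/*
 * Stand-in for option parsing that clamps an abbreviation length to
 * hexsz. Returns -1 when no algorithm is set; the unguarded
 * equivalent would read through a NULL pointer at this point.
 */
static int clamp_abbrev(int requested)
{
	if (!current_algo)
		return -1;
	if (requested > (int)current_algo->hexsz)
		return (int)current_algo->hexsz;
	return requested;
}
```

Calling clamp_abbrev() before setup_fallback_algo(1) hits the unset case;
calling it afterwards clamps against the SHA1 length of 40.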
 builtin/shortlog.c  | 12 ++++++++++++
 t/t4201-shortlog.sh |  5 +++++
 2 files changed, 17 insertions(+)

diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 3ed5c46078..0fa35202ed 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -387,6 +387,18 @@ int cmd_shortlog(int argc,
 	struct rev_info rev;
 	int nongit = !startup_info->have_repository;
 
+	/*
+	 * NEEDSWORK: Later on we'll call parse_revision_opt which relies on
+	 * the hash algorithm being set but since we are operating outside of a
+	 * Git repository we cannot determine one. This is only needed because
+	 * parse_revision_opt expects hexsz for --abbrev which is irrelevant
+	 * for shortlog outside of a git repository. For now explicitly set
+	 * SHA1, but ideally the parsing machinery would be split between
+	 * git/nongit so that we do not have to do this.
+	 */
+	if (nongit && !the_hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	const struct option options[] = {
 		OPT_BIT('c', "committer", &log.groups,
 			N_("group by committer rather than author"),
diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
index c20c885724..ed39c67ba1 100755
--- a/t/t4201-shortlog.sh
+++ b/t/t4201-shortlog.sh
@@ -143,6 +143,11 @@ fuzz()
 	test_grep "too many arguments" out
 '
 
+test_expect_success 'shortlog --author from non-git directory does not segfault' '
+	git log --no-expand-tabs HEAD >log &&
+	env GIT_DIR=non-existing git shortlog --author=author <log 2>out
+'
+
 test_expect_success 'shortlog should add newline when input line matches wraplen' '
 	cat >expect <<\EOF &&
 A U Thor (2):

Range-diff against v1:
1:  42516cc02d ! 1:  d3047a0291 builtin/shortlog: explicitly set hash algo when there is no repo
    @@ Commit message
         a repository when given arguments like --author that trigger a call to
         parse_revision_opt().
     
    -    Fix this for now by explicitly setting the hash algorithm to SHA1.
    +    Fix this for now by explicitly setting the hash algorithm to SHA1. Also
    +    add a regression test for the segfault.
     
         Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
     
    @@ builtin/shortlog.c: int cmd_shortlog(int argc,
      	int nongit = !startup_info->have_repository;
      
     +	/*
    -+	 * Later on we'll call parse_revision_opt which relies on the hash
    -+	 * algorithm being set but since we are operating outside of a Git
    -+	 * repository we cannot determine one. For now default to SHA1.
    ++	 * NEEDSWORK: Later on we'll call parse_revision_opt which relies on
    ++	 * the hash algorithm being set but since we are operating outside of a
    ++	 * Git repository we cannot determine one. This is only needed because
    ++	 * parse_revision_opt expects hexsz for --abbrev which is irrelevant
    ++	 * for shortlog outside of a git repository. For now explicitly set
    ++	 * SHA1, but ideally the parsing machinery would be split between
    ++	 * git/nongit so that we do not have to do this.
     +	 */
     +	if (nongit && !the_hash_algo)
     +		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
    @@ builtin/shortlog.c: int cmd_shortlog(int argc,
      	const struct option options[] = {
      		OPT_BIT('c', "committer", &log.groups,
      			N_("group by committer rather than author"),
    +
    + ## t/t4201-shortlog.sh ##
    +@@ t/t4201-shortlog.sh: fuzz()
    + 	test_grep "too many arguments" out
    + '
    + 
    ++test_expect_success 'shortlog --author from non-git directory does not segfault' '
    ++	git log --no-expand-tabs HEAD >log &&
    ++	env GIT_DIR=non-existing git shortlog --author=author <log 2>out
    ++'
    ++
    + test_expect_success 'shortlog should add newline when input line matches wraplen' '
    + 	cat >expect <<\EOF &&
    + A U Thor (2):
-- 
2.47.0

