Git development
 help / color / mirror / Atom feed
* t7900 fails with recent debian systemd?
From: Jeff King @ 2023-12-06 22:31 UTC (permalink / raw)
  To: git

I noticed t7900 failing today. The failure looks like this:

  $ ./t7900-maintenance.sh -v -i -x
  [...]
  + systemd-analyze verify systemd/user/git-maintenance@hourly.service
  Unit git-maintenance@hourly.service not found.
  error: last command exited with $?=1
  not ok 36 - start and stop Linux/systemd maintenance

The problem started after upgrading my Debian unstable system to the
systemd 255~rc4-2 deb. Downgrading back to 254.5-1 makes the test pass
again.

I'm sure it's something silly with finding paths in XDG_CONFIG_HOME or
something like that. I haven't dug further, but I thought I'd post this
to save somebody else going through the same initial debugging. (And of
course any wisdom or further debugging is greatly appreciated).

-Peff

^ permalink raw reply

* Re: t7900 fails with recent debian systemd?
From: Jeff King @ 2023-12-06 22:36 UTC (permalink / raw)
  To: git
In-Reply-To: <20231206223145.GA638844@coredump.intra.peff.net>

On Wed, Dec 06, 2023 at 05:31:45PM -0500, Jeff King wrote:

> I noticed t7900 failing today. The failure looks like this:
> 
>   $ ./t7900-maintenance.sh -v -i -x
>   [...]
>   + systemd-analyze verify systemd/user/git-maintenance@hourly.service
>   Unit git-maintenance@hourly.service not found.
>   error: last command exited with $?=1
>   not ok 36 - start and stop Linux/systemd maintenance
> 
> The problem started after upgrading my Debian unstable system to the
> systemd 255~rc4-2 deb. Downgrading back to 254.5-1 makes the test pass
> again.
> 
> I'm sure it's something silly with finding paths in XDG_CONFIG_HOME or
> something like that. I haven't dug further, but I thought I'd post this
> to save somebody else going through the same initial debugging. (And of
> course any wisdom or further debugging is greatly appreciated).

After stracing, it is indeed looking for:

  trash directory.t7900-maintenance/systemd/user/git-maintenance@hourly.service

but that file doesn't exist. We installed git-maintenance@hourly.timer,
and git-maintenance@.service. Is the latter supposed to be a wildcard of
some kind? Maybe the rules changed. I don't really know anything about
systemd.

-Peff

^ permalink raw reply

* Re: t7900 fails with recent debian systemd?
From: Eric Sunshine @ 2023-12-06 22:40 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20231206223612.GA650770@coredump.intra.peff.net>

On Wed, Dec 6, 2023 at 5:36 PM Jeff King <peff@peff.net> wrote:
> After stracing, it is indeed looking for:
>
>   trash directory.t7900-maintenance/systemd/user/git-maintenance@hourly.service
>
> but that file doesn't exist. We installed git-maintenance@hourly.timer,
> and git-maintenance@.service. Is the latter supposed to be a wildcard of
> some kind? Maybe the rules changed. I don't really know anything about
> systemd.

Apparently, that's intentional. From builtin/gc.c:

    /*
     * No matter the schedule, we use the same service and can make
     * use of the templating system. When installing
     * git-maintenance@<schedule>.timer, systemd will notice that
     * git-maintenance@.service exists as a template and will use this
     * file and insert the <schedule> into the template at the
     * position of "%i".
     */
    static int systemd_timer_write_service_template(const char *exec_path)
    {
        char *local_service_name = xstrfmt(SYSTEMD_UNIT_FORMAT, "", "service");

I'm not sure why the comment is talking about "%i", though.

^ permalink raw reply

* What's the recommendation for forgetting all rerere's records?
From: Sean Allred @ 2023-12-06 22:37 UTC (permalink / raw)
  To: git

Hi all,

When outside the context of a conflict (no rebase/merge in progress),
what's the intended workflow to clear out the contents of
$GIT_DIR/rr-cache?

We have developers who'd like to discard their faulty resolutions after
completing a rebase gone awry (but not aborted). There doesn't seem to
be a recommendation in git-rerere(1) for how to deal with this
situation. (git-rerere-forget seems to only work in the context of an
active conflict -- and is documented as such.)

I'm wary of recommending `rm -rf "$(git rev-parse --git-dir)/rr-cache"`
-- it's hard to describe this as anything but hacky when the rest of the
experience is handled in porcelain.

Thanks!

--
Sean Allred

^ permalink raw reply

* Re: git switch has fatal dependency on default fetch config
From: Jeremiah Steele (Jerry) @ 2023-12-07  2:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20231206191148.GA103708@coredump.intra.peff.net>

For me, the main convenience of upstream tracking is for "git status" ahead/behind info, which requires you to have already fetched changes from the remote. For my "normal" work, I have a small set of branches set as default fetch refs. When I need to work on a branch not in that set, I fetch that specific branch. (I'm not interested in downloading and creating hundreds of branches when I only need one.)

The requirement to reverse-engineer from the *default* fetch refspec seems like an odd implementation choice and a major shortcoming for storing tracking configuration. The vast majority of the time, origin/branch will map to refs/heads/branch on the remote anyhow. When I have a command that sets up tracking configuration, I would expect to be able to specify the remote refspec somehow, rather than having a hard dependency on a configuration value for default behavior. I feel like the following sequence should work as expected (with a limited default fetch spec), but there are several things that go wrong:

% git fetch origin feature # should fetch to refs/remotes/origin/feature
% git switch feature # should find origin/feature, create a local branch with the same name, and set up tracking

Oh well.

Thanks for the response and the explanation.

> On Dec 6, 2023, at 11:11 AM, Jeff King <peff@peff.net> wrote:
> 
> On Tue, Dec 05, 2023 at 07:41:28PM -0800, Jeremiah Steele (Jerry) wrote:
> 
>> Changing the default fetch refspec for a remote breaks git switch:
>> 
>> % git branch -r
>>  origin/HEAD -> origin/master
>>  origin/feature
>>  origin/master
>> % git remote set-branches origin master
>> % git switch -c feature --track origin/feature
>> fatal: cannot set up tracking information; starting point 'origin/feature' is not a branch
>> % git remote set-branches --add origin feature
>> % git switch -c feature --track origin/feature
>> branch 'feature' set up to track 'origin/feature'.
>> Switched to a new branch 'feature'
>> 
>> It seems like I should be able to fetch a remote branch and track it
>> without having to monkey around with my default fetch config. Is there
>> a reason git switch has a hard dependency on the default remote fetch
>> refspec configuration?
> 
> I think it's required by the form of the tracking config, which is
> defined by a named remote and a remote branch, like:
> 
>  [branch "foo"]
>  remote = origin
>  merge = refs/heads/foo
> 
> You explicitly asked to track "origin/feature", which means that Git has
> to be able to turn that into config as above. It can handle two cases:
> 
>  1. It's a local branch in refs/heads/, in which case the remote is "."
>     and the "merge" field is the name of the branch.
> 
>  2. It's a ref that can be found by reversing a fetch refspec. With the
>     default remote.origin.fetch refspec of refs/heads/*:refs/remotes/origin/*,
>     we know that "refs/remotes/origin/feature" comes from "refs/heads/feature"
>     on the "origin" remote.
> 
>     But since you used "remote set-branches" to limit that refspec, we
>     can't do that same reversal. And "origin/feature", while we do
>     still have it as a ref, does not correspond to any remote ref we'd
>     fetch. So we don't know how to set up the tracking config.
> 
> The notion of "tracking" really came about as defining what "git pull"
> does. And without a remote that fetches, "git pull" does not really make
> much sense.
> 
> I'd guess that you are more interested in being able to just use
> @{upstream}, especially as the default for things like rebase, etc. And
> that could work without being able to actually fetch the ref. But those
> two things are intertwined in Git. You obviously can still base a branch
> off of "origin/feature", but you'll have to specify it manually when
> doing rebase, etc.
> 
> -Peff


^ permalink raw reply

* [PATCH 0/1] MyFirstContribution: configure shallow threads for git format-patch
From: Stan Hu @ 2023-12-07  5:46 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Stan Hu

This documents the preference for using shallow threads and how
to configure a project to use them.

Stan Hu (1):
  MyFirstContribution: configure shallow threads for git format-patch

 Documentation/MyFirstContribution.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)

-- 
2.42.0


^ permalink raw reply

* [PATCH 1/1] MyFirstContribution: configure shallow threads for git format-patch
From: Stan Hu @ 2023-12-07  5:46 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Stan Hu
In-Reply-To: <cover.1701927372.git.stanhu@gmail.com>

Most users on the Git mailing list prefers to use of shallow threads,
but this preference was not documented. This commit adds a blurb about
how to configure this globally via `git config`.

Signed-off-by: Stan Hu <stanhu@gmail.com>
---
 Documentation/MyFirstContribution.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/MyFirstContribution.txt b/Documentation/MyFirstContribution.txt
index 7cfed60c2e..3ac7455b1d 100644
--- a/Documentation/MyFirstContribution.txt
+++ b/Documentation/MyFirstContribution.txt
@@ -1004,6 +1004,17 @@ determine the right way to configure it to use your SMTP server; again, as this
 configuration can change significantly based on your system and email setup, it
 is out of scope for the context of this tutorial.
 
+[[configure-shallow-threads]]
+=== Configuring shallow threads for `git format-patch`
+
+It is common practice on the Git mailing list to use "shallow" threads,
+where every mail is a reply to the first mail of the series. You can
+configure the default threading style of `git format-patch` via:
+
+-----
+$ git config format.thread shallow
+----
+
 [[format-patch]]
 === Preparing Initial Patchset
 
-- 
2.42.0


^ permalink raw reply related

* [PATCH v2 0/2] completion: refactor and support reftables backend
From: Stan Hu @ 2023-12-07  6:06 UTC (permalink / raw)
  To: git; +Cc: Stan Hu
In-Reply-To: <20231130202404.89791-1-stanhu@gmail.com>

This patch series addresses the review feedback:

1. Renames the '__git_ref_exists' helper to '__git_pseudoref_exists'.
2. cleans up the commit messages to refer to pseudorefs instead of 'special refs'.

Stan Hu (2):
  completion: refactor existence checks for pseudorefs
  completion: support pseudoref existence checks for reftables

 contrib/completion/git-completion.bash | 43 +++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 5 deletions(-)

-- 
2.42.0


^ permalink raw reply

* [PATCH v2 1/2] completion: refactor existence checks for pseudorefs
From: Stan Hu @ 2023-12-07  6:06 UTC (permalink / raw)
  To: git; +Cc: Stan Hu
In-Reply-To: <cover.1701928891.git.stanhu@gmail.com>

In preparation for the reftable backend, this commit introduces a
'__git_pseudoref_exists' function that continues to use 'test -f' to
determine whether a given pseudoref exists in the local filesystem.

Signed-off-by: Stan Hu <stanhu@gmail.com>
---
 contrib/completion/git-completion.bash | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 13a39ebd2e..9fbdc13f9a 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -122,6 +122,15 @@ __git ()
 		${__git_dir:+--git-dir="$__git_dir"} "$@" 2>/dev/null
 }
 
+# Runs git in $__git_repo_path to determine whether a ref exists.
+# 1: The ref to search
+__git_ref_exists ()
+{
+	local ref=$1
+
+	[ -f "$__git_repo_path/$ref" ]
+}
+
 # Removes backslash escaping, single quotes and double quotes from a word,
 # stores the result in the variable $dequoted_word.
 # 1: The word to dequote.
@@ -1625,7 +1634,7 @@ __git_cherry_pick_inprogress_options=$__git_sequencer_inprogress_options
 _git_cherry_pick ()
 {
 	__git_find_repo_path
-	if [ -f "$__git_repo_path"/CHERRY_PICK_HEAD ]; then
+	if __git_ref_exists CHERRY_PICK_HEAD; then
 		__gitcomp "$__git_cherry_pick_inprogress_options"
 		return
 	fi
@@ -2067,7 +2076,7 @@ _git_log ()
 	__git_find_repo_path
 
 	local merge=""
-	if [ -f "$__git_repo_path/MERGE_HEAD" ]; then
+	if __git_ref_exists MERGE_HEAD; then
 		merge="--merge"
 	fi
 	case "$prev,$cur" in
@@ -2934,6 +2943,7 @@ _git_reset ()
 
 _git_restore ()
 {
+	__git_find_repo_path
 	case "$prev" in
 	-s)
 		__git_complete_refs
@@ -2952,7 +2962,7 @@ _git_restore ()
 		__gitcomp_builtin restore
 		;;
 	*)
-		if __git rev-parse --verify --quiet HEAD >/dev/null; then
+		if __git_ref_exists HEAD; then
 			__git_complete_index_file "--modified"
 		fi
 	esac
@@ -2963,7 +2973,7 @@ __git_revert_inprogress_options=$__git_sequencer_inprogress_options
 _git_revert ()
 {
 	__git_find_repo_path
-	if [ -f "$__git_repo_path"/REVERT_HEAD ]; then
+	if __git_ref_exists REVERT_HEAD; then
 		__gitcomp "$__git_revert_inprogress_options"
 		return
 	fi
@@ -3592,7 +3602,7 @@ __gitk_main ()
 	__git_find_repo_path
 
 	local merge=""
-	if [ -f "$__git_repo_path/MERGE_HEAD" ]; then
+	if __git_ref_exists MERGE_HEAD; then
 		merge="--merge"
 	fi
 	case "$cur" in
-- 
2.42.0


^ permalink raw reply related

* [PATCH v2 2/2] completion: support pseudoref existence checks for reftables
From: Stan Hu @ 2023-12-07  6:06 UTC (permalink / raw)
  To: git; +Cc: Stan Hu
In-Reply-To: <cover.1701928891.git.stanhu@gmail.com>

In contrib/completion/git-completion.bash, there are a bunch of
instances where we read pseudorefs, such as HEAD, MERGE_HEAD,
REVERT_HEAD, and others via the filesystem. However, the upcoming
reftable refs backend won't use '.git/HEAD' at all but instead will
write an invalid refname as placeholder for backwards compatibility,
which will break the git-completion script.

Update the '__git_pseudoref_exists' function to:

1. Recognize the placeholder '.git/HEAD' written by the reftable
   backend (its content is specified in the reftable specs).
2. If reftable is in use, use 'git rev-parse' to determine whether the
    given ref exists.
3. Otherwise, continue to use 'test -f' to check for the ref's filename.

Signed-off-by: Stan Hu <stanhu@gmail.com>
---
 contrib/completion/git-completion.bash | 39 ++++++++++++++++++++------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 9fbdc13f9a..b2e407c7e7 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -122,12 +122,35 @@ __git ()
 		${__git_dir:+--git-dir="$__git_dir"} "$@" 2>/dev/null
 }
 
-# Runs git in $__git_repo_path to determine whether a ref exists.
-# 1: The ref to search
-__git_ref_exists ()
+# Helper function to read the first line of a file into a variable.
+# __git_eread requires 2 arguments, the file path and the name of the
+# variable, in that order.
+#
+# This is taken from git-prompt.sh.
+__git_eread ()
+{
+	test -r "$1" && IFS=$'\r\n' read -r "$2" <"$1"
+}
+
+# Runs git in $__git_repo_path to determine whether a pseudoref exists.
+# 1: The pseudo-ref to search
+__git_pseudoref_exists()
 {
 	local ref=$1
 
+	# If the reftable is in use, we have to shell out to 'git rev-parse'
+	# to determine whether the ref exists instead of looking directly in
+	# the filesystem to determine whether the ref exists. Otherwise, use
+	# Bash builtins since executing Git commands are expensive on some
+	# platforms.
+	if __git_eread "$__git_repo_path/HEAD" head; then
+		b="${head#ref: }"
+		if [ "$b" == "refs/heads/.invalid" ]; then
+			__git -C "$__git_repo_path" rev-parse --verify --quiet "$ref" 2>/dev/null
+			return $?
+		fi
+	fi
+
 	[ -f "$__git_repo_path/$ref" ]
 }
 
@@ -1634,7 +1657,7 @@ __git_cherry_pick_inprogress_options=$__git_sequencer_inprogress_options
 _git_cherry_pick ()
 {
 	__git_find_repo_path
-	if __git_ref_exists CHERRY_PICK_HEAD; then
+	if __git_pseudoref_exists CHERRY_PICK_HEAD; then
 		__gitcomp "$__git_cherry_pick_inprogress_options"
 		return
 	fi
@@ -2076,7 +2099,7 @@ _git_log ()
 	__git_find_repo_path
 
 	local merge=""
-	if __git_ref_exists MERGE_HEAD; then
+	if __git_pseudoref_exists MERGE_HEAD; then
 		merge="--merge"
 	fi
 	case "$prev,$cur" in
@@ -2962,7 +2985,7 @@ _git_restore ()
 		__gitcomp_builtin restore
 		;;
 	*)
-		if __git_ref_exists HEAD; then
+		if __git_pseudoref_exists HEAD; then
 			__git_complete_index_file "--modified"
 		fi
 	esac
@@ -2973,7 +2996,7 @@ __git_revert_inprogress_options=$__git_sequencer_inprogress_options
 _git_revert ()
 {
 	__git_find_repo_path
-	if __git_ref_exists REVERT_HEAD; then
+	if __git_pseudoref_exists REVERT_HEAD; then
 		__gitcomp "$__git_revert_inprogress_options"
 		return
 	fi
@@ -3602,7 +3625,7 @@ __gitk_main ()
 	__git_find_repo_path
 
 	local merge=""
-	if __git_ref_exists MERGE_HEAD; then
+	if __git_pseudoref_exists MERGE_HEAD; then
 		merge="--merge"
 	fi
 	case "$cur" in
-- 
2.42.0


^ permalink raw reply related

* Re: t7900 fails with recent debian systemd?
From: Jeff King @ 2023-12-07  6:27 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git
In-Reply-To: <CAPig+cT-vCraf2tfA3t3Rh6mLNTr0rB5mApmz0vu2MCRvssaVw@mail.gmail.com>

On Wed, Dec 06, 2023 at 05:40:59PM -0500, Eric Sunshine wrote:

> On Wed, Dec 6, 2023 at 5:36 PM Jeff King <peff@peff.net> wrote:
> > After stracing, it is indeed looking for:
> >
> >   trash directory.t7900-maintenance/systemd/user/git-maintenance@hourly.service
> >
> > but that file doesn't exist. We installed git-maintenance@hourly.timer,
> > and git-maintenance@.service. Is the latter supposed to be a wildcard of
> > some kind? Maybe the rules changed. I don't really know anything about
> > systemd.
> 
> Apparently, that's intentional. From builtin/gc.c:
> 
>     /*
>      * No matter the schedule, we use the same service and can make
>      * use of the templating system. When installing
>      * git-maintenance@<schedule>.timer, systemd will notice that
>      * git-maintenance@.service exists as a template and will use this
>      * file and insert the <schedule> into the template at the
>      * position of "%i".
>      */
>     static int systemd_timer_write_service_template(const char *exec_path)
>     {
>         char *local_service_name = xstrfmt(SYSTEMD_UNIT_FORMAT, "", "service");

OK, that makes sense, and this is indeed the way to do template files in
systemd.

> I'm not sure why the comment is talking about "%i", though.

The "%i" replacement is done on the contents of the template file
itself. So inside git-maintenance@.service, the ExecStart line specifies
that we should run "git maintenance run --schedule=%i".

This looks like a regression in systemd itself. I was able to bisect and
I left a comment there:

  https://github.com/systemd/systemd/pull/30172#issuecomment-1844699620

I don't really see a way to work around it within our test suite, short
of just skipping the "systemd-analyze verify" calls entirely. Hopefully
my analysis is right and they will fix it soon, and we can consider it
"not our problem" on this end. :)

-Peff

^ permalink raw reply

* Re: [PATCH] object-name: reject too-deep recursive ancestor queries
From: Patrick Steinhardt @ 2023-12-07  6:52 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Taylor Blau, git,
	Carlos Andrés Ramírez Cataño
In-Reply-To: <20231206194035.GB103708@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 2500 bytes --]

On Wed, Dec 06, 2023 at 02:40:35PM -0500, Jeff King wrote:
> On Fri, Nov 24, 2023 at 11:11:53AM +0100, Patrick Steinhardt wrote:
> 
> > > When we get "HEAD~~~~~~~~~^2~~~~~~" from the user, do we somehow try
> > > to create a file or a directory with that name and fail due to
> > > ENAMETOOLONG?
> > 
> > Sorry, this was a typo on my part. I didn't mean "revision", I meant
> > "reference" here. References are limited to at most 4kB on most
> > platforms due to filesystem limitations, whereas revisions currently
> > have no limits in place.
> 
> Even without filesystem limitations, references are effectively limited
> to 64kb due to the pkt-line format.
> 
> Revisions can be much longer than a reference, though. We accept
> "some_ref:some/path/in/tree", for instance[1].  I think you could argue
> that paths are likewise limited by the filesystem, though. Even on
> systems like Linux where paths can grow arbitrarily long (by descending
> and adding to the current directory), you're still limited in specifying
> a full pathname. And Git will always use the full path from the project
> root when creating worktree entries. Plus my recent tree-depth patches
> effectively limit us to 16MB in the default config.

I was only able to trigger these issues with _really_ long revisions,
like hundreds of megabytes. But that's on a glibc system, other systems
based on e.g. musl libc have a much smaller stack by default where these
limits would be hit sooner.

> So I think it might be reasonable to limit revision lengths just as a
> belt-and-suspenders against overflow attacks, etc. But I suspect that
> the limits we'd choose there might not match what we'd want for
> protection against stack exhaustion via recursion. E.g., I think 8k is
> probably the minimum I'd want for a revision ("my/4k/ref:my/4k/path").
> If one "~" character can create an expensive recursion, that might be
> too much.

Fair enough. I think combining the two approaches would be sensible as a
defense-in-depth approach.

Patrick

> So we probably need something like Taylor's patch anyway (or to switch
> to an iterative algorithm, though that might be tricky because of the
> way we parse). I agree it needs to handle "^", though.
> 
> -Peff
> 
> [1] There are other more exotic revisions, too. The most arbitrary-sized
>     that comes to mind is ":/some-string-to-match". I doubt anybody
>     would be too mad if that were limited to 8k or even 4k, though.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH] bisect: always clean on reset
From: Jeff King @ 2023-12-07  6:53 UTC (permalink / raw)
  To: Janik Haag; +Cc: git
In-Reply-To: <20231113210820.GB2028092@coredump.intra.peff.net>

On Mon, Nov 13, 2023 at 04:08:20PM -0500, Jeff King wrote:

> So really, you just want the "clean" part of "bisect reset", but not the
> "reset". We could have a separate "bisect clean" that would do what you
> want (clean state without trying to reset HEAD). But I don't think it
> would be unreasonable to "reset" to just unconditionally clean. I think
> it would probably just be a few lines in bisect_reset() to avoid the
> early return, and skip the call to "checkout" when we have no branch to
> go back to.
> 
> Maybe a good simple patch for somebody interested in getting into Git
> development?

I didn't want to forget about this, so I rolled it up into a patch.

-- >8 --
Subject: [PATCH] bisect: always clean on reset

Usually "bisect reset" cleans up any refs/bisect/ refs, along with
meta-files like .git/BISECT_LOG. But it only does so after deciding that
a bisection is active, which it does by reading BISECT_START. This is
usually fine, but it's possible to get into a confusing state if the
BISECT_START file is gone, but other cruft is left (this might be due to
a bug, or a system crash, etc).

And since "bisect reset" refuses to do anything in this state, the user
has no easy way to clean up the leftover cruft. While another "bisect
start" would clear the state, in the interim it can be annoying, as
other tools (like our bash prompt code) think we are bisecting, and
for-each-ref output may be polluted with refs/bisect/ entries.

Further adding to the confusion is that running "bisect reset $some_ref"
skips the BISECT_START check. So it never realizes that there's no
bisection active and does the cleanup anyway!

So let's just make sure we always do the cleanup, whether we looked at
BISECT_START or not. If the user doesn't give us a commit to reset to,
we'll still say "We are not bisecting" and skip the call to "git
checkout".

Reported-by: Janik Haag <janik@aq0.de>
Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/bisect.c            | 9 ++++-----
 t/t6030-bisect-porcelain.sh | 6 ++++++
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/builtin/bisect.c b/builtin/bisect.c
index 35938b05fd..c5565686bf 100644
--- a/builtin/bisect.c
+++ b/builtin/bisect.c
@@ -233,11 +233,10 @@ static int bisect_reset(const char *commit)
 	struct strbuf branch = STRBUF_INIT;
 
 	if (!commit) {
-		if (strbuf_read_file(&branch, git_path_bisect_start(), 0) < 1) {
+		if (!strbuf_read_file(&branch, git_path_bisect_start(), 0))
 			printf(_("We are not bisecting.\n"));
-			return 0;
-		}
-		strbuf_rtrim(&branch);
+		else
+			strbuf_rtrim(&branch);
 	} else {
 		struct object_id oid;
 
@@ -246,7 +245,7 @@ static int bisect_reset(const char *commit)
 		strbuf_addstr(&branch, commit);
 	}
 
-	if (!ref_exists("BISECT_HEAD")) {
+	if (branch.len && !ref_exists("BISECT_HEAD")) {
 		struct child_process cmd = CHILD_PROCESS_INIT;
 
 		cmd.git_cmd = 1;
diff --git a/t/t6030-bisect-porcelain.sh b/t/t6030-bisect-porcelain.sh
index 2a5b7d8379..7b24d1684e 100755
--- a/t/t6030-bisect-porcelain.sh
+++ b/t/t6030-bisect-porcelain.sh
@@ -170,6 +170,12 @@ test_expect_success 'bisect reset when not bisecting' '
 	cmp branch.expect branch.output
 '
 
+test_expect_success 'bisect reset cleans up even when not bisecting' '
+	echo garbage >.git/BISECT_LOG &&
+	git bisect reset &&
+	test_path_is_missing .git/BISECT_LOG
+'
+
 test_expect_success 'bisect reset removes packed refs' '
 	git bisect reset &&
 	git bisect start &&
-- 
2.43.0.664.ga12c899002


^ permalink raw reply related

* Re: [PATCH] setup: recognize bare repositories with packed-refs
From: Patrick Steinhardt @ 2023-12-07  7:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Adam Majer, git
In-Reply-To: <20231206201048.GE103708@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 5150 bytes --]

On Wed, Dec 06, 2023 at 03:10:48PM -0500, Jeff King wrote:
> On Wed, Nov 29, 2023 at 11:13:18AM +0100, Patrick Steinhardt wrote:
> 
> > As I'm currently working on the reftable backend this thought has also
> > crossed my mind. The reftable backend doesn't only create "refs/", but
> > it also creates "HEAD" with contents "ref: refs/heads/.invalid" so that
> > Git commands recognize the Git directory properly. Longer-term I would
> > really love to see us doing a better job of detecting Git repositories
> > so that we don't have to carry this legacy baggage around.
> > 
> > I can see different ways for how to do this:
> > 
> >     - Either we iterate through all known reference backends, asking
> >       each of them whether they recognize the directory as something
> >       they understand.
> > 
> >     - Or we start parsing the gitconfig of the repository so that we can
> >       learn about which reference backend to expect, and then ask that
> >       specific backend whether it thinks that the directory indeed looks
> >       like something it can handle.
> > 
> > I'd personally prefer the latter, but I'm not sure whether we really
> > want to try and parse any file that happens to be called "config".
> 
> We do eventually parse the config file to pick up repositoryFormatVersion.
> But there's sort of a chicken-and-egg here where we only do so after
> gaining some confidence that it's a repo directory. :)
> 
> I actually think the "ask each backend if it looks plausible" is
> reasonable, at least for an implementation that knows about all
> backends. Though what gives me pause is how older versions of Git will
> behave with a new-format repository that does not have a "refs"
> directory.
> 
> There are really two compatibility checks. In is_git_directory(), we
> want to say "is this a repo or not". And then later we parse the config,
> make sure the repository format is OK, and that we support all
> extensions. So right now, an older version of Git that encounters a
> reftable-formatted repo (that has a vestigial "refs/" directory) says
> "ah, that is a repo, but I don't understand it" (the latter because
> presumably the repo version/extensions in .git/config are values it
> doesn't know about). But if we get rid of "refs/", then older versions
> of Git will stop even considering it as a repo at all, and will keep
> searching up to the ceiling directory. So either:
> 
>   1. They'll find nothing, and you'll get "you're not in a git repo",
>      rather than "you're in a git repo, but I don't understand it".
>      Which is slightly worse.
> 
>   2. They'll find some _other_ containing repo. Which could be quite
>      confusing.
> 
> So forgetting at all about how we structure the code, it seems to me
> that the problem is not new code, but all of the existing code which
> looks for access("refs", X_OK).

True. The question is of course how much value there is in an old tool
to be able to discover a new repository that it wouldn't be able to read
in the first place due to it not understanding the reference format. So
I'd very much like to see that eventually, we're able to get rid of
"legacy" cruft that doesn't serve any purpose anymore.

The question is whether we can do a better job of this going forward so
that at least we don't have to pose the same question in the future.
Right now, we'll face the same problem whenever any part of the current
on-disk repository data structures changes.

I wonder whether it would make sense to introduce something like a
filesystem-level hint, e.g. in the form of a new ".is-git-repository"
file. If Git discovers that file then it assumes the directory to be a
Git repository -- and everything else is set up by parsing the config
and thus the repository's configured format.

> I dunno. Maybe that is being too paranoid about backwards compatibility.
> People will have to turn on reftable manually, at least for a while, and
> would hopefully know what they are signing up for, and that old versions
> might not work as well. And by the time a new format becomes the
> default, it's possible that those older versions would have become quite
> rare.
> 
> > Just throwing this out there, but we could use this as an excuse to
> > introduce "extensions.refFormat". If it's explicitly configured to be
> > "reffiles" then we accept repositories even if they don't have the
> > "refs/" directory or a "packed-refs" file. This would still require work
> > in alternative implementations of Git, but this work will need to happen
> > anyway when the reftable backend lands.
> > 
> > I'd personally love for this extension to be introduced before I'm
> > sending the reftable backend upstream so that we can have discussions
> > around it beforehand.
> 
> We already have an extension config option to specify that we're using
> reftable, don't we? But anything in config has the same chicken-and-egg
> problems as above, I think.

Not yet, no. I plan to submit the new "extensions.refFormat" extension
soonish though, probably next week.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 4/7] revision, rev-parse: factorize incompatibility messages about --exclude-hidden
From: Patrick Steinhardt @ 2023-12-07  7:10 UTC (permalink / raw)
  To: Taylor Blau; +Cc: René Scharfe, git
In-Reply-To: <ZXDKjdOiIdHipaKy@nand.local>

[-- Attachment #1: Type: text/plain, Size: 1328 bytes --]

On Wed, Dec 06, 2023 at 02:25:01PM -0500, Taylor Blau wrote:
> On Wed, Dec 06, 2023 at 06:07:29PM +0100, René Scharfe wrote:
> > > It's not perfect
> > > of course, but would at least ensure that we can easily convert things
> > > over time without having to duplicate the exact message everywhere.
> >
> > Maybe the simplest option would be to use a macro, e.g.
> >
> >    #define INCOMPATIBLE_OPTIONS_MESSAGE \
> >            _("options '%s' and '%s' cannot be used together")
> >
> > It could be used with both error() and die(), and the compiler would
> > still ensure that two strings are passed along with it, but I don't know
> > how to encode that requirement in the macro name somehow to make it
> > self-documenting.  Perhaps by getting the number two in there?
> 
> I think that this is a great idea. It nicely solves Patrick's concern
> that we have to duplicate this message ID everywhere, and equally solves
> yours by calling error() inline instead of having to pass down the
> option values.
> 
> I think that including a number in the macro name would be helpful here.

Does our i18n tooling know how to extract such messages defined in
macros? I have to admit I don't really know how it works under the hood.
But if it does work then this looks like a good solution to me.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH 0/7] fix segfaults with implicit-bool config
From: Jeff King @ 2023-12-07  7:10 UTC (permalink / raw)
  To: git; +Cc: Carlos Andrés Ramírez Cataño

Carlos reported to the security list a case where you can cause Git
to segfault by using an implicit bool like:

  [core]
  someVariable

when the parsing side for core.someVariable does not correctly check a
NULL "value" string. This is mostly harmless, as anybody who can feed
arbitrary config can already execute arbitrary code. There is one case
of this when parsing .gitmodules (which we don't trust), but even there
I don't think the security implications are that interesting. A
malicious repo can get "clone --recurse-submodules" to segfault, but
always with a strict NULL dereference, not any kind of controllable
pointer. See patch 5 for more details.

I audited the whole code base for instances of the problem. It was
fairly manual, so it's possible I missed a spot, but I think this should
cover everything.

The first patch has vanilla cases, and the rest are instances where I
thought it was worth calling out specific details.

  [1/7]: config: handle NULL value when parsing non-bools
  [2/7]: setup: handle NULL value when parsing extensions
  [3/7]: trace2: handle NULL values in tr2_sysenv config callback
  [4/7]: help: handle NULL value for alias.* config
  [5/7]: submodule: handle NULL value when parsing submodule.*.branch
  [6/7]: trailer: handle NULL value when parsing trailer-specific config
  [7/7]: fsck: handle NULL value when parsing message config

 builtin/blame.c        |  2 ++
 builtin/checkout.c     |  2 ++
 builtin/clone.c        |  2 ++
 builtin/log.c          |  5 ++++-
 builtin/pack-objects.c |  6 +++++-
 builtin/receive-pack.c | 11 +++++++----
 compat/mingw.c         |  2 ++
 config.c               |  8 ++++++++
 diff.c                 | 19 ++++++++++++++++---
 fetch-pack.c           | 12 ++++++++----
 fsck.c                 |  8 ++++++--
 help.c                 |  5 ++++-
 mailinfo.c             |  2 ++
 notes-utils.c          |  2 ++
 setup.c                |  2 ++
 submodule-config.c     |  4 +++-
 trace2/tr2_sysenv.c    |  2 ++
 trailer.c              |  8 ++++++++
 18 files changed, 85 insertions(+), 17 deletions(-)

^ permalink raw reply

* [PATCH 1/7] config: handle NULL value when parsing non-bools
From: Jeff King @ 2023-12-07  7:11 UTC (permalink / raw)
  To: git; +Cc: Carlos Andrés Ramírez Cataño
In-Reply-To: <20231207071030.GA1275835@coredump.intra.peff.net>

When the config parser sees an "implicit" bool like:

  [core]
  someVariable

it passes NULL to the config callback. Any callback code which expects a
string must check for NULL. This usually happens via helpers like
git_config_string(), etc, but some custom code forgets to do so and will
segfault.

These are all fairly vanilla cases where the solution is just the usual
pattern of:

  if (!value)
        return config_error_nonbool(var);

though note that in a few cases we have to split initializers like:

  int some_var = initializer();

into:

  int some_var;
  if (!value)
        return config_error_nonbool(var);
  some_var = initializer();

There are still some broken instances after this patch, which I'll
address on their own in individual patches after this one.

Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/blame.c        |  2 ++
 builtin/checkout.c     |  2 ++
 builtin/clone.c        |  2 ++
 builtin/log.c          |  5 ++++-
 builtin/pack-objects.c |  6 +++++-
 compat/mingw.c         |  2 ++
 config.c               |  8 ++++++++
 diff.c                 | 19 ++++++++++++++++---
 mailinfo.c             |  2 ++
 notes-utils.c          |  2 ++
 trailer.c              |  2 ++
 11 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 9c987d6567..2433b7da5c 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -748,6 +748,8 @@ static int git_blame_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "blame.coloring")) {
+		if (!value)
+			return config_error_nonbool(var);
 		if (!strcmp(value, "repeatedLines")) {
 			coloring_mode |= OUTPUT_COLOR_LINE;
 		} else if (!strcmp(value, "highlightRecent")) {
diff --git a/builtin/checkout.c b/builtin/checkout.c
index f02434bc15..d5c784854f 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1202,6 +1202,8 @@ static int git_checkout_config(const char *var, const char *value,
 	struct checkout_opts *opts = cb;
 
 	if (!strcmp(var, "diff.ignoresubmodules")) {
+		if (!value)
+			return config_error_nonbool(var);
 		handle_ignore_submodules_arg(&opts->diff_options, value);
 		return 0;
 	}
diff --git a/builtin/clone.c b/builtin/clone.c
index c6357af949..54d9b9976a 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -791,6 +791,8 @@ static int git_clone_config(const char *k, const char *v,
 			    const struct config_context *ctx, void *cb)
 {
 	if (!strcmp(k, "clone.defaultremotename")) {
+		if (!v)
+			return config_error_nonbool(k);
 		free(remote_name);
 		remote_name = xstrdup(v);
 	}
diff --git a/builtin/log.c b/builtin/log.c
index ba775d7b5c..3ce41c4856 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -594,8 +594,11 @@ static int git_log_config(const char *var, const char *value,
 			decoration_style = 0; /* maybe warn? */
 		return 0;
 	}
-	if (!strcmp(var, "log.diffmerges"))
+	if (!strcmp(var, "log.diffmerges")) {
+		if (!value)
+			return config_error_nonbool(var);
 		return diff_merges_config(value);
+	}
 	if (!strcmp(var, "log.showroot")) {
 		default_show_root = git_config_bool(var, value);
 		return 0;
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 89a8b5a976..62c540b4db 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3204,14 +3204,18 @@ static int git_pack_config(const char *k, const char *v,
 		return 0;
 	}
 	if (!strcmp(k, "uploadpack.blobpackfileuri")) {
-		struct configured_exclusion *ex = xmalloc(sizeof(*ex));
+		struct configured_exclusion *ex;
 		const char *oid_end, *pack_end;
 		/*
 		 * Stores the pack hash. This is not a true object ID, but is
 		 * of the same form.
 		 */
 		struct object_id pack_hash;
 
+		if (!v)
+			return config_error_nonbool(k);
+
+		ex = xmalloc(sizeof(*ex));
 		if (parse_oid_hex(v, &ex->e.oid, &oid_end) ||
 		    *oid_end != ' ' ||
 		    parse_oid_hex(oid_end + 1, &pack_hash, &pack_end) ||
diff --git a/compat/mingw.c b/compat/mingw.c
index ec5280da16..42053c1f65 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -255,6 +255,8 @@ int mingw_core_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "core.unsetenvvars")) {
+		if (!value)
+			return config_error_nonbool(var);
 		free(unset_environment_variables);
 		unset_environment_variables = xstrdup(value);
 		return 0;
diff --git a/config.c b/config.c
index b330c7adb4..18085c7e38 100644
--- a/config.c
+++ b/config.c
@@ -1386,6 +1386,8 @@ static int git_default_core_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "core.checkstat")) {
+		if (!value)
+			return config_error_nonbool(var);
 		if (!strcasecmp(value, "default"))
 			check_stat = 1;
 		else if (!strcasecmp(value, "minimal"))
@@ -1547,11 +1549,15 @@ static int git_default_core_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "core.checkroundtripencoding")) {
+		if (!value)
+			return config_error_nonbool(var);
 		check_roundtrip_encoding = xstrdup(value);
 		return 0;
 	}
 
 	if (!strcmp(var, "core.notesref")) {
+		if (!value)
+			return config_error_nonbool(var);
 		notes_ref_name = xstrdup(value);
 		return 0;
 	}
@@ -1619,6 +1625,8 @@ static int git_default_core_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "core.createobject")) {
+		if (!value)
+			return config_error_nonbool(var);
 		if (!strcmp(value, "rename"))
 			object_creation_mode = OBJECT_CREATION_USES_RENAMES;
 		else if (!strcmp(value, "link"))
diff --git a/diff.c b/diff.c
index 2c602df10a..5b213a4b44 100644
--- a/diff.c
+++ b/diff.c
@@ -372,7 +372,10 @@ int git_diff_ui_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "diff.colormovedws")) {
-		unsigned cm = parse_color_moved_ws(value);
+		unsigned cm;
+		if (!value)
+			return config_error_nonbool(var);
+		cm = parse_color_moved_ws(value);
 		if (cm & COLOR_MOVED_WS_ERROR)
 			return -1;
 		diff_color_moved_ws_default = cm;
@@ -426,10 +429,15 @@ int git_diff_ui_config(const char *var, const char *value,
 	if (!strcmp(var, "diff.orderfile"))
 		return git_config_pathname(&diff_order_file_cfg, var, value);
 
-	if (!strcmp(var, "diff.ignoresubmodules"))
+	if (!strcmp(var, "diff.ignoresubmodules")) {
+		if (!value)
+			return config_error_nonbool(var);
 		handle_ignore_submodules_arg(&default_diff_options, value);
+	}
 
 	if (!strcmp(var, "diff.submodule")) {
+		if (!value)
+			return config_error_nonbool(var);
 		if (parse_submodule_params(&default_diff_options, value))
 			warning(_("Unknown value for 'diff.submodule' config variable: '%s'"),
 				value);
@@ -473,7 +481,10 @@ int git_diff_basic_config(const char *var, const char *value,
 	}
 
 	if (!strcmp(var, "diff.wserrorhighlight")) {
-		int val = parse_ws_error_highlight(value);
+		int val;
+		if (!value)
+			return config_error_nonbool(var);
+		val = parse_ws_error_highlight(value);
 		if (val < 0)
 			return -1;
 		ws_error_highlight_default = val;
@@ -490,6 +501,8 @@ int git_diff_basic_config(const char *var, const char *value,
 
 	if (!strcmp(var, "diff.dirstat")) {
 		struct strbuf errmsg = STRBUF_INIT;
+		if (!value)
+			return config_error_nonbool(var);
 		default_diff_options.dirstat_permille = diff_dirstat_permille_default;
 		if (parse_dirstat_params(&default_diff_options, value, &errmsg))
 			warning(_("Found errors in 'diff.dirstat' config variable:\n%s"),
diff --git a/mailinfo.c b/mailinfo.c
index a07d2da16d..093bed5d8f 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1253,6 +1253,8 @@ static int git_mailinfo_config(const char *var, const char *value,
 		return 0;
 	}
 	if (!strcmp(var, "mailinfo.quotedcr")) {
+		if (!value)
+			return config_error_nonbool(var);
 		if (mailinfo_parse_quoted_cr_action(value, &mi->quoted_cr) != 0)
 			return error(_("bad action '%s' for '%s'"), value, var);
 		return 0;
diff --git a/notes-utils.c b/notes-utils.c
index 97c031c26e..01f4f5b424 100644
--- a/notes-utils.c
+++ b/notes-utils.c
@@ -112,6 +112,8 @@ static int notes_rewrite_config(const char *k, const char *v,
 		}
 		return 0;
 	} else if (!c->refs_from_env && !strcmp(k, "notes.rewriteref")) {
+		if (!v)
+			return config_error_nonbool(k);
 		/* note that a refs/ prefix is implied in the
 		 * underlying for_each_glob_ref */
 		if (starts_with(v, "refs/notes/"))
diff --git a/trailer.c b/trailer.c
index b6de5d9cb2..b0e2ec224a 100644
--- a/trailer.c
+++ b/trailer.c
@@ -507,6 +507,8 @@ static int git_trailer_default_config(const char *conf_key, const char *value,
 				warning(_("unknown value '%s' for key '%s'"),
 					value, conf_key);
 		} else if (!strcmp(trailer_item, "separators")) {
+			if (!value)
+				return config_error_nonbool(conf_key);
 			separators = xstrdup(value);
 		}
 	}
-- 
2.43.0.664.ga12c899002


^ permalink raw reply related

* [PATCH 2/7] setup: handle NULL value when parsing extensions
From: Jeff King @ 2023-12-07  7:11 UTC (permalink / raw)
  To: git; +Cc: Carlos Andrés Ramírez Cataño
In-Reply-To: <20231207071030.GA1275835@coredump.intra.peff.net>

The "partialclone" extension config records a string, and hence it is an
error to have an implicit bool like:

  [extensions]
  partialclone

in your config. We should recognize and reject this, rather than
segfaulting (which is the current behavior). Note that it's OK to use
config_error_nonbool() here, even though the return value is an enum. We
explicitly document EXTENSION_ERROR as -1 for compatibility with
error(), etc.

This is the only extension value that has this problem. Most of the
others are bools that interpret this value naturally. The exception is
extensions.objectformat, which does correctly check for NULL.

Signed-off-by: Jeff King <peff@peff.net>
---
 setup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/setup.c b/setup.c
index fc592dc6dd..50131be7cc 100644
--- a/setup.c
+++ b/setup.c
@@ -559,6 +559,8 @@ static enum extension_result handle_extension_v0(const char *var,
 			data->precious_objects = git_config_bool(var, value);
 			return EXTENSION_OK;
 		} else if (!strcmp(ext, "partialclone")) {
+			if (!value)
+				return config_error_nonbool(var);
 			data->partial_clone = xstrdup(value);
 			return EXTENSION_OK;
 		} else if (!strcmp(ext, "worktreeconfig")) {
-- 
2.43.0.664.ga12c899002


^ permalink raw reply related

* [PATCH 3/7] trace2: handle NULL values in tr2_sysenv config callback
From: Jeff King @ 2023-12-07  7:11 UTC (permalink / raw)
  To: git; +Cc: Carlos Andrés Ramírez Cataño
In-Reply-To: <20231207071030.GA1275835@coredump.intra.peff.net>

If you have config with an implicit bool like:

  [trace2]
  envvars

we'll segfault, as we unconditionally try to xstrdup() the value. We
should instead detect and complain, as a boolean value has no meaning
here. The same is true for every variable in tr2_sysenv_settings (and
this patch covers them all, as we check them in a loop).

Signed-off-by: Jeff King <peff@peff.net>
---
 trace2/tr2_sysenv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/trace2/tr2_sysenv.c b/trace2/tr2_sysenv.c
index d3ecac2772..048cdd5438 100644
--- a/trace2/tr2_sysenv.c
+++ b/trace2/tr2_sysenv.c
@@ -68,6 +68,8 @@ static int tr2_sysenv_cb(const char *key, const char *value,
 
 	for (k = 0; k < ARRAY_SIZE(tr2_sysenv_settings); k++) {
 		if (!strcmp(key, tr2_sysenv_settings[k].git_config_name)) {
+			if (!value)
+				return config_error_nonbool(key);
 			free(tr2_sysenv_settings[k].value);
 			tr2_sysenv_settings[k].value = xstrdup(value);
 			return 0;
-- 
2.43.0.664.ga12c899002


^ permalink raw reply related

* [PATCH 4/7] help: handle NULL value for alias.* config
From: Jeff King @ 2023-12-07  7:11 UTC (permalink / raw)
  To: git; +Cc: Carlos Andrés Ramírez Cataño
In-Reply-To: <20231207071030.GA1275835@coredump.intra.peff.net>

When showing all config with "git help --all", we print the list of
defined aliases. But our config callback to do so does not check for a
NULL value, meaning a config block like:

  [alias]
  foo

will cause us to segfault. We should detect and complain about this in
the usual way.

Since this command is purely informational (and we aren't trying to run
the alias), we could perhaps just generate a warning and continue. But
this sort of misconfiguration should be pretty rare, and the error
message we will produce points directly to the line of config that needs
to be fixed. So just generating the usual error should be OK.

Signed-off-by: Jeff King <peff@peff.net>
---
 help.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/help.c b/help.c
index 6d2ebfbd2a..2dbe57b413 100644
--- a/help.c
+++ b/help.c
@@ -464,8 +464,11 @@ static int get_alias(const char *var, const char *value,
 {
 	struct string_list *list = data;
 
-	if (skip_prefix(var, "alias.", &var))
+	if (skip_prefix(var, "alias.", &var)) {
+		if (!value)
+			return config_error_nonbool(var);
 		string_list_append(list, var)->util = xstrdup(value);
+	}
 
 	return 0;
 }
-- 
2.43.0.664.ga12c899002


^ permalink raw reply related

* [PATCH 5/7] submodule: handle NULL value when parsing submodule.*.branch
From: Jeff King @ 2023-12-07  7:11 UTC (permalink / raw)
  To: git; +Cc: Carlos Andrés Ramírez Cataño
In-Reply-To: <20231207071030.GA1275835@coredump.intra.peff.net>

We record the submodule branch config value as a string, so config that
uses an implicit bool like:

  [submodule "foo"]
  branch

will cause us to segfault. Note that unlike most other config-parsing
bugs of this class, this can be triggered by parsing a bogus .gitmodules
file (which we might do after cloning a malicious repository).

I don't think the security implications are important, though. It's
always a strict NULL dereference, not an out-of-bounds read or write. So
we should reliably kill the process. That may be annoying, but the
impact is limited to the attacker preventing the victim from
successfully using "git clone --recurse-submodules", etc, on the
malicious repo.

The "branch" entry is the only one with this problem; other strings like
"path" and "url" already check for NULL.

Signed-off-by: Jeff King <peff@peff.net>
---
 submodule-config.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/submodule-config.c b/submodule-config.c
index 6a48fd12f6..f4dd482abc 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -516,7 +516,9 @@ static int parse_config(const char *var, const char *value,
 			submodule->recommend_shallow =
 				git_config_bool(var, value);
 	} else if (!strcmp(item.buf, "branch")) {
-		if (!me->overwrite && submodule->branch)
+		if (!value)
+			ret = config_error_nonbool(var);
+		else if (!me->overwrite && submodule->branch)
 			warn_multiple_config(me->treeish_name, submodule->name,
 					     "branch");
 		else {
-- 
2.43.0.664.ga12c899002


^ permalink raw reply related

* [PATCH 6/7] trailer: handle NULL value when parsing trailer-specific config
From: Jeff King @ 2023-12-07  7:11 UTC (permalink / raw)
  To: git; +Cc: Carlos Andrés Ramírez Cataño
In-Reply-To: <20231207071030.GA1275835@coredump.intra.peff.net>

When parsing the "key", "command", and "cmd" trailer config, we just
make a copy of the value string.  If we see an implicit bool like:

  [trailer "foo"]
  key

we'll segfault trying to copy a NULL pointer. We can fix this with the
usual config_error_nonbool() check.

I split this out from the other vanilla cases, because at first glance
it looks like a better fix here would be to move the NULL check out of
the switch statement. But it would change the behavior of other keys
like trailer.*.ifExists, where an implicit bool is interpreted as
EXISTS_DEFAULT.

Signed-off-by: Jeff King <peff@peff.net>
---
 trailer.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/trailer.c b/trailer.c
index b0e2ec224a..e4b08ed267 100644
--- a/trailer.c
+++ b/trailer.c
@@ -553,16 +553,22 @@ static int git_trailer_config(const char *conf_key, const char *value,
 	case TRAILER_KEY:
 		if (conf->key)
 			warning(_("more than one %s"), conf_key);
+		if (!value)
+			return config_error_nonbool(conf_key);
 		conf->key = xstrdup(value);
 		break;
 	case TRAILER_COMMAND:
 		if (conf->command)
 			warning(_("more than one %s"), conf_key);
+		if (!value)
+			return config_error_nonbool(conf_key);
 		conf->command = xstrdup(value);
 		break;
 	case TRAILER_CMD:
 		if (conf->cmd)
 			warning(_("more than one %s"), conf_key);
+		if (!value)
+			return config_error_nonbool(conf_key);
 		conf->cmd = xstrdup(value);
 		break;
 	case TRAILER_WHERE:
-- 
2.43.0.664.ga12c899002


^ permalink raw reply related

* [PATCH 7/7] fsck: handle NULL value when parsing message config
From: Jeff King @ 2023-12-07  7:11 UTC (permalink / raw)
  To: git; +Cc: Carlos Andrés Ramírez Cataño
In-Reply-To: <20231207071030.GA1275835@coredump.intra.peff.net>

When parsing fsck.*, receive.fsck.*, or fetch.fsck.*, we don't check for
an implicit bool. So any of:

  [fsck]
  badTree
  [receive "fsck"]
  badTree
  [fetch "fsck"]
  badTree

will cause us to segfault. We can fix it with config_error_nonbool() in
the usual way, but we have to make a few more changes to get good error
messages. The problem is that all three spots do:

  if (skip_prefix(var, "fsck.", &var))

to match and parse the actual message id. But that means that "var" now
just says "badTree" instead of "receive.fsck.badTree", making the
resulting message confusing. We can fix that by storing the parsed
message id in its own separate variable.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/receive-pack.c | 11 +++++++----
 fetch-pack.c           | 12 ++++++++----
 fsck.c                 |  8 ++++++--
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8c4f0cb90a..ccf9738bce 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -142,6 +142,7 @@ static enum deny_action parse_deny_action(const char *var, const char *value)
 static int receive_pack_config(const char *var, const char *value,
 			       const struct config_context *ctx, void *cb)
 {
+	const char *msg_id;
 	int status = parse_hide_refs_config(var, value, "receive", &hidden_refs);
 
 	if (status)
@@ -178,12 +179,14 @@ static int receive_pack_config(const char *var, const char *value,
 		return 0;
 	}
 
-	if (skip_prefix(var, "receive.fsck.", &var)) {
-		if (is_valid_msg_type(var, value))
+	if (skip_prefix(var, "receive.fsck.", &msg_id)) {
+		if (!value)
+			return config_error_nonbool(var);
+		if (is_valid_msg_type(msg_id, value))
 			strbuf_addf(&fsck_msg_types, "%c%s=%s",
-				fsck_msg_types.len ? ',' : '=', var, value);
+				fsck_msg_types.len ? ',' : '=', msg_id, value);
 		else
-			warning("skipping unknown msg id '%s'", var);
+			warning("skipping unknown msg id '%s'", msg_id);
 		return 0;
 	}
 
diff --git a/fetch-pack.c b/fetch-pack.c
index 26999e3b65..31a72d43de 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1862,6 +1862,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 static int fetch_pack_config_cb(const char *var, const char *value,
 				const struct config_context *ctx, void *cb)
 {
+	const char *msg_id;
+
 	if (strcmp(var, "fetch.fsck.skiplist") == 0) {
 		const char *path;
 
@@ -1873,12 +1875,14 @@ static int fetch_pack_config_cb(const char *var, const char *value,
 		return 0;
 	}
 
-	if (skip_prefix(var, "fetch.fsck.", &var)) {
-		if (is_valid_msg_type(var, value))
+	if (skip_prefix(var, "fetch.fsck.", &msg_id)) {
+		if (!value)
+			return config_error_nonbool(var);
+		if (is_valid_msg_type(msg_id, value))
 			strbuf_addf(&fsck_msg_types, "%c%s=%s",
-				fsck_msg_types.len ? ',' : '=', var, value);
+				fsck_msg_types.len ? ',' : '=', msg_id, value);
 		else
-			warning("Skipping unknown msg id '%s'", var);
+			warning("Skipping unknown msg id '%s'", msg_id);
 		return 0;
 	}
 
diff --git a/fsck.c b/fsck.c
index 6a0bbc5087..b624083a13 100644
--- a/fsck.c
+++ b/fsck.c
@@ -1403,6 +1403,8 @@ int git_fsck_config(const char *var, const char *value,
 		    const struct config_context *ctx, void *cb)
 {
 	struct fsck_options *options = cb;
+	const char *msg_id;
+
 	if (strcmp(var, "fsck.skiplist") == 0) {
 		const char *path;
 		struct strbuf sb = STRBUF_INIT;
@@ -1416,8 +1418,10 @@ int git_fsck_config(const char *var, const char *value,
 		return 0;
 	}
 
-	if (skip_prefix(var, "fsck.", &var)) {
-		fsck_set_msg_type(options, var, value);
+	if (skip_prefix(var, "fsck.", &msg_id)) {
+		if (!value)
+			return config_error_nonbool(var);
+		fsck_set_msg_type(options, msg_id, value);
 		return 0;
 	}
 
-- 
2.43.0.664.ga12c899002

^ permalink raw reply related

* Re: [PATCH 4/7] builtin/clone: fix bundle URIs with mismatching object formats
From: Patrick Steinhardt @ 2023-12-07  7:22 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git
In-Reply-To: <CAOLa=ZQc=7Z3w9JAdzS23P=c=KSYZJR6gJSLOHdU-d92Y3kJ5A@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 845 bytes --]

On Wed, Dec 06, 2023 at 10:13:43PM +0100, Karthik Nayak wrote:
> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
> > The first Git step where we expect the repository to be fully initalized
> > is when we fetch bundles via bundle URIs. Funny enough, the comments
> > there also state that "the_repository must match the cloned repo", which
> > is indeed not necessarily the case for the hash algorithm right now. So
> > in practice it is the right thing to detect the remote's object format
> > before downloading bundle URIs anyway, and not doing so causes clones
> > with bundle URIS to fail when the local default object format does not
> > match the remote repository's format.
> >
> 
> Nit: s/URIS/URIs

Thanks, fixed locally. Will wait with v2 though until there are more
review comments.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 1/7] setup: extract function to create the refdb
From: Patrick Steinhardt @ 2023-12-07  7:22 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: git
In-Reply-To: <CAOLa=ZSZztJUF9nmSzGdOW0oWBRUp2sw8QyuZO_q06cNymad3Q@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2242 bytes --]

On Wed, Dec 06, 2023 at 10:10:37PM +0100, Karthik Nayak wrote:
> On Wed, Dec 6, 2023 at 1:40 PM Patrick Steinhardt <ps@pks.im> wrote:
> > +static void create_reference_database(const char *initial_branch, int quiet)
> > +{
> > +       struct strbuf err = STRBUF_INIT;
> > +       int reinit = is_reinit();
> > +
> > +       /*
> > +        * We need to create a "refs" dir in any case so that older
> > +        * versions of git can tell that this is a repository.
> > +        */
> 
> How does this work though, even if an earlier version of git can tell
> that this is a repository, it still won't be able to read the reftable
> backend. In that sense, what do we achieve here?

This is a good question, and there is related ongoing discussion about
this topic in the thread starting at [1]. There are a few benefits to
letting clients discover such repos even if they don't understand the
new reference backend format:

  - They know to stop walking up the parent-directory chain. Otherwise a
    client might end up detecting a Git repository in the parent dir.

  - The user gets a proper error message why the repository cannot be
    accessed. Instead of failing to detect the repository altogether we
    instead say that we don't understand the "extensions.refFormat"
    extension.

Maybe there are other cases I can't think of right now.

> > +       safe_create_dir(git_path("refs"), 1);
> > +       adjust_shared_perm(git_path("refs"));
> > +
> 
> Not related to your commit per se, but we ignore the return value
> here, shouldn't we die in this case?

While the end result wouldn't be quite what the user asks for, the only
negative consequence is that the repository is inaccessible to others. I
think this failure mode is comparatively benign -- if it were the other
way round and we'd over-share the repository it would more severe.

So while I don't think that dying makes much sense here, I could
certainly see us adding a warning so that the user at least knows that
something went wrong. I'd rather want to keep this out of the current
patch series, but could certainly see such a warning added in a follow
up patch series.

Patrick

[1]: <ZWcOvjGPVS_CMUAk@tanuki>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox