Git development
 help / color / mirror / Atom feed
* Orphan branch not well-defined?
From: Craig H Maynard @ 2023-11-22  0:28 UTC (permalink / raw)
  To: Git Community

Team,

Recently I tried creating an orphan branch in an existing repo using git-checkout and git-switch. 

Both commands have an --orphan option.

The results were different:

git checkout --orphan <newbranch> retained the entire working tree, including subdirectories and their contents.

git switch --orphan <newbranch> retained subdirectories but NOT their content.

Leaving aside the question of whether or not this is a bug, there doesn't appear to be any formal definition of the term "orphan branch" in the git documentation. Am I missing something?

Thanks,
Craig



^ permalink raw reply

* Re: Orphan branch not well-defined?
From: Junio C Hamano @ 2023-11-22  1:08 UTC (permalink / raw)
  To: Craig H Maynard; +Cc: Git Community
In-Reply-To: <FE2AD666-88DE-4F70-8D6D-3A426689EB41@me.com>

Craig H Maynard <chmaynard@me.com> writes:

> Recently I tried creating an orphan branch in an existing repo using git-checkout and git-switch. 
>
> Both commands have an --orphan option.
>
> The results were different:

This is one of the very much deliberate differences between "switch"
and "checkout" (there are others).

The "switch" command was introduced as an experiment to figure out a
better UI choices for one half of "checkout", which deals with
checking out a branch to a working tree (the other half being
"restore", which is about checking out files out of a tree-ish).
The initial round of "switch" proposed to go with the identical
semantics as "checkout" for the "--orphan" option [*1*], but during
review discussion, a concensus was reached that a better behaviour
for creating an entirely new history may be to start from void
[*2*], and that is what is in the experimental command you see.


[Reference]

*1* https://lore.kernel.org/git/20190130094831.10420-9-pclouds@gmail.com/

*2* https://lore.kernel.org/git/CABPp-BF3_p3+fmQcWYEu2z3J4FfPmDmiMyFiBRXyz8TxKLL7jA@mail.gmail.com/


^ permalink raw reply

* Re: Orphan branch not well-defined?
From: Chris Torek @ 2023-11-22  1:42 UTC (permalink / raw)
  To: Craig H Maynard; +Cc: Git Community
In-Reply-To: <FE2AD666-88DE-4F70-8D6D-3A426689EB41@me.com>

On Tue, Nov 21, 2023 at 4:36 PM Craig H Maynard <chmaynard@me.com> wrote:
> [git checkout and git switch treat --orphan differently]
>
> Leaving aside the question of whether or not this is a bug,

Just to answer the implied question: this is intentional.

> there doesn't appear to be any formal definition of the term "orphan branch"
> in the git documentation. Am I missing something?

Whether it's documented anywhere or not, it's not done well. This is
not surprising: It is hard to do it well!  Git uses two phrases for
this: "orphan branch" and "unborn branch". To understand them
properly, let's start at the real beginning.  Bear with me for a
moment here.

In Git, the identity of a commit -- the way that Git locates the
commit internally -- is its hash ID.  (Aside: until the SHA-256
conversion, therewas only ever one hash ID for any commit ever made
anywhere.  Now that Git supports both SHA-1 and SHA-256, there are two
possible IDs, depending on which scheme you're using.)  It's possible,
at least in theory, to use Git without ever creating a branch name:
all you have to do is memorize these random-looking hash IDs.  But
that's not how people's brains work, and it's quite impractical.  So
Git offers us branch names, like "main" or "master", "dev" or
"develop", and so on.

In Git, a branch name is just a human-readable name for one of Git's
internal hash IDs, with a special and very useful property that
distinguishes it from a tag name.  Each tag name is a human-readable
name for a hash ID too; they just lack the special property of the
branch names.  We won't get into all the properties here though, and
for the moment, we just need to know that the name stands in as a
memorable version of the ID.

As a result, a Git branch name literally cannot exist unless it
identifies one specific commit!  We call that one specific commit the
"tip commit" of that branch (which introduces a whole new confusion,
of whether a "branch" is *one commit* or *many commits*, but again we
won't get into this here).

This leaves us with a big "chicken or egg" problem
(https://en.wikipedia.org/wiki/Chicken_or_the_egg).  Suppose we've
just created a new, empty repository, which by definition has no
commits in it: it's *new*, and *empty*.  How many branch names can we
have in this new, empty repository?  We've just claimed that a branch
name must identify some specific commit, and we have no commits, so
the answer is: none.  We cannot have any branch names at all.

But -- here's the other paradox -- whenever we make a *new* commit,
it's to be *added on to the current branch*.  But we have an empty
repository, which cannot have any branch names, so how do we know what
the "current branch" even *is*?

** Unborn Branch is the better term **

Now that we understand the basic problem -- that a new repository
can't have any branches, but that we want Git to *create* a branch
when we make that very first commit -- we can see what an "orphan" or
"unborn" branch is all about.  It papers over our chicken-or-egg
problem.  We simply save the *name we want Git to create* somewhere,
then we make a new commit as usual.  When, eventually, we do make that
commit, Git says: "OK, I should add this new commit to the current
branch br1", or whatever the name is.  Git then creates the new commit
*and* creates the branch name, all as one big operation.  Now the
branch exists: it's born.

When we have a normal (not-unborn) branch and create a new commit, Git
creates the new commit as usual and then -- here's the unique property
of *branch names* that makes them so special -- *updates* the branch
name to hold the new commit's new hash ID.  Git also makes sure that
the new commit we just made links back to the commit that *was* the
tip commit of the branch, just a moment ago.  So this is how branches
"grow" as you make commits.  The *name* holds only the *last* commit
hash ID.  Each commit holds the *previous* hash ID, so that Git can
start at the end of a branch and work backwards.  The previous, or
parent, commit, has its own parent, which has another parent, all the
way back to the beginning of time.

This is also where the dual meaning of "branch" clears up somewhat: a
branch is both the tip commit *and* the whole-chain-of-commits,
starting at the tip and working backwards.  How do we know which
meaning someone means?  Sometimes it's clear from context.  Sometimes
it's not clear.  Sometimes whoever used the word isn't even aware of
the issue!

** The `--orphan` options **

That weird problematic state for a *new* repository, where no branches
can exist, yet you want to be "on" the branch you're going to create,
only exists as a problem for a new and empty repository.  But given
that Git has to solve that problem, Git can let you enter that weird
state any time.  That's what `--orphan` was originally invented for:
to go back into that state even if you have some commits.

That is, `git checkout --orphan` meant: make the current branch name
be an unborn branch, the way it is in a new and totally-empty
repository.  Then when I make my next commit, that will create a new
commit that has no parent commit.  Whether (and when and how) this is
actually useful is another question entirely, as is the reason for
switch and checkout behaving differently in terms of how they treat
the index and working tree.  But this is the heart of the option: it
means "go into the unborn branch state".

(Side note: there are other ways to solve the "new repository"
problem, and there are other ways to define "branch".  Other version
control systems sometimes use other ways.  Git's rather peculiar
definition of branch was rare, perhaps even unique, in the early days
of Git.)

Chris

^ permalink raw reply

* [commit-graph] v2.43.0 segfault with fetch.writeCommitGraph enabled when fetch
From: ryenus @ 2023-11-22  4:05 UTC (permalink / raw)
  To: Git mailing list

The issue appeared after updating to git v2.43.0, now git fetch would cause
segmentation fault when commit graph is enabled. Though I only observed this
issue in a repo with two submodules, regardless of whether the submodules are
checked out or not. Meanwhile most other repos without submodules worked fine.

> 15:57:10.660377 run-command.c:726                 child_start[2] git maintenance run --auto --no-quiet
> 15:57:10.665870 common-main.c:55                  version 2.43.0
> 15:57:10.666265 common-main.c:56                  start /opt/homebrew/opt/git/libexec/git-core/git maintenance run --auto --no-quiet
> 15:57:10.666469 repository.c:143                  worktree /path/to/repo/sub/module2
> 15:57:10.666649 git.c:464                         cmd_name maintenance (_run_dashed_/_run_git_alias_/pull/fetch/fetch/maintenance)
> 15:57:10.668232 git.c:723                         exit elapsed:0.003405 code:0
> 15:57:10.668241 trace2/tr2_tgt_normal.c:127       atexit elapsed:0.003415 code:0
> 15:57:10.668611 run-command.c:979                 child_exit[2] pid:46018 code:0 elapsed:0.008227
> 15:57:10.668635 git.c:723                         exit elapsed:1.837179 code:0
> 15:57:10.668639 trace2/tr2_tgt_normal.c:127       atexit elapsed:1.837182 code:0
> 15:57:10.669007 run-command.c:979                 child_exit[3] pid:46006 code:0 elapsed:1.843739
> 15:57:10.671522 usage.c:80                        error fetch died of signal 11
> error: fetch died of signal 11
> 15:57:10.671645 run-command.c:979                 child_exit[1] pid:45980 code:139 elapsed:5.292927
> 15:57:10.671658 git.c:723                         exit elapsed:5.337363 code:1
> 15:57:10.671663 trace2/tr2_tgt_normal.c:127       atexit elapsed:5.337368 code:1
> 15:57:10.672048 run-command.c:979                 child_exit[1] pid:45978 code:1 elapsed:5.345050
> 15:57:10.672099 git.c:819                         exit elapsed:5.355644 code:1
> 15:57:10.672105 trace2/tr2_tgt_normal.c:127       atexit elapsed:5.355649 code:1


Subsequent `git fetch` would then fail due to the left over lock file:

> 15:57:19.059375 run-command.c:726                 child_start[2] git maintenance run --auto --no-quiet
> 15:57:19.065689 common-main.c:55                  version 2.43.0
> 15:57:19.066027 common-main.c:56                  start /opt/homebrew/opt/git/libexec/git-core/git maintenance run --auto --no-quiet
> 15:57:19.066206 repository.c:143                  worktree /path/to/repo/sub/module2
> 15:57:19.066387 git.c:464                         cmd_name maintenance (_run_dashed_/_run_git_alias_/pull/fetch/fetch/maintenance)
> 15:57:19.067888 git.c:723                         exit elapsed:0.003122 code:0
> 15:57:19.067896 trace2/tr2_tgt_normal.c:127       atexit elapsed:0.003131 code:0
> 15:57:19.068239 run-command.c:979                 child_exit[2] pid:46076 code:0 elapsed:0.008852
> 15:57:19.068276 git.c:723                         exit elapsed:1.661854 code:0
> 15:57:19.068281 trace2/tr2_tgt_normal.c:127       atexit elapsed:1.661858 code:0
> 15:57:19.068714 run-command.c:979                 child_exit[3] pid:46065 code:0 elapsed:1.667321
> 15:57:19.069327 usage.c:61                        error Unable to create '/path/to/repo/.git/objects/info/commit-graphs/commit-graph-chain.lock': File exists.
>
> Another git process seems to be running in this repository, e.g.
> an editor opened by 'git commit'. Please make sure all processes
> are terminated then try again. If it still fails, a git process
> may have crashed in this repository earlier:
> remove the file manually to continue.
> fatal: Unable to create '/path/to/repo/.git/objects/info/commit-graphs/commit-graph-chain.lock': File exists.

As a workaround, I disabled fetch.writeCommitGraph to get fetch work again.

^ permalink raw reply

* Re: [PATCH v2 1/6] submodule--helper: use submodule_from_path in set-{url,branch}
From: Junio C Hamano @ 2023-11-22  7:54 UTC (permalink / raw)
  To: Jan Alexander Steffens (heftig)
  Cc: git, Shourya Shukla, Ævar Arnfjörð Bjarmason,
	Denton Liu
In-Reply-To: <20231121203413.176414-1-heftig@archlinux.org>

"Jan Alexander Steffens (heftig)" <heftig@archlinux.org> writes:

> Notes:
>     v2 changes:
>         - fixed code style
>         - replaced potentially unsafe use of `sub->path` with `path`

Hasn't the previous iteration of this topic long been merged to not
just 'next' but to 'master' and appears in a released version of Git?

We are all human, so mistakes are inevitable, but if we discover a
mistake that needs fixing after a topic hits 'next', we take it as a
sign that the particular mistake was easy to make and hard to spot,
and the fix for it deserves its own explanation.  Please make an
incremental update on top of what has already been merged with a
good explanation (explain why sub->path is "potentially unsafe" and
why using path is better, for example).

Thanks.

^ permalink raw reply

* [PATCH] git-p4: fix fast import when empty commit is first
From: Alisha Kim via GitGitGadget @ 2023-11-22  7:56 UTC (permalink / raw)
  To: git; +Cc: Alisha Kim, Alisha Kim

From: Alisha Kim <pril@pril.cc>

When executing p4 sync by specifying an excluded path, an empty commit
will be created if there is only a change in the excluded path in
revision.
If git-p4.keepEmptyCommits is turned off and an empty commit is the
first, fast-import will fail.

Signed-off-by: Alisha Kim <pril@pril.cc>
---
    git-p4: fix fast import when empty commit is first
    
    When executing p4 sync by specifying an excluded path, an empty commit
    will be created if there is only a change in the excluded path in
    revision. If git-p4.keepEmptyCommits is turned off and an empty commit
    is the first, fast-import will fail.
    
    The error log is as follows Ignoring revision 14035 as it would produce
    an empty commit. fast-import failed: warning: Not updating
    refs/heads/p4/master (new tip new commit hash does not contain parent
    commit hash) fast-import statistics: ...

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1609%2Fdaebo01%2Fgit-p4-pr-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1609/daebo01/git-p4-pr-v1
Pull-Request: https://github.com/git/git/pull/1609

 git-p4.py | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 0eb3bb4c47d..a61200e29e4 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -3466,7 +3466,7 @@ class P4Sync(Command, P4UserMap):
         if not files and not allow_empty:
             print('Ignoring revision {0} as it would produce an empty commit.'
                 .format(details['change']))
-            return
+            return False
 
         self.gitStream.write("commit %s\n" % branch)
         self.gitStream.write("mark :%s\n" % details["change"])
@@ -3533,6 +3533,8 @@ class P4Sync(Command, P4UserMap):
                     print("Tag %s does not match with change %s: file count is different."
                            % (labelDetails["label"], change))
 
+        return True
+
     def getLabels(self):
         """Build a dictionary of changelists and labels, for "detect-labels"
            option.
@@ -3876,10 +3878,12 @@ class P4Sync(Command, P4UserMap):
                             self.commit(description, filesForCommit, branch, parent)
                 else:
                     files = self.extractFilesFromCommit(description)
-                    self.commit(description, files, self.branch,
+                    isCommitted = self.commit(description, files, self.branch,
                                 self.initialParent)
                     # only needed once, to connect to the previous commit
-                    self.initialParent = ""
+                    if isCommitted:
+                        self.initialParent = ""
+
             except IOError:
                 print(self.gitError.read())
                 sys.exit(1)

base-commit: cfb8a6e9a93adbe81efca66e6110c9b4d2e57169
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH v2 1/6] submodule--helper: use submodule_from_path in set-{url,branch}
From: Jan Alexander Steffens (heftig) @ 2023-11-22  9:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Shourya Shukla, Ævar Arnfjörð Bjarmason,
	Denton Liu
In-Reply-To: <xmqq4jhedxza.fsf@gitster.g>

On Wed, 2023-11-22 at 16:54 +0900, Junio C Hamano wrote:
> "Jan Alexander Steffens (heftig)" <heftig@archlinux.org> writes:
> 
> > Notes:
> >     v2 changes:
> >         - fixed code style
> >         - replaced potentially unsafe use of `sub->path` with
> > `path`
> 
> Hasn't the previous iteration of this topic long been merged to not
> just 'next' but to 'master' and appears in a released version of Git?
> 
> We are all human, so mistakes are inevitable, but if we discover a
> mistake that needs fixing after a topic hits 'next', we take it as a
> sign that the particular mistake was easy to make and hard to spot,
> and the fix for it deserves its own explanation.  Please make an
> incremental update on top of what has already been merged with a
> good explanation (explain why sub->path is "potentially unsafe" and
> why using path is better, for example).
> 
> Thanks.

So it has. I'm sorry. I was only keeping track of the 'maint' branch.

^ permalink raw reply

* Thanks
From: Tatyana Polunin @ 2023-11-22 18:43 UTC (permalink / raw)
  To: git

Thanks!

^ permalink raw reply

* git checkout -B <branch> lets you checkout a branch that is already checked out in another worktree Inbox
From: Willem Verstraeten @ 2023-11-22 19:08 UTC (permalink / raw)
  To: git

# What did you do before the bug happened? (Steps to reproduce your issue)

Clone a repo
Create an additional worktree for that clone
Use `git checkout -B branch-of-primary-clone ...` to checkout the
branch that is already checked out in the primary clone

For example, with the pathfinder repo on GitHub:

    git clone https://github.com/servo/pathfinder.git primary
    cd primary
    git worktree add -b metal ../secondary origin/metal
    cd ../secondary
    git checkout -b main #reports a fatal error, as expected
    git checkout -f main origin/main #also reports a fatal error, as expected
    git checkout -B main origin/main # ----> this succeeds, which is
unexpected <----

# What did you expect to happen? (Expected behavior)

I expected a fatal error stating that the branch could not be checked
out since it was already checked out in my primary worktree

In `git checkout --help`, it is documented that `git checkout -B` is
the atomic equivalent of `git branch -f <branch> <commit> ; git
checkout <branch>` :

> If -B is given, <new-branch> is created if it doesn’t exist; otherwise, it is reset. This is the transactional equivalent of
>
>     $ git branch -f <branch> [<start-point>]
>     $ git checkout <branch>

# What happened instead? (Actual behavior)

The branch was checked out in the secondary worktree. If I then work
and make commits in this secondary worktree, the status of my primary
worktree gets clobbered as well.

What's different between what you expected and what actually happened?

The checkout in the secondary worktree is allowed, but it shouldn't be


[System Info]
git version:
git version 2.41.0
cpu: arm64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
feature: fsmonitor--daemon
uname: Darwin 22.6.0 Darwin Kernel Version 22.6.0: Wed Jul  5 22:21:53
PDT 2023; root:xnu-8796.141.3~6/RELEASE_ARM64_T6020 arm64
compiler info: clang: 14.0.3 (clang-1403.0.22.14.1)
libc info: no libc information available
$SHELL (typically, interactive shell): /bin/zsh


[Enabled Hooks]

^ permalink raw reply

* [PATCH 0/4] Redact unsafe URLs in the Trace2 output
From: Johannes Schindelin via GitGitGadget @ 2023-11-22 19:18 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

The Trace2 output can contain secrets when a user issues a Git command with
sensitive information in the command-line. A typical (if highly discouraged)
example is: git clone https://user:password@host.com/.

With this PR, the Trace2 output redacts passwords in such URLs by default.

This series also includes a commit to temporarily disable leak checking on
t0210,t0211 because the tests uncover other unrelated bugs in Git.

These patches were integrated into Microsoft's fork of Git, as
https://github.com/microsoft/git/pull/616, and have been cooking there ever
since.

Jeff Hostetler (3):
  trace2: fix signature of trace2_def_param() macro
  t0211: test URL redacting in PERF format
  t0212: test URL redacting in EVENT format

Johannes Schindelin (1):
  trace2: redact passwords from https:// URLs by default

 t/helper/test-trace2.c   |  55 ++++++++++++++++++
 t/t0210-trace2-normal.sh |  20 ++++++-
 t/t0211-trace2-perf.sh   |  21 ++++++-
 t/t0212-trace2-event.sh  |  40 +++++++++++++
 trace2.c                 | 120 ++++++++++++++++++++++++++++++++++++++-
 trace2.h                 |   4 +-
 6 files changed, 253 insertions(+), 7 deletions(-)


base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1616%2Fdscho%2Ftrace2-redact-credentials-in-https-urls-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1616/dscho/trace2-redact-credentials-in-https-urls-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1616
-- 
gitgitgadget

^ permalink raw reply

* [PATCH 1/4] trace2: fix signature of trace2_def_param() macro
From: Jeff Hostetler via GitGitGadget @ 2023-11-22 19:18 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Jeff Hostetler
In-Reply-To: <pull.1616.git.1700680717.gitgitgadget@gmail.com>

From: Jeff Hostetler <jeffhostetler@github.com>

Add `struct key_value_info` argument to `trace2_def_param()`.

In dc90208497 (trace2: plumb config kvi, 2023-06-28) a `kvi`
argument was added to `trace2_def_param_fl()` but the macro
was not up updated. Let's fix that.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
---
 trace2.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/trace2.h b/trace2.h
index 40d8c2e02a5..1f0669bbd2d 100644
--- a/trace2.h
+++ b/trace2.h
@@ -337,8 +337,8 @@ struct key_value_info;
 void trace2_def_param_fl(const char *file, int line, const char *param,
 			 const char *value, const struct key_value_info *kvi);
 
-#define trace2_def_param(param, value) \
-	trace2_def_param_fl(__FILE__, __LINE__, (param), (value))
+#define trace2_def_param(param, value, kvi) \
+	trace2_def_param_fl(__FILE__, __LINE__, (param), (value), (kvi))
 
 /*
  * Tell trace2 about a newly instantiated repo object and assign
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 2/4] trace2: redact passwords from https:// URLs by default
From: Johannes Schindelin via GitGitGadget @ 2023-11-22 19:18 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin
In-Reply-To: <pull.1616.git.1700680717.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

It is an unsafe practice to call something like

	git clone https://user:password@example.com/

This not only risks leaking the password "over the shoulder" or into the
readline history of the current Unix shell, it also gets logged via
Trace2 if enabled.

Let's at least avoid logging such secrets via Trace2, much like we avoid
logging secrets in `http.c`. Much like the code in `http.c` is guarded
via `GIT_TRACE_REDACT` (defaulting to `true`), we guard the new code via
`GIT_TRACE2_REDACT` (also defaulting to `true`).

The new tests added in this commit uncover leaks in `builtin/clone.c`
and `remote.c`. Therefore we need to turn off
`TEST_PASSES_SANITIZE_LEAK`. The reasons:

- We observed that `the_repository->remote_status` is not released
  properly.

- We are using `url...insteadOf` and that runs into a code path where an
  allocated URL is replaced with another URL, and the original URL is
  never released.

- `remote_states` contains plenty of `struct remote`s whose refspecs
  seem to be usually allocated by never released.

More investigation is needed here to identify the exact cause and
proper fixes for these leaks/bugs.

Co-authored-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t0210-trace2-normal.sh |  20 ++++++-
 trace2.c                 | 120 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 136 insertions(+), 4 deletions(-)

diff --git a/t/t0210-trace2-normal.sh b/t/t0210-trace2-normal.sh
index 80e76a4695e..c312657a12c 100755
--- a/t/t0210-trace2-normal.sh
+++ b/t/t0210-trace2-normal.sh
@@ -2,7 +2,7 @@
 
 test_description='test trace2 facility (normal target)'
 
-TEST_PASSES_SANITIZE_LEAK=true
+TEST_PASSES_SANITIZE_LEAK=false
 . ./test-lib.sh
 
 # Turn off any inherited trace2 settings for this test.
@@ -283,4 +283,22 @@ test_expect_success 'using global config with include' '
 	test_cmp expect actual
 '
 
+test_expect_success 'unsafe URLs are redacted by default' '
+	test_when_finished \
+		"rm -r trace.normal unredacted.normal clone clone2" &&
+
+	test_config_global \
+		"url.$(pwd).insteadOf" https://user:pwd@example.com/ &&
+	test_config_global trace2.configParams "core.*,remote.*.url" &&
+
+	GIT_TRACE2="$(pwd)/trace.normal" \
+		git clone https://user:pwd@example.com/ clone &&
+	! grep user:pwd trace.normal &&
+
+	GIT_TRACE2_REDACT=0 GIT_TRACE2="$(pwd)/unredacted.normal" \
+		git clone https://user:pwd@example.com/ clone2 &&
+	grep "start .* clone https://user:pwd@example.com" unredacted.normal &&
+	grep "remote.origin.url=https://user:pwd@example.com" unredacted.normal
+'
+
 test_done
diff --git a/trace2.c b/trace2.c
index 6dc74dff4c7..87d9a3a0361 100644
--- a/trace2.c
+++ b/trace2.c
@@ -20,6 +20,7 @@
 #include "trace2/tr2_tmr.h"
 
 static int trace2_enabled;
+static int trace2_redact = 1;
 
 static int tr2_next_child_id; /* modify under lock */
 static int tr2_next_exec_id; /* modify under lock */
@@ -227,6 +228,8 @@ void trace2_initialize_fl(const char *file, int line)
 	if (!tr2_tgt_want_builtins())
 		return;
 	trace2_enabled = 1;
+	if (!git_env_bool("GIT_TRACE2_REDACT", 1))
+		trace2_redact = 0;
 
 	tr2_sid_get();
 
@@ -247,12 +250,93 @@ int trace2_is_enabled(void)
 	return trace2_enabled;
 }
 
+/*
+ * Redacts an argument, i.e. ensures that no password in
+ * https://user:password@host/-style URLs is logged.
+ *
+ * Returns the original if nothing needed to be redacted.
+ * Returns a pointer that needs to be `free()`d otherwise.
+ */
+static const char *redact_arg(const char *arg)
+{
+	const char *p, *colon;
+	size_t at;
+
+	if (!trace2_redact ||
+	    (!skip_prefix(arg, "https://", &p) &&
+	     !skip_prefix(arg, "http://", &p)))
+		return arg;
+
+	at = strcspn(p, "@/");
+	if (p[at] != '@')
+		return arg;
+
+	colon = memchr(p, ':', at);
+	if (!colon)
+		return arg;
+
+	return xstrfmt("%.*s:<REDACTED>%s", (int)(colon - arg), arg, p + at);
+}
+
+/*
+ * Redacts arguments in an argument list.
+ *
+ * Returns the original if nothing needed to be redacted.
+ * Otherwise, returns a new array that needs to be released
+ * via `free_redacted_argv()`.
+ */
+static const char **redact_argv(const char **argv)
+{
+	int i, j;
+	const char *redacted = NULL;
+	const char **ret;
+
+	if (!trace2_redact)
+		return argv;
+
+	for (i = 0; argv[i]; i++)
+		if ((redacted = redact_arg(argv[i])) != argv[i])
+			break;
+
+	if (!argv[i])
+		return argv;
+
+	for (j = 0; argv[j]; j++)
+		; /* keep counting */
+
+	ALLOC_ARRAY(ret, j + 1);
+	ret[j] = NULL;
+
+	for (j = 0; j < i; j++)
+		ret[j] = argv[j];
+	ret[i] = redacted;
+	for (++i; argv[i]; i++) {
+		redacted = redact_arg(argv[i]);
+		ret[i] = redacted ? redacted : argv[i];
+	}
+
+	return ret;
+}
+
+static void free_redacted_argv(const char **redacted, const char **argv)
+{
+	int i;
+
+	if (redacted != argv) {
+		for (i = 0; argv[i]; i++)
+			if (redacted[i] != argv[i])
+				free((void *)redacted[i]);
+		free((void *)redacted);
+	}
+}
+
 void trace2_cmd_start_fl(const char *file, int line, const char **argv)
 {
 	struct tr2_tgt *tgt_j;
 	int j;
 	uint64_t us_now;
 	uint64_t us_elapsed_absolute;
+	const char **redacted;
 
 	if (!trace2_enabled)
 		return;
@@ -260,10 +344,14 @@ void trace2_cmd_start_fl(const char *file, int line, const char **argv)
 	us_now = getnanotime() / 1000;
 	us_elapsed_absolute = tr2tls_absolute_elapsed(us_now);
 
+	redacted = redact_argv(argv);
+
 	for_each_wanted_builtin (j, tgt_j)
 		if (tgt_j->pfn_start_fl)
 			tgt_j->pfn_start_fl(file, line, us_elapsed_absolute,
-					    argv);
+					    redacted);
+
+	free_redacted_argv(redacted, argv);
 }
 
 void trace2_cmd_exit_fl(const char *file, int line, int code)
@@ -409,6 +497,7 @@ void trace2_child_start_fl(const char *file, int line,
 	int j;
 	uint64_t us_now;
 	uint64_t us_elapsed_absolute;
+	const char **orig_argv = cmd->args.v;
 
 	if (!trace2_enabled)
 		return;
@@ -419,10 +508,24 @@ void trace2_child_start_fl(const char *file, int line,
 	cmd->trace2_child_id = tr2tls_locked_increment(&tr2_next_child_id);
 	cmd->trace2_child_us_start = us_now;
 
+	/*
+	 * The `pfn_child_start_fl` API takes a `struct child_process`
+	 * rather than a simple `argv` for the child because some
+	 * targets make use of the additional context bits/values. So
+	 * temporarily replace the original argv (inside the `strvec`)
+	 * with a possibly redacted version.
+	 */
+	cmd->args.v = redact_argv(orig_argv);
+
 	for_each_wanted_builtin (j, tgt_j)
 		if (tgt_j->pfn_child_start_fl)
 			tgt_j->pfn_child_start_fl(file, line,
 						  us_elapsed_absolute, cmd);
+
+	if (cmd->args.v != orig_argv) {
+		free_redacted_argv(cmd->args.v, orig_argv);
+		cmd->args.v = orig_argv;
+	}
 }
 
 void trace2_child_exit_fl(const char *file, int line, struct child_process *cmd,
@@ -493,6 +596,7 @@ int trace2_exec_fl(const char *file, int line, const char *exe,
 	int exec_id;
 	uint64_t us_now;
 	uint64_t us_elapsed_absolute;
+	const char **redacted;
 
 	if (!trace2_enabled)
 		return -1;
@@ -502,10 +606,14 @@ int trace2_exec_fl(const char *file, int line, const char *exe,
 
 	exec_id = tr2tls_locked_increment(&tr2_next_exec_id);
 
+	redacted = redact_argv(argv);
+
 	for_each_wanted_builtin (j, tgt_j)
 		if (tgt_j->pfn_exec_fl)
 			tgt_j->pfn_exec_fl(file, line, us_elapsed_absolute,
-					   exec_id, exe, argv);
+					   exec_id, exe, redacted);
+
+	free_redacted_argv(redacted, argv);
 
 	return exec_id;
 }
@@ -637,13 +745,19 @@ void trace2_def_param_fl(const char *file, int line, const char *param,
 {
 	struct tr2_tgt *tgt_j;
 	int j;
+	const char *redacted;
 
 	if (!trace2_enabled)
 		return;
 
+	redacted = redact_arg(value);
+
 	for_each_wanted_builtin (j, tgt_j)
 		if (tgt_j->pfn_param_fl)
-			tgt_j->pfn_param_fl(file, line, param, value, kvi);
+			tgt_j->pfn_param_fl(file, line, param, redacted, kvi);
+
+	if (redacted != value)
+		free((void *)redacted);
 }
 
 void trace2_def_repo_fl(const char *file, int line, struct repository *repo)
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 3/4] t0211: test URL redacting in PERF format
From: Jeff Hostetler via GitGitGadget @ 2023-11-22 19:18 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Jeff Hostetler
In-Reply-To: <pull.1616.git.1700680717.gitgitgadget@gmail.com>

From: Jeff Hostetler <jeffhostetler@github.com>

This transmogrifies the test case that was just added to t0210, to also
cover the `GIT_TRACE2_PERF` backend.

Just like t0211, we now have to toggle the `TEST_PASSES_SANITIZE_LEAK`
annotation.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
---
 t/t0211-trace2-perf.sh | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/t/t0211-trace2-perf.sh b/t/t0211-trace2-perf.sh
index cfba6861322..290b6eaaab1 100755
--- a/t/t0211-trace2-perf.sh
+++ b/t/t0211-trace2-perf.sh
@@ -2,7 +2,7 @@
 
 test_description='test trace2 facility (perf target)'
 
-TEST_PASSES_SANITIZE_LEAK=true
+TEST_PASSES_SANITIZE_LEAK=false
 . ./test-lib.sh
 
 # Turn off any inherited trace2 settings for this test.
@@ -268,4 +268,23 @@ test_expect_success PTHREADS 'global counter test/test2' '
 	have_counter_event "main" "counter" "test" "test2" 60 actual
 '
 
+test_expect_success 'unsafe URLs are redacted by default' '
+	test_when_finished \
+		"rm -r actual trace.perf unredacted.perf clone clone2" &&
+
+	test_config_global \
+		"url.$(pwd).insteadOf" https://user:pwd@example.com/ &&
+	test_config_global trace2.configParams "core.*,remote.*.url" &&
+
+	GIT_TRACE2_PERF="$(pwd)/trace.perf" \
+		git clone https://user:pwd@example.com/ clone &&
+	! grep user:pwd trace.perf &&
+
+	GIT_TRACE2_REDACT=0 GIT_TRACE2_PERF="$(pwd)/unredacted.perf" \
+		git clone https://user:pwd@example.com/ clone2 &&
+	perl "$TEST_DIRECTORY/t0211/scrub_perf.perl" <unredacted.perf >actual &&
+	grep "d0|main|start|.* clone https://user:pwd@example.com" actual &&
+	grep "d0|main|def_param|.*|remote.origin.url:https://user:pwd@example.com" actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 4/4] t0212: test URL redacting in EVENT format
From: Jeff Hostetler via GitGitGadget @ 2023-11-22 19:18 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Jeff Hostetler
In-Reply-To: <pull.1616.git.1700680717.gitgitgadget@gmail.com>

From: Jeff Hostetler <jeffhostetler@github.com>

In the added tests cases, skip testing the `GIT_TRACE2_REDACT=0` case
because we would need to exactly model the full JSON event stream like
we did in the preceding basic tests and I do not think it is worth it.

Furthermore, the Trace2 routines print the same content in normal, perf,
or event format, and in t0210 and t0211 we already tested the basic
functionality, so no need to repeat it here.

In this test, we use the test-helper to unit test each of the event
messages where URLs can appear and confirm that they are redacted in
each event.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
---
 t/helper/test-trace2.c  | 55 +++++++++++++++++++++++++++++++++++++++++
 t/t0212-trace2-event.sh | 40 ++++++++++++++++++++++++++++++
 2 files changed, 95 insertions(+)

diff --git a/t/helper/test-trace2.c b/t/helper/test-trace2.c
index d5ca0046c89..1adac29a575 100644
--- a/t/helper/test-trace2.c
+++ b/t/helper/test-trace2.c
@@ -412,6 +412,56 @@ static int ut_201counter(int argc, const char **argv)
 	return 0;
 }
 
+static int ut_300redact_start(int argc, const char **argv)
+{
+	if (!argc)
+		die("expect <argv...>");
+
+	trace2_cmd_start(argv);
+
+	return 0;
+}
+
+static int ut_301redact_child_start(int argc, const char **argv)
+{
+	struct child_process cmd = CHILD_PROCESS_INIT;
+	int k;
+
+	if (!argc)
+		die("expect <argv...>");
+
+	for (k = 0; argv[k]; k++)
+		strvec_push(&cmd.args, argv[k]);
+
+	trace2_child_start(&cmd);
+
+	strvec_clear(&cmd.args);
+
+	return 0;
+}
+
+static int ut_302redact_exec(int argc, const char **argv)
+{
+	if (!argc)
+		die("expect <exe> <argv...>");
+
+	trace2_exec(argv[0], &argv[1]);
+
+	return 0;
+}
+
+static int ut_303redact_def_param(int argc, const char **argv)
+{
+	struct key_value_info kvi = KVI_INIT;
+
+	if (argc < 2)
+		die("expect <key> <value>");
+
+	trace2_def_param(argv[0], argv[1], &kvi);
+
+	return 0;
+}
+
 /*
  * Usage:
  *     test-tool trace2 <ut_name_1> <ut_usage_1>
@@ -438,6 +488,11 @@ static struct unit_test ut_table[] = {
 
 	{ ut_200counter,  "200counter", "<v1> [<v2> [<v3> [...]]]" },
 	{ ut_201counter,  "201counter", "<v1> <v2> <threads>" },
+
+	{ ut_300redact_start,       "300redact_start",       "<argv...>" },
+	{ ut_301redact_child_start, "301redact_child_start", "<argv...>" },
+	{ ut_302redact_exec,        "302redact_exec",        "<exe> <argv...>" },
+	{ ut_303redact_def_param,   "303redact_def_param",   "<key> <value>" },
 };
 /* clang-format on */
 
diff --git a/t/t0212-trace2-event.sh b/t/t0212-trace2-event.sh
index 6d3374ff773..147643d5826 100755
--- a/t/t0212-trace2-event.sh
+++ b/t/t0212-trace2-event.sh
@@ -323,4 +323,44 @@ test_expect_success 'discard traces when there are too many files' '
 	head -n2 trace_target_dir/git-trace2-discard | tail -n1 | grep \"event\":\"too_many_files\"
 '
 
+# In the following "...redact..." tests, skip testing the GIT_TRACE2_REDACT=0
+# case because we would need to exactly model the full JSON event stream like
+# we did in the basic tests above and I do not think it is worth it.
+
+test_expect_success 'unsafe URLs are redacted by default in cmd_start events' '
+	test_when_finished \
+		"rm -r trace.event" &&
+
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" \
+		test-tool trace2 300redact_start git clone https://user:pwd@example.com/ clone2 &&
+	! grep user:pwd trace.event
+'
+
+test_expect_success 'unsafe URLs are redacted by default in child_start events' '
+	test_when_finished \
+		"rm -r trace.event" &&
+
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" \
+		test-tool trace2 301redact_child_start git clone https://user:pwd@example.com/ clone2 &&
+	! grep user:pwd trace.event
+'
+
+test_expect_success 'unsafe URLs are redacted by default in exec events' '
+	test_when_finished \
+		"rm -r trace.event" &&
+
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" \
+		test-tool trace2 302redact_exec git clone https://user:pwd@example.com/ clone2 &&
+	! grep user:pwd trace.event
+'
+
+test_expect_success 'unsafe URLs are redacted by default in def_param events' '
+	test_when_finished \
+		"rm -r trace.event" &&
+
+	GIT_TRACE2_EVENT="$(pwd)/trace.event" \
+		test-tool trace2 303redact_def_param url https://user:pwd@example.com/ &&
+	! grep user:pwd trace.event
+'
+
 test_done
-- 
gitgitgadget

^ permalink raw reply related

* Re: git checkout -B <branch> lets you checkout a branch that is already checked out in another worktree Inbox
From: Junio C Hamano @ 2023-11-23  1:28 UTC (permalink / raw)
  To: Willem Verstraeten; +Cc: git
In-Reply-To: <CAGX9RpFMCVLQV7RbK2u9AabusvkZD+RZNv_UD=R00cSUrjutBg@mail.gmail.com>

Willem Verstraeten <willem.verstraeten@gmail.com> writes:

>     git checkout -b main #reports a fatal error, as expected

This is expected because "main" already exists, not because "main"
is checked out elsewhere.

>     git checkout -f main origin/main #also reports a fatal error, as expected

This is expected because origin/main is taken as pathspec, and it is
a request to checkout the paths that match the pathspec out of the
named tree-ish (i.e., "main"), even when these paths have local
changes, but you do not have paths that match "origin/main".  The
failure is not because "main" is checked out elsewhere.

A slight variant of the command

    git checkout -f -b main origin/main

still fails for the same reason as the first of your examples above.

It is a tangent, but I suspect this failure may be a bit unexpected.
In this example, "-f"orce could be overriding the usual failure from
"-b" to switch to a branch that already exists, but that is what
"-B" does, and "-f -b" does not work as a synonym for "-B".

In any case, these example you marked "fail as expected" do fail as
expected, but they fail for reasons that have nothing to do with the
protection of branches that are used in other worktrees.

>     git checkout -B main origin/main # ----> this succeeds, which is
> unexpected <----

I agree this may be undesirable.

But it makes sort of sense, because "-B" is a forced form of "-b"
(i.e., it tells git: even when "-b" would fail, take necessary
measures to make it work), and we can view that it is part of
"forcing" to override the protection over branches that are used
elsewhere.

I guess we could change the behaviour so that

    git checkout -B <branch> [<start-point>]

fails when <branch> is an existing branch that is in use in another
worktree, and allow "-f" to be used to override the safety, i.e.,

    git checkout -f -B <branch> [<start-point>]

would allow the <branch> to be repointed to <start-point> (or HEAD)
even when it is used elsewhere.

Thoughts, whether they agree or disagree with what I just said, by
other experienced contributors are very much welcome, before I can
say "patches welcome" ;-).

Willem, thanks for raising the issue.




^ permalink raw reply

* Re: [PATCH] git-p4: fix fast import when empty commit is first
From: Junio C Hamano @ 2023-11-23  1:57 UTC (permalink / raw)
  To: Alisha Kim via GitGitGadget; +Cc: git, Alisha Kim
In-Reply-To: <pull.1609.git.git.1700639764041.gitgitgadget@gmail.com>

"Alisha Kim via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Alisha Kim <pril@pril.cc>
>
> When executing p4 sync by specifying an excluded path, an empty commit
> will be created if there is only a change in the excluded path in
> revision.
> If git-p4.keepEmptyCommits is turned off and an empty commit is the
> first, fast-import will fail.

The above describe under what condition a failure gets triggered,
but there is no description on what approach the proposed solution
takes.  You could teach "fast-import" to deal with an empty commit
properly, you could ignore empty commits and not produce input for
the fast-import command, you could probably turn these initial empty
commits into non-empty commits by adding dummy contents, etc.  We
want to see in our proposed log messages what solution was taken and
how the solution was designed to satisfy what requirements.  This is
to help future developers who will have to change the code that is
given by this patch, so that their updates can still adhere to what
ever design criteria you had in working on this change [*].

    Side note: Your solution might be to ignore empty commits
    despite keepEmptyCommits option is set (as I said, you did not
    describe it at all in the above, so this is a hypothetical
    example).  If the reason behind choosing that design were "I
    just do not want it to fail---I do not care if the resulting
    history coming out of fast-import is crappy (I lose the p4 CL
    descriptions for these commits, even though the user wants to
    keep them)", then future developers can safely "fix" your fix
    here by turning the initial empty commits into non-empty ones by
    adding fake contents.

> @@ -3876,10 +3878,12 @@ class P4Sync(Command, P4UserMap):
>                              self.commit(description, filesForCommit, branch, parent)
>                  else:
>                      files = self.extractFilesFromCommit(description)
> -                    self.commit(description, files, self.branch,
> +                    isCommitted = self.commit(description, files, self.branch,
>                                  self.initialParent)
>                      # only needed once, to connect to the previous commit
> -                    self.initialParent = ""
> +                    if isCommitted:
> +                        self.initialParent = ""

"is" does not sound grammatically correct.  "didCommit" (i.e., "we
made a commit"), "haveCommitted" (i.e., "we have made a commit")
might be more understandable.

>              except IOError:
>                  print(self.gitError.read())
>                  sys.exit(1)
>
> base-commit: cfb8a6e9a93adbe81efca66e6110c9b4d2e57169

^ permalink raw reply

* Re: [commit-graph] v2.43.0 segfault with fetch.writeCommitGraph enabled when fetch
From: Taylor Blau @ 2023-11-23  3:56 UTC (permalink / raw)
  To: ryenus; +Cc: Git mailing list
In-Reply-To: <CAKkAvayLrYS1GQ_-Z7kWM=k4pCnNv1Q=NvYcvT8+wqYPkePVcw@mail.gmail.com>

On Wed, Nov 22, 2023 at 12:05:02PM +0800, ryenus wrote:
> The issue appeared after updating to git v2.43.0, now git fetch would cause
> segmentation fault when commit graph is enabled. Though I only observed this
> issue in a repo with two submodules, regardless of whether the submodules are
> checked out or not. Meanwhile most other repos without submodules worked fine.
>
> > 15:57:10.660377 run-command.c:726                 child_start[2] git maintenance run --auto --no-quiet
> > 15:57:10.665870 common-main.c:55                  version 2.43.0
> > 15:57:10.666265 common-main.c:56                  start /opt/homebrew/opt/git/libexec/git-core/git maintenance run --auto --no-quiet
> > 15:57:10.666469 repository.c:143                  worktree /path/to/repo/sub/module2
> > 15:57:10.666649 git.c:464                         cmd_name maintenance (_run_dashed_/_run_git_alias_/pull/fetch/fetch/maintenance)
> > 15:57:10.668232 git.c:723                         exit elapsed:0.003405 code:0
> > 15:57:10.668241 trace2/tr2_tgt_normal.c:127       atexit elapsed:0.003415 code:0
> > 15:57:10.668611 run-command.c:979                 child_exit[2] pid:46018 code:0 elapsed:0.008227
> > 15:57:10.668635 git.c:723                         exit elapsed:1.837179 code:0
> > 15:57:10.668639 trace2/tr2_tgt_normal.c:127       atexit elapsed:1.837182 code:0
> > 15:57:10.669007 run-command.c:979                 child_exit[3] pid:46006 code:0 elapsed:1.843739
> > 15:57:10.671522 usage.c:80                        error fetch died of signal 11
> > error: fetch died of signal 11
> > 15:57:10.671645 run-command.c:979                 child_exit[1] pid:45980 code:139 elapsed:5.292927
> > 15:57:10.671658 git.c:723                         exit elapsed:5.337363 code:1
> > 15:57:10.671663 trace2/tr2_tgt_normal.c:127       atexit elapsed:5.337368 code:1
> > 15:57:10.672048 run-command.c:979                 child_exit[1] pid:45978 code:1 elapsed:5.345050
> > 15:57:10.672099 git.c:819                         exit elapsed:5.355644 code:1
> > 15:57:10.672105 trace2/tr2_tgt_normal.c:127       atexit elapsed:5.355649 code:1

I couldn't seem to find an easy reproduction for this bug. Would you
mind sharing a copy of your repository and/or a script that reproduces
this issue? Thanks.

Thanks,
Taylor

^ permalink raw reply

* Re: git checkout -B <branch> lets you checkout a branch that is already checked out in another worktree Inbox
From: Junio C Hamano @ 2023-11-23  5:58 UTC (permalink / raw)
  To: Willem Verstraeten; +Cc: git
In-Reply-To: <xmqqjzq9cl70.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> I guess we could change the behaviour so that
>
>     git checkout -B <branch> [<start-point>]
>
> fails when <branch> is an existing branch that is in use in another
> worktree, and allow "-f" to be used to override the safety, i.e.,
>
>     git checkout -f -B <branch> [<start-point>]
>
> would allow the <branch> to be repointed to <start-point> (or HEAD)
> even when it is used elsewhere.

It turns out that for some reason "-f" is not how we decided to
override this one---there is "--ignore-other-worktrees" option.

I'll attach the first step (preparatory refactoring) to this message
below, and follow it up with the second step to implement and test
the change.

--- >8 ---
From: Junio C Hamano <gitster@pobox.com>
Date: Thu, 23 Nov 2023 14:11:41 +0900
Subject: [PATCH 1/2] checkout: refactor die_if_checked_out() caller

There is a bit dense logic to make a call to "die_if_checked_out()"
while trying to check out a branch.  Extract it into a helper
function and give it a bit of comment to describe what is going on.

The most important part of the refactoring is the separation of the
guarding logic before making the call to die_if_checked_out() into
the caller specific part (e.g., the logic that decides that the
caller is trying to check out an existing branch) and the bypass due
to the "--ignore-other-worktrees" option.  The latter will be common
no matter how the current or future callers decides they need this
protection.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/checkout.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index f02434bc15..b4ab972c5a 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1516,6 +1516,26 @@ static void die_if_some_operation_in_progress(void)
 	wt_status_state_free_buffers(&state);
 }
 
+/*
+ * die if attempting to checkout an existing branch that is in use
+ * in another worktree, unless ignore-other-wortrees option is given.
+ * The check is bypassed when the branch is already the current one,
+ * as it will not make things any worse.
+ */
+static void die_if_switching_to_a_branch_in_use(struct checkout_opts *opts,
+						const char *full_ref)
+{
+	int flags;
+	char *head_ref;
+
+	if (opts->ignore_other_worktrees)
+		return;
+	head_ref = resolve_refdup("HEAD", 0, NULL, &flags);
+	if (head_ref && (!(flags & REF_ISSYMREF) || strcmp(head_ref, full_ref)))
+		die_if_checked_out(full_ref, 1);
+	free(head_ref);
+}
+
 static int checkout_branch(struct checkout_opts *opts,
 			   struct branch_info *new_branch_info)
 {
@@ -1576,15 +1596,9 @@ static int checkout_branch(struct checkout_opts *opts,
 	if (!opts->can_switch_when_in_progress)
 		die_if_some_operation_in_progress();
 
-	if (new_branch_info->path && !opts->force_detach && !opts->new_branch &&
-	    !opts->ignore_other_worktrees) {
-		int flag;
-		char *head_ref = resolve_refdup("HEAD", 0, NULL, &flag);
-		if (head_ref &&
-		    (!(flag & REF_ISSYMREF) || strcmp(head_ref, new_branch_info->path)))
-			die_if_checked_out(new_branch_info->path, 1);
-		free(head_ref);
-	}
+	/* "git checkout <branch>" */
+	if (new_branch_info->path && !opts->force_detach && !opts->new_branch)
+		die_if_switching_to_a_branch_in_use(opts, new_branch_info->path);
 
 	if (!new_branch_info->commit && opts->new_branch) {
 		struct object_id rev;
-- 
2.43.0


^ permalink raw reply related

* [PATCH 2/2] checkout: forbid "-B <branch>" from touching a branch used elsewhere
From: Junio C Hamano @ 2023-11-23  6:00 UTC (permalink / raw)
  To: Willem Verstraeten; +Cc: git
In-Reply-To: <xmqqv89tau3r.fsf@gitster.g>

"git checkout -B <branch> [<start-point>]", being a "forced" version
of "-b", switches to the <branch>, after optionally resetting its
tip to the <start-point>, even if the <branch> is in use in another
worktree, which is somewhat unexpected.

Protect the <branch> using the same logic that forbids "git checkout
<branch>" from touching a branch that is in use elsewhere.

This is a breaking change that may deserve backward compatibliity
warning in the Release Notes.  The "--ignore-other-worktrees" option
can be used as an escape hatch if the finger memory of existing
users depend on the current behaviour of "-B".

Reported-by: Willem Verstraeten <willem.verstraeten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * The documentation might also need updates, but I didn't look at.

 builtin/checkout.c      | 7 +++++++
 t/t2060-switch.sh       | 2 ++
 t/t2400-worktree-add.sh | 8 ++++++++
 3 files changed, 17 insertions(+)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index b4ab972c5a..8a8ad23e98 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -1600,6 +1600,13 @@ static int checkout_branch(struct checkout_opts *opts,
 	if (new_branch_info->path && !opts->force_detach && !opts->new_branch)
 		die_if_switching_to_a_branch_in_use(opts, new_branch_info->path);
 
+	/* "git checkout -B <branch>" */
+	if (opts->new_branch_force) {
+		char *full_ref = xstrfmt("refs/heads/%s", opts->new_branch);
+		die_if_switching_to_a_branch_in_use(opts, full_ref);
+		free(full_ref);
+	}
+
 	if (!new_branch_info->commit && opts->new_branch) {
 		struct object_id rev;
 		int flag;
diff --git a/t/t2060-switch.sh b/t/t2060-switch.sh
index e247a4735b..c91c4db936 100755
--- a/t/t2060-switch.sh
+++ b/t/t2060-switch.sh
@@ -170,8 +170,10 @@ test_expect_success 'switch back when temporarily detached and checked out elsew
 	# we test in both worktrees to ensure that works
 	# as expected with "first" and "next" worktrees
 	test_must_fail git -C wt1 switch shared &&
+	test_must_fail git -C wt1 switch -C shared &&
 	git -C wt1 switch --ignore-other-worktrees shared &&
 	test_must_fail git -C wt2 switch shared &&
+	test_must_fail git -C wt2 switch -C shared &&
 	git -C wt2 switch --ignore-other-worktrees shared
 '
 
diff --git a/t/t2400-worktree-add.sh b/t/t2400-worktree-add.sh
index df4aff7825..bbcb2d3419 100755
--- a/t/t2400-worktree-add.sh
+++ b/t/t2400-worktree-add.sh
@@ -126,6 +126,14 @@ test_expect_success 'die the same branch is already checked out' '
 	)
 '
 
+test_expect_success 'refuse to reset a branch in use elsewhere' '
+	(
+		cd here &&
+		test_must_fail git checkout -B newmain 2>actual &&
+		grep "already used by worktree at" actual
+	)
+'
+
 test_expect_success SYMLINKS 'die the same branch is already checked out (symlink)' '
 	head=$(git -C there rev-parse --git-path HEAD) &&
 	ref=$(git -C there symbolic-ref HEAD) &&
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH 1/4] trace2: fix signature of trace2_def_param() macro
From: Junio C Hamano @ 2023-11-23  6:10 UTC (permalink / raw)
  To: Jeff Hostetler via GitGitGadget; +Cc: git, Johannes Schindelin, Jeff Hostetler
In-Reply-To: <97d17c22ff310c26c3ec391c7bf870e7e5bab4f8.1700680717.git.gitgitgadget@gmail.com>

"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Jeff Hostetler <jeffhostetler@github.com>
>
> Add `struct key_value_info` argument to `trace2_def_param()`.
>
> In dc90208497 (trace2: plumb config kvi, 2023-06-28) a `kvi`
> argument was added to `trace2_def_param_fl()` but the macro
> was not up updated. Let's fix that.
>
> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
> ---
>  trace2.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/trace2.h b/trace2.h
> index 40d8c2e02a5..1f0669bbd2d 100644
> --- a/trace2.h
> +++ b/trace2.h
> @@ -337,8 +337,8 @@ struct key_value_info;
>  void trace2_def_param_fl(const char *file, int line, const char *param,
>  			 const char *value, const struct key_value_info *kvi);
>  
> -#define trace2_def_param(param, value) \
> -	trace2_def_param_fl(__FILE__, __LINE__, (param), (value))
> +#define trace2_def_param(param, value, kvi) \
> +	trace2_def_param_fl(__FILE__, __LINE__, (param), (value), (kvi))

IOW, this macro was not used back when it was updated, and nobody
used it since then?  

I briefly wondered if we are better off removing it but that does
not make sense because you are adding a new (and only) user to it.

Will queue.  Thanks.

>  
>  /*
>   * Tell trace2 about a newly instantiated repo object and assign

^ permalink raw reply

* Re: [PATCH v2] commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
From: Junio C Hamano @ 2023-11-23 11:44 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Taylor Blau, Jeff King
In-Reply-To: <f1235741cea5866e67c83aca83a760e0cdde8730.1700478031.git.ps@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> Note that this requires us to amend some tests to manually turn on the
> paranoid checks again. This is because we cause repository corruption by
> manually deleting objects which are part of the commit graph already.
> These circumstances shouldn't usually happen in repositories.
> ...
> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
> index 40265a4f66..1ca4eb5a36 100755
> --- a/t/t6022-rev-list-missing.sh
> +++ b/t/t6022-rev-list-missing.sh
> @@ -13,6 +13,11 @@ test_expect_success 'create repository and alternate directory' '
>  	test_commit 3
>  '
>  
> +# We manually corrupt the repository, which means that the commit-graph may
> +# contain references to already-deleted objects. We thus need to enable
> +# commit-graph paranoia to not returned these deleted commits from the graph.
> +export GIT_COMMIT_GRAPH_PARANOIA=true

test-lint-shell-syntax is a bit overly strict and complains against
this line, so until it is loosened, I'd suggest to do

	GIT_COMMIT_GRAPH_PARANOIA=true
	export GIT_COMMIT_GRAPH_PARANOIA

instead here.


^ permalink raw reply

* Re: git checkout -B <branch> lets you checkout a branch that is already checked out in another worktree Inbox
From: Willem Verstraeten @ 2023-11-23 12:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqjzq9cl70.fsf@gitster.g>

> >     git checkout -f main origin/main #also reports a fatal error, as expected
>
> This is expected because origin/main is taken as pathspec, and it is
> a request to checkout the paths that match the pathspec out of the
> named tree-ish (i.e., "main"), even when these paths have local
> changes, but you do not have paths that match "origin/main".  The
> failure is not because "main" is checked out elsewhere.
>

My mistake: I meant to do `git branch -f main origin/main`, as
documented for `git checkout -B main origin/main`

So, for completeness' sake, this is my revised reproduction scenario
that actually demonstrates the problem I have:

    ~/temp> git clone https://github.com/servo/pathfinder.git primary
    Cloning into 'primary'...
    ....

    ~/temp> cd primary

    ~/temp/primary> git worktree add -b metal ../secondary origin/metal
    Preparing worktree (new branch 'metal')
    branch 'metal' set up to track 'origin/metal'.
    HEAD is now at 1cdcc209 wip

    ~/temp/primary> cd ..\secondary\

    ~/temp/secondary> git checkout main
    fatal: 'main' is already used by worktree at ../primary'

    ~/temp/secondary> git branch -f main origin/main
    fatal: cannot force update the branch 'main' used by worktree at
'../primary'

    ~/temp/secondary> git checkout -B main origin/main
    Switched to and reset branch 'main'
    branch 'main' set up to track 'origin/main'.
    Your branch is up to date with 'origin/main'.

I would expect that last `git checkout -B ...` to fail with a similar
error as the `git branch -f ...` command right before that, since the
documentation for `git checkout -B <branch> <start-point>` states that
it is the atomic equivalent of `git branch -f <branch> <start-point> ;
git checkout <branch>`


> I guess we could change the behaviour so that
>
>     git checkout -B <branch> [<start-point>]
>
> fails when <branch> is an existing branch that is in use in another
> worktree, and allow "-f" to be used to override the safety, i.e.,
>
>     git checkout -f -B <branch> [<start-point>]

I would be very much in favor of that, indeed.

However, as you noted in your follow-up mail, the
--ignore-other-worktrees option would be better suited than the -f
flag.

> It turns out that for some reason "-f" is not how we decided to
> override this one---there is "--ignore-other-worktrees" option.

This means it would look like this then, if you decide to tackle this?

    ~/temp/secondary> git checkout -B metal origin/metal
    fatal: cannot force update the branch 'main' used by worktree at
'../primary'

    ~/temp/secondary> git checkout --ignore-other-worktrees -B metal
origin/metal
    Switched to and reset branch 'metal'
    branch 'metal' set up to track 'origin/metal'.
    Your branch is up to date with 'origin/metal'.

My thoughts as an experienced user, though not an experienced
contributor, admittedly :)

Kind regards,
Willem Verstraeten

On Thu, 23 Nov 2023 at 02:28, Junio C Hamano <gitster@pobox.com> wrote:
>
> Willem Verstraeten <willem.verstraeten@gmail.com> writes:
>
> >     git checkout -b main #reports a fatal error, as expected
>
> This is expected because "main" already exists, not because "main"
> is checked out elsewhere.
>
> >     git checkout -f main origin/main #also reports a fatal error, as expected
>
> This is expected because origin/main is taken as pathspec, and it is
> a request to checkout the paths that match the pathspec out of the
> named tree-ish (i.e., "main"), even when these paths have local
> changes, but you do not have paths that match "origin/main".  The
> failure is not because "main" is checked out elsewhere.
>
> A slight variant of the command
>
>     git checkout -f -b main origin/main
>
> still fails for the same reason as the first of your examples above.
>
> It is a tangent, but I suspect this failure may be a bit unexpected.
> In this example, "-f"orce could be overriding the usual failure from
> "-b" to switch to a branch that already exists, but that is what
> "-B" does, and "-f -b" does not work as a synonym for "-B".
>
> In any case, these example you marked "fail as expected" do fail as
> expected, but they fail for reasons that have nothing to do with the
> protection of branches that are used in other worktrees.
>
> >     git checkout -B main origin/main # ----> this succeeds, which is
> > unexpected <----
>
> I agree this may be undesirable.
>
> But it makes sort of sense, because "-B" is a forced form of "-b"
> (i.e., it tells git: even when "-b" would fail, take necessary
> measures to make it work), and we can view that it is part of
> "forcing" to override the protection over branches that are used
> elsewhere.
>
> I guess we could change the behaviour so that
>
>     git checkout -B <branch> [<start-point>]
>
> fails when <branch> is an existing branch that is in use in another
> worktree, and allow "-f" to be used to override the safety, i.e.,
>
>     git checkout -f -B <branch> [<start-point>]
>
> would allow the <branch> to be repointed to <start-point> (or HEAD)
> even when it is used elsewhere.
>
> Thoughts, whether they agree or disagree with what I just said, by
> other experienced contributors are very much welcome, before I can
> say "patches welcome" ;-).
>
> Willem, thanks for raising the issue.
>
>
>

^ permalink raw reply

* [BUG?] Semantics of HEAD when bare cloning non-bare repositories
From: guillaume.yziquel @ 2023-11-23 12:13 UTC (permalink / raw)
  To: git

Hi.

Observation: When git clone --bare is used on a non-bare repo, the HEAD is put in ./refs/HEAD, and conflicts with ./HEAD in the sense that they may end up decorelated over time.

Suggestion: When bare cloning on a non-bare repo, do not put the HEAD of the non-bare repo in refs/HEAD like every branch, but treat it separately: put it in ./HEAD, not ./refs/HEAD.

Context: I was setting up an alternate repository for cargo crates, which requires a git repo to hold the index. So I wrote my config.json in a non-bare git repo, and "deployed" as a bare repo by git clone --bare. This created a ./refs/HEAD. However, the cargo crates repository I deployed (ktra) also updated the HEAD of that repo whenever I published private crates on it. So HEAD and refs/HEAD were not in sync. And, in the end, these got unsynchronised and I could not perform cargo fetch or cargo build with this alternate repo.

Ratonale: HEAD is just not a normal branch, semantically. It is what you have at the tip of your fingers in the shell with non-bare git repos. Treating as "just as all other branches" may be programatically convenient, but, in the end, it bypasses the fact that it just is not quite a normal branch. As the problem I experienced highlights. Which is why I would argue in favour of a distinguished treatment of HEADs when bare cloning a non-bare repo.

Best regards,

Guillaume Yziquel.

^ permalink raw reply

* Re: [PATCH] object-name: reject too-deep recursive ancestor queries
From: Patrick Steinhardt @ 2023-11-23 13:53 UTC (permalink / raw)
  To: Taylor Blau
  Cc: git, Jeff King, Junio C Hamano,
	Carlos Andrés Ramírez Cataño
In-Reply-To: <57c0b30ddfe7c0ae78069682ff8454791e54469f.1700496801.git.me@ttaylorr.com>

[-- Attachment #1: Type: text/plain, Size: 9200 bytes --]

On Mon, Nov 20, 2023 at 11:13:45AM -0500, Taylor Blau wrote:
> When trying to resolve a revision query like "HEAD~~~~~", our call
> pattern looks something like:
> 
>   - object-name.c::get_oid_with_context()
>   - object-name.c::get_oid_1()
>   - object-name.c::get_nth_ancestor()
>   - object-name.c::get_oid_1()
>   - ...
> 
> With `get_nth_ancestor()` and `get_oid_1()` mutually recurring, popping
> one '~' off of the revision query for each round of the recursion.

One thing I noticed just now is that we have exactly the same problem
with `^`, just with a different callstack. This problem isn't yet
addressed by your patch.

I have to wonder whether we should tighten restrictions even further:
instead of manually keeping track of how deep in the stack we are, we
limit the length of revisions to at most 1MB. I would claim that this
limit is sufficiently large to never be a problem in practice. Revisions
are limited to 4kB on most platforms anyway due to the maximum path
length. And if somebody really wants to request the (1024 * 1024) + 1th
parent, they shouldn't do that by appending this many "~" or "^" chars,
but instead they should ask for "~1048577" or "^1048577".

I realize that this is much more restrictive than the current patch. But
it would be a good defensive mechanism against all kinds of weird revs,
and I am very certain that there are other ways to blow the stack or
cause out-of-bounds reads or writes here.

Patrick

> Since this recursive behavior is unbounded, having too many "~"'s
> contained in a revision query will cause us to blow the stack.
> Generating a message like this when compiled under SANITIZE=address:
> 
>     $ valgrind git rev-parse "HEAD$(perl -e "print \"~\" x 1000000000000")"
>     ==597453== Memcheck, a memory error detector
>     ==597453== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
>     ==597453== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
>     ==597453== Command: /home/ttaylorr/local/bin/git.compile diff HEAD~~~~~~~~~~~~[...]
>     ==597453==
>     AddressSanitizer:DEADLYSIGNAL
>     =================================================================
>     ==597453==ERROR: AddressSanitizer: stack-overflow on address 0x7fffdd838ff8 (pc 0x7f2726082748 bp 0x7fffdd839110 sp 0x7fffdd839000 T0)
>         #0 0x7f2726082748 in __asan::GetTLSFakeStack() ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:176
>         #1 0x7f2726082748 in GetFakeStackFast ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:193
>         #2 0x7f27260833de in OnMalloc ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:207
>         #3 0x7f27260833de in __asan_stack_malloc_1 ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:256
>         #4 0x563f9077d9d8 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1087
>         #5 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295
>         #6 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092
>         #7 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295
>         #8 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092
>         [...]
>         #247 0x563f9077e957 in get_oid_1 /home/ttaylorr/src/git/object-name.c:1295
>         #248 0x563f9077da64 in get_nth_ancestor /home/ttaylorr/src/git/object-name.c:1092
> 
>     SUMMARY: AddressSanitizer: stack-overflow ../../../../src/libsanitizer/asan/asan_fake_stack.cpp:176 in __asan::GetTLSFakeStack()
>     ==597453==ABORTING
> 
> (Note that the actual stack is much deeper. GDB reports that the bottom
> of the stack looks something like the following):
> 
>     #54866 0x0000555555c6d3bf in get_oid_with_context_1 (repo=0x5555563849a0 <the_repo>, name=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., flags=128, prefix=0x0, oid=0x7ffff5713d40, oc=0x7ffff5713d90) at object-name.c:1947
>     #54867 0x0000555555c6e2fa in get_oid_with_context (repo=0x5555563849a0 <the_repo>, str=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., flags=128, oid=0x7ffff5713d40, oc=0x7ffff5713d90) at object-name.c:2096
>     #54868 0x0000555555d8eed8 in handle_revision_arg_1 (arg_=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., revs=0x7ffff5b000d0, flags=0, revarg_opt=0) at revision.c:2174
>     #54869 0x0000555555d8f1a9 in handle_revision_arg (arg=0x7fffffff4be5 "HEAD", '~' <repeats 196 times>..., revs=0x7ffff5b000d0, flags=0, revarg_opt=0) at revision.c:2189
>     #54870 0x0000555555d97ca9 in setup_revisions (argc=2, argv=0x7fffffff4970, revs=0x7ffff5b000d0, opt=0x0) at revision.c:2932
>     #54871 0x00005555557d6a63 in cmd_diff (argc=2, argv=0x7fffffff4970, prefix=0x0) at builtin/diff.c:502
>     #54872 0x00005555557367bf in run_builtin (p=0x5555561c4c30 <commands+816>, argc=2, argv=0x7fffffff4970) at git.c:469
>     #54873 0x000055555573716b in handle_builtin (argc=2, argv=0x7fffffff4970) at git.c:723
>     #54874 0x000055555573785a in run_argv (argcp=0x7ffff56028b0, argv=0x7ffff56028e0) at git.c:787
>     #54875 0x0000555555738626 in cmd_main (argc=2, argv=0x7fffffff4970) at git.c:922
>     #54876 0x00005555559d3fdd in main (argc=3, argv=0x7fffffff4968) at common-main.c:62
> 
> Fortunately, we can impose a limit on the maximum recursion depth we're
> willing to accept when resolving queries like the above without
> significantly impeding users. This patch sets the limit at 4096, though
> we could probably increase that limit depending on the size of each
> frame.
> 
> The limit introduced here is large enough that any reasonable query
> should still run to completion, but small enough that if the frame size
> were to significantly increase, our protection would still be effective.
> 
> The change here is straightforward: each call to get_nth_ancestor()
> increases a counter, and then decrements that counter before returning.
> 
> The diff is a little noisy since there are a handful of return paths
> from `get_nth_ancestor()`, all of which need to decrement the depth
> variable.
> 
> Since this is a local-only exploit, a user would have to be tricked into
> running such a query by an adversary. Even if they were successfully
> tricked into running the malicious query, the blast radius is limited to
> a local stack overflow, which does not have meaningful paths to remote
> code execution, arbitrary memory reads, or any more grave security
> concerns.
> 
> Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  object-name.c                  | 26 ++++++++++++++++++++------
>  t/t1506-rev-parse-diagnosis.sh |  5 +++++
>  2 files changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/object-name.c b/object-name.c
> index 0bfa29dbbf..675e0a759e 100644
> --- a/object-name.c
> +++ b/object-name.c
> @@ -1080,6 +1080,9 @@ static enum get_oid_result get_parent(struct repository *r,
>  	return MISSING_OBJECT;
>  }
>  
> +static int get_nth_ancestor_max_depth = 4096;
> +static int get_nth_ancestor_curr_depth;
> +
>  static enum get_oid_result get_nth_ancestor(struct repository *r,
>  					    const char *name, int len,
>  					    struct object_id *result,
> @@ -1089,20 +1092,31 @@ static enum get_oid_result get_nth_ancestor(struct repository *r,
>  	struct commit *commit;
>  	int ret;
>  
> +	if (++get_nth_ancestor_curr_depth > get_nth_ancestor_max_depth)
> +		 return error(_("exceeded maximum ancestor depth"));
> +
>  	ret = get_oid_1(r, name, len, &oid, GET_OID_COMMITTISH);
>  	if (ret)
> -		return ret;
> +		goto done;
>  	commit = lookup_commit_reference(r, &oid);
> -	if (!commit)
> -		return MISSING_OBJECT;
> +	if (!commit) {
> +		ret = MISSING_OBJECT;
> +		goto done;
> +	}
>  
>  	while (generation--) {
> -		if (repo_parse_commit(r, commit) || !commit->parents)
> -			return MISSING_OBJECT;
> +		if (repo_parse_commit(r, commit) || !commit->parents) {
> +			ret = MISSING_OBJECT;
> +			goto done;
> +		}
>  		commit = commit->parents->item;
>  	}
>  	oidcpy(result, &commit->object.oid);
> -	return FOUND;
> +
> +	ret = FOUND;
> +done:
> +	get_nth_ancestor_curr_depth--;
> +	return ret;
>  }
>  
>  struct object *repo_peel_to_type(struct repository *r, const char *name, int namelen,
> diff --git a/t/t1506-rev-parse-diagnosis.sh b/t/t1506-rev-parse-diagnosis.sh
> index ef40511d89..b3b9f6c8c5 100755
> --- a/t/t1506-rev-parse-diagnosis.sh
> +++ b/t/t1506-rev-parse-diagnosis.sh
> @@ -244,6 +244,11 @@ test_expect_success 'reject Nth ancestor if N is too high' '
>  	test_must_fail git rev-parse HEAD~100000000000000000000000000000000
>  '
>  
> +test_expect_success 'reject too-deep recursive ancestor queries' '
> +	test_must_fail git rev-parse "HEAD$(perl -e "print \"~\" x 4097")" 2>err &&
> +	grep "error: exceeded maximum ancestor depth" err
> +'
> +
>  test_expect_success 'pathspecs with wildcards are not ambiguous' '
>  	echo "*.c" >expect &&
>  	git rev-parse "*.c" >actual &&
> 
> base-commit: cfb8a6e9a93adbe81efca66e6110c9b4d2e57169
> -- 
> 2.43.0.rc2.19.geadd45bf00

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 2/2] checkout: forbid "-B <branch>" from touching a branch used elsewhere
From: Phillip Wood @ 2023-11-23 16:33 UTC (permalink / raw)
  To: Junio C Hamano, Willem Verstraeten; +Cc: git
In-Reply-To: <xmqqpm01au0w.fsf_-_@gitster.g>

Hi Junio

On 23/11/2023 06:00, Junio C Hamano wrote:
> "git checkout -B <branch> [<start-point>]", being a "forced" version
> of "-b", switches to the <branch>, after optionally resetting its
> tip to the <start-point>, even if the <branch> is in use in another
> worktree, which is somewhat unexpected.
> 
> Protect the <branch> using the same logic that forbids "git checkout
> <branch>" from touching a branch that is in use elsewhere.
> 
> This is a breaking change that may deserve backward compatibliity
> warning in the Release Notes.  The "--ignore-other-worktrees" option
> can be used as an escape hatch if the finger memory of existing
> users depend on the current behaviour of "-B".

I think this change makes sense and I found the implementation here much 
easier to understand than a previous attempt at 
https://lore.kernel.org/git/20230120113553.24655-1-carenas@gmail.com/

> Reported-by: Willem Verstraeten <willem.verstraeten@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> 
>   * The documentation might also need updates, but I didn't look at.

This option is documented as an atomic version of

	git branch -f <branch> [<start-point>]
	git checkout <branch>

However "git branch -f <branch>" will fail if the branch is checked out 
in the current worktree whereas "git checkout -B" succeeds. I think 
allowing the checkout in that case makes sense for "git checkout -B" but 
it does mean that description is not strictly accurate. I'm not sure it 
matters that much though.

The documentation for "switch -C" is a bit lacking compared to "checkout 
-B" but that is a separate problem.

> 
>   builtin/checkout.c      | 7 +++++++
>   t/t2060-switch.sh       | 2 ++
>   t/t2400-worktree-add.sh | 8 ++++++++
>   3 files changed, 17 insertions(+)
> 
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index b4ab972c5a..8a8ad23e98 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -1600,6 +1600,13 @@ static int checkout_branch(struct checkout_opts *opts,
>   	if (new_branch_info->path && !opts->force_detach && !opts->new_branch)
>   		die_if_switching_to_a_branch_in_use(opts, new_branch_info->path);
>   
> +	/* "git checkout -B <branch>" */
> +	if (opts->new_branch_force) {
> +		char *full_ref = xstrfmt("refs/heads/%s", opts->new_branch);
> +		die_if_switching_to_a_branch_in_use(opts, full_ref);
> +		free(full_ref);

At the moment this is academic as neither of the test scripts changed by 
this patch are leak free and so I don't think we need to worry about it 
but it raises an interesting question about how we should handle memory 
leaks when dying. Leaving the leak when dying means that a test script 
that tests an expected failure will never be leak free but using 
UNLEAK() would mean we miss a leak being introduced in the successful 
case should the call to "free()" ever be removed. We could of course 
rename die_if_checked_out() to error_if_checked_out() and return an 
error instead of dying but that seems like a lot of churn just to keep 
the leak checker happy.

Best Wishes

Phillip

> +	}
> +
>   	if (!new_branch_info->commit && opts->new_branch) {
>   		struct object_id rev;
>   		int flag;
> diff --git a/t/t2060-switch.sh b/t/t2060-switch.sh
> index e247a4735b..c91c4db936 100755
> --- a/t/t2060-switch.sh
> +++ b/t/t2060-switch.sh
> @@ -170,8 +170,10 @@ test_expect_success 'switch back when temporarily detached and checked out elsew
>   	# we test in both worktrees to ensure that works
>   	# as expected with "first" and "next" worktrees
>   	test_must_fail git -C wt1 switch shared &&
> +	test_must_fail git -C wt1 switch -C shared &&
>   	git -C wt1 switch --ignore-other-worktrees shared &&
>   	test_must_fail git -C wt2 switch shared &&
> +	test_must_fail git -C wt2 switch -C shared &&
>   	git -C wt2 switch --ignore-other-worktrees shared
>   '
>   
> diff --git a/t/t2400-worktree-add.sh b/t/t2400-worktree-add.sh
> index df4aff7825..bbcb2d3419 100755
> --- a/t/t2400-worktree-add.sh
> +++ b/t/t2400-worktree-add.sh
> @@ -126,6 +126,14 @@ test_expect_success 'die the same branch is already checked out' '
>   	)
>   '
>   
> +test_expect_success 'refuse to reset a branch in use elsewhere' '
> +	(
> +		cd here &&
> +		test_must_fail git checkout -B newmain 2>actual &&
> +		grep "already used by worktree at" actual
> +	)
> +'
> +
>   test_expect_success SYMLINKS 'die the same branch is already checked out (symlink)' '
>   	head=$(git -C there rev-parse --git-path HEAD) &&
>   	ref=$(git -C there symbolic-ref HEAD) &&

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox