Git development
 help / color / mirror / Atom feed
* Usema
From: Monchi Asenov @ 2023-01-24  6:42 UTC (permalink / raw)
  To: git



Inviato da iPhone

^ permalink raw reply

* Re: [PATCH v4 5/9] rebase: add coverage of other incompatible options
From: Phillip Wood @ 2023-01-24 10:27 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, git, Derrick Stolee,
	Eric Sunshine, Martin Ågren, Phillip Wood
In-Reply-To: <CABPp-BE+wRgjmWknARQpNsdUFjNOz0ND9wgx_-_RTyK+EwJjXA@mail.gmail.com>

Hi Elijah

On 24/01/2023 02:36, Elijah Newren wrote:
> Hi Phillip,
> 
> On Mon, Jan 23, 2023 at 12:08 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>
>> Hi Elijah
>>
>> On 22/01/2023 06:12, Elijah Newren via GitGitGadget wrote:
>>> From: Elijah Newren <newren@gmail.com>
>>>
>>> The git-rebase manual noted several sets of incompatible options, but
>>> we were missing tests for a few of these.  Further, we were missing
>>> code checks for some of these, which could result in command line
>>> options being silently ignored.
>>>
>>> Also, note that adding a check for autosquash means that using
>>> --whitespace=fix together with the config setting rebase.autosquash=true
>>> will trigger an error.  A subsequent commit will improve the error
>>> message.
>>
>> Thanks for updating the commit message and for the new commits at the
>> end of the series.
>>
>>> Signed-off-by: Elijah Newren <newren@gmail.com>
>>> --- a/builtin/rebase.c
>>> +++ b/builtin/rebase.c
>>> @@ -1224,6 +1224,26 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>>>                if (options.fork_point < 0)
>>>                        options.fork_point = 0;
>>>        }
>>> +     /*
>>> +      * The apply backend does not support --[no-]reapply-cherry-picks.
>>> +      * The behavior it implements by default is equivalent to
>>> +      * --no-reapply-cherry-picks (due to passing --cherry-picks to
>>> +      * format-patch), but --keep-base alters the upstream such that no
>>> +      * cherry-picks can be found (effectively making it act like
>>> +      * --reapply-cherry-picks).
>>> +      *
>>> +      * Now, if the user does specify --[no-]reapply-cherry-picks, but
>>> +      * does so in such a way that options.reapply_cherry_picks ==
>>> +      * keep_base, then the behavior they get will match what they
>>> +      * expect despite options.reapply_cherry_picks being ignored.  We
>>> +      * could just allow the flag in that case, but it seems better to
>>> +      * just alert the user that they've specified a flag that the
>>> +      * backend ignores.
>>> +      */
>>
>> I'm a bit confused by this. --keep-base works with either
>> --reapply-cherry-picks (which is the default if --keep-base is given) or
>> --no-reapply-cherry-picks. Just below this hunk we have
>>
>>          if (options.reapply_cherry_picks < 0)
>>                  options.reapply_cherry_picks = keep_base;
>>
>> So we only set options.reapply_cherry_picks to match keep_base if the
>> user did not specify -[-no]-reapply-cherry-picks on the commandline.
> 
> options.reapply_cherry_picks is totally ignored by the apply backend,
> regardless of whether it's set by the user or the setup code in
> builtin/rebase.c.  And if we have an option which is ignored, isn't it
> nicer to provide an error message to the user if they tried to set it?
> 
> Said another way, while users could start with these command lines:
> 
>      (Y) git rebase --whitespace=fix
>      (Z) git rebase --whitespace=fix --keep-base
> 
> and modify them to include flags that would be ignored, we could allow:
> 
>      (A) git rebase --whitespace=fix --no-reapply-cherry-picks
>      (B) git rebase --whitespace=fix --keep-base --reapply-cherry-picks
> 
> But we could not allow commands like
> 
>      (C) git rebase --whitespace=fix --reapply-cherry-picks
>      (D) git rebase --whitespace=fix --keep-base --no-reapply-cherry-picks

(C) is already an error
(D) is currently allowed and I think works as expected (--keep-base only 
implies --reapply-cherry-picks, the user is free to override that with 
--no-reapply-cherry-picks) so I don't see why we'd want to make it an error.

> For all four cases (A)-(D), the apply backend will ignore whatever
> --[no-]reapply-cherry-picks flag is provided.

For (D) the flag is respected, (C) errors out, the other cases 
correspond to the default so it's like saying

	git rebase --merge --no-reapply-cherry-picks

ignores the flag. Arguably it is confusing that the apply backend only 
supports -[-no]-reapply-cherry-picks if --keep-base is given but I'm not 
sure that is a good reason to reject a combination that currently works 
as expected.

Best Wishes

Phillip

> For (A) and (B), the > behavior the apply backend provides happens to match what the user
> is requesting, while for (C) and (D) the behavior does not match.
> So we should at least reject (C) and (D).  But, although we could
> technically allow (A) and (B), what advantage would it provide?  I
> think the results of allowing those two commands would be:
> 
>      1) Confusion by end users -- why should (C) & (D) throw errors if
>         (A) and (B) are accepted?  That's not an easy rule to understand.
> 
>      2) More confusion by end users -- the documentation for years has
>         stated that --reapply-cherry-picks is incompatible with the apply
>         backend, suggesting users would be surprised if at least (B) and
>         probably (A) didn't throw error messages.
> 
>      3) Confusing documentation -- If we don't want to throw errors for
>         (A) and (B), how do we modify the "INCOMPATIBLE OPTIONS" section
>         of Documentation/git-rebase.txt to explain the relevant details
>         of when these flags are (or are not) incompatible with the apply
>         backend?   I think it'd end up with a very verbose explanation
>         that likely confuses more than it helps.
> 
>      4) Excessively complicated code -- The previous attempts to
>         implement this got it wrong.  Prior to ce5238a690 ("rebase
>         --keep-base: imply --reapply-cherry-picks", 2022-10-17), the code
>         would error out on (B) and (C).  After that commit, it would only
>         error out on (C).  Both solutions are incorrect since they miss
>         (D), and I think the code just becomes hard to hard to follow in
>         order to only error out on both (C) and (D) without (A) and (B).
> 
> (#2 and #3 might just be a repeat of the same issue, documentation,
> but it seemed easier to write separately.)
> 
> I think it's simpler for the code, for the documentation, and for end
> users to just error out on all of (A), (B), (C), and (D).
>   --[no-]reapply-cherry-picks is not supported by the apply backend.
> 
> But, given this lengthy email, perhaps I should split out the handling
> of --[no-]reapply-cherry-picks into its own commit and copy some or
> all of the description above into the commit message?

^ permalink raw reply

* Re: [PATCH v2 1/3] branch: fix die_if_checked_out() when ignore_current_worktree
From: Phillip Wood @ 2023-01-24 10:35 UTC (permalink / raw)
  To: Rubén Justo, Junio C Hamano; +Cc: Git List, Phillip Wood
In-Reply-To: <d61a2393-64c8-da49-fe13-00bc4a52d5e3@gmail.com>

Hi Rubén

On 22/01/2023 23:21, Rubén Justo wrote:
> I tried to maintain the relationship and the role, too.  Just introduce
> the helper, as Phillip suggested and I think it is a good idea.

When I suggested adding a helper I was thinking of something like

static const struct worktree *do_find_shared_symref(struct worktree 
**worktrees,
  					  const char *symref,
  					  const char *target,
					  int ignore_current)
{
	/*
	 * Body moved from find_share_symref() with a couple
	 * of lines added to support ignore_current
	 /*
}

const struct worktree *find_shared_symref(struct worktree **worktrees,
  					  const char *symref,
  					  const char *target)
{
	return do_find_shared_symref(worktrees, symref, target, 0)
}

void die_if_checked_out(const char *branch, int ignore_current_worktree)
{
	struct worktree **worktrees = get_worktrees();
	const struct worktree *wt;

	wt = do_find_shared_symref(worktrees, "HEAD", branch,
				   ignore_current_worktree);
	/* rest unchanged */
}

The aim was to avoid changing the public api

Best Wishes

Phillip


^ permalink raw reply

* [PATCH 0/2] gitk: handle long command-lines
From: Johannes Schindelin via GitGitGadget @ 2023-01-24 11:23 UTC (permalink / raw)
  To: git; +Cc: Paul Mackerras, Johannes Schindelin

These patches have been cooking for such a long time in Git for Windows that
you might think they turned into broth. Yummy broth, to be sure. But broth.
'Tis beyond time for the patches to make it upstream.

Johannes Schindelin (1):
  gitk: prevent overly long command lines

Nico Rieck (1):
  gitk: escape file paths before piping to git log

 gitk | 36 +++++++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)


base-commit: 465f03869ae11acd04abfa1b83c67879c867410c
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1469%2Fdscho%2Fgitk-long-cmdline-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1469/dscho/gitk-long-cmdline-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1469
-- 
gitgitgadget

^ permalink raw reply

* [PATCH 1/2] gitk: prevent overly long command lines
From: Johannes Schindelin via GitGitGadget @ 2023-01-24 11:23 UTC (permalink / raw)
  To: git; +Cc: Paul Mackerras, Johannes Schindelin, Johannes Schindelin
In-Reply-To: <pull.1469.git.1674559397.gitgitgadget@gmail.com>

From: Johannes Schindelin <johannes.schindelin@gmx.de>

To avoid running into command line limitations, some of Git's commands
support the `--stdin` option.

Let's use exactly this option in the three rev-list/log invocations in
gitk that would otherwise possibly run the danger of trying to invoke a
too-long command line.

While it is easy to redirect either stdin or stdout in Tcl/Tk scripts,
what we need here is both. We need to capture the output, yet we also
need to pipe in the revs/files arguments via stdin (because stdin does
not have any limit, unlike the command line). To help this, we use the
neat Tcl feature where you can capture stdout and at the same time feed
a fixed string as stdin to the spawned process.

One non-obvious aspect about this change is that the `--stdin` option
allows to specify revs, the double-dash, and files, but *no* other
options such as `--not`. This is addressed by prefixing the "negative"
revs with `^` explicitly rather than relying on the `--not` option
(thanks for coming up with that idea, Max!).

This fixes https://github.com/git-for-windows/git/issues/1987

Analysis-and-initial-patch-by: Max Kirillov <max@max630.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 gitk | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/gitk b/gitk
index 0ae7d685904..92375ca6a2a 100755
--- a/gitk
+++ b/gitk
@@ -405,14 +405,16 @@ proc start_rev_list {view} {
         if {$revs eq {}} {
             return 0
         }
-        set args [concat $vflags($view) $revs]
+        set args $vflags($view)
     } else {
+        set revs {}
         set args $vorigargs($view)
     }
 
     if {[catch {
         set fd [open [concat | git log --no-color -z --pretty=raw $show_notes \
-                        --parents --boundary $args "--" $files] r]
+                        --parents --boundary $args --stdin \
+                        "<<[join [concat $revs "--" $files] "\\n"]"] r]
     } err]} {
         error_popup "[mc "Error executing git log:"] $err"
         return 0
@@ -554,13 +556,19 @@ proc updatecommits {} {
             set revs $newrevs
             set vposids($view) [lsort -unique [concat $oldpos $vposids($view)]]
         }
-        set args [concat $vflags($view) $revs --not $oldpos]
+        set args $vflags($view)
+        foreach r $oldpos {
+                lappend revs "^$r"
+        }
     } else {
+        set revs {}
         set args $vorigargs($view)
     }
     if {[catch {
         set fd [open [concat | git log --no-color -z --pretty=raw $show_notes \
-                        --parents --boundary $args "--" $vfilelimit($view)] r]
+                        --parents --boundary $args --stdin \
+                        "<<[join [concat $revs "--" \
+                                $vfilelimit($view)] "\\n"]"] r]
     } err]} {
         error_popup "[mc "Error executing git log:"] $err"
         return
@@ -10231,10 +10239,16 @@ proc getallcommits {} {
             foreach id $seeds {
                 lappend ids "^$id"
             }
+            lappend ids "--"
         }
     }
     if {$ids ne {}} {
-        set fd [open [concat $cmd $ids] r]
+        if {$ids eq "--all"} {
+            set cmd [concat $cmd "--all"]
+        } else {
+            set cmd [concat $cmd --stdin "<<[join $ids "\\n"]"]
+        }
+        set fd [open $cmd r]
         fconfigure $fd -blocking 0
         incr allcommits
         nowbusy allcommits
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 2/2] gitk: escape file paths before piping to git log
From: Nico Rieck via GitGitGadget @ 2023-01-24 11:23 UTC (permalink / raw)
  To: git; +Cc: Paul Mackerras, Johannes Schindelin, Nico Rieck
In-Reply-To: <pull.1469.git.1674559397.gitgitgadget@gmail.com>

From: Nico Rieck <nico.rieck@gmail.com>

We just started piping the file paths via `stdin` instead of passing
them via the command-line, to avoid running into command-line
limitations.

However, since we now pipe the file paths, we need to take care of
special characters.

This fixes https://github.com/git-for-windows/git/issues/2293

Signed-off-by: Nico Rieck <nico.rieck@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 gitk | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/gitk b/gitk
index 92375ca6a2a..df3ba2ea99b 100755
--- a/gitk
+++ b/gitk
@@ -353,6 +353,16 @@ proc parseviewrevs {view revs} {
     return $ret
 }
 
+# Escapes a list of filter paths to be passed to git log via stdin. Note that
+# paths must not be quoted.
+proc escape_filter_paths {paths} {
+	set escaped [list]
+	foreach path $paths {
+		lappend escaped [string map {\\ \\\\ "\ " "\\\ "} $path]
+	}
+	return $escaped
+}
+
 # Start off a git log process and arrange to read its output
 proc start_rev_list {view} {
     global startmsecs commitidx viewcomplete curview
@@ -414,7 +424,8 @@ proc start_rev_list {view} {
     if {[catch {
         set fd [open [concat | git log --no-color -z --pretty=raw $show_notes \
                         --parents --boundary $args --stdin \
-                        "<<[join [concat $revs "--" $files] "\\n"]"] r]
+                        "<<[join [concat $revs "--" \
+                                [escape_filter_paths $files]] "\\n"]"] r]
     } err]} {
         error_popup "[mc "Error executing git log:"] $err"
         return 0
@@ -568,7 +579,8 @@ proc updatecommits {} {
         set fd [open [concat | git log --no-color -z --pretty=raw $show_notes \
                         --parents --boundary $args --stdin \
                         "<<[join [concat $revs "--" \
-                                $vfilelimit($view)] "\\n"]"] r]
+                                [escape_filter_paths \
+                                        $vfilelimit($view)]] "\\n"]"] r]
     } err]} {
         error_popup "[mc "Error executing git log:"] $err"
         return
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH v2 01/10] bundle: optionally skip reachability walk
From: Derrick Stolee @ 2023-01-24 12:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, me, vdye, avarab, steadmon,
	chooglen
In-Reply-To: <xmqqtu0glw81.fsf@gitster.g>

On 1/23/2023 5:30 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> A repository having some unreachable objects floating in the object
>> store is not corrupt.  As long as all the objects reachable from refs
>> are connected, that is a perfectly sane state.
>>
>> But allowing unbundling with the sanity check loosened WILL corrupt
>> it, at the moment you point some objects from the bundle with refs.
> 
> While all of the above is true, I think existing check done by
> bundle.c::verify_bundle() is stricter than necessary.  We make sure
> that the prerequiste objects exist and are reachable from the refs.
> But for the purpose of ensuring the health of the repo after the
> operation, it is also OK if the prerequisite objects exist and they
> pass connected.c::check_connected() test to reach existing refs.
> verify_bundle() that is used in unbundle() does not allow it.

Thank you for all of the detailed explanation, here and in other
messages.

I'll focus on this area today and see what I can learn and how I
can approach this problem in a different way. The current options
that I see are:

 1. Leave verify_bundle() as-is and figure out how to refresh the
    refs. (This would remain a stricter check than necessary.)

 2. Find out how to modify verify_bundle() so it can do the more
    relaxed connectivity check.

 3. Take the connectivity check that fetch uses before updating
    refs and add that check before updating refs in the bundle URI
    code.

There could also be a combination of (2) and (3), or others I have
not considered until I go poking around in the code.

I'll let you know what I find.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH v4 5/9] rebase: add coverage of other incompatible options
From: Phillip Wood @ 2023-01-24 13:16 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Elijah Newren via GitGitGadget, git, Derrick Stolee,
	Eric Sunshine, Martin Ågren, Phillip Wood
In-Reply-To: <83d27162-59d4-d8c0-fde3-f522630d024d@dunelm.org.uk>

>>>> Signed-off-by: Elijah Newren <newren@gmail.com>
>>>> --- a/builtin/rebase.c
>>>> +++ b/builtin/rebase.c
>>>> @@ -1224,6 +1224,26 @@ int cmd_rebase(int argc, const char **argv, 
>>>> const char *prefix)
>>>>                if (options.fork_point < 0)
>>>>                        options.fork_point = 0;
>>>>        }
>>>> +     /*
>>>> +      * The apply backend does not support 
>>>> --[no-]reapply-cherry-picks.
>>>> +      * The behavior it implements by default is equivalent to
>>>> +      * --no-reapply-cherry-picks (due to passing --cherry-picks to
>>>> +      * format-patch), but --keep-base alters the upstream such 
>>>> that no
>>>> +      * cherry-picks can be found (effectively making it act like
>>>> +      * --reapply-cherry-picks).
>>>> +      *
>>>> +      * Now, if the user does specify --[no-]reapply-cherry-picks, but
>>>> +      * does so in such a way that options.reapply_cherry_picks ==
>>>> +      * keep_base, then the behavior they get will match what they
>>>> +      * expect despite options.reapply_cherry_picks being ignored.  We
>>>> +      * could just allow the flag in that case, but it seems better to
>>>> +      * just alert the user that they've specified a flag that the
>>>> +      * backend ignores.
>>>> +      */
>>>
>>> I'm a bit confused by this. --keep-base works with either
>>> --reapply-cherry-picks (which is the default if --keep-base is given) or
>>> --no-reapply-cherry-picks. Just below this hunk we have
>>>
>>>          if (options.reapply_cherry_picks < 0)
>>>                  options.reapply_cherry_picks = keep_base;
>>>
>>> So we only set options.reapply_cherry_picks to match keep_base if the
>>> user did not specify -[-no]-reapply-cherry-picks on the commandline.
>>
>> options.reapply_cherry_picks is totally ignored by the apply backend,
>> regardless of whether it's set by the user or the setup code in
>> builtin/rebase.c.  And if we have an option which is ignored, isn't it
>> nicer to provide an error message to the user if they tried to set it?
>>
>> Said another way, while users could start with these command lines:
>>
>>      (Y) git rebase --whitespace=fix
>>      (Z) git rebase --whitespace=fix --keep-base
>>
>> and modify them to include flags that would be ignored, we could allow:
>>
>>      (A) git rebase --whitespace=fix --no-reapply-cherry-picks
>>      (B) git rebase --whitespace=fix --keep-base --reapply-cherry-picks
>>
>> But we could not allow commands like
>>
>>      (C) git rebase --whitespace=fix --reapply-cherry-picks
>>      (D) git rebase --whitespace=fix --keep-base 
>> --no-reapply-cherry-picks
> 
> (C) is already an error
> (D) is currently allowed and I think works as expected (--keep-base only 
> implies --reapply-cherry-picks, the user is free to override that with 
> --no-reapply-cherry-picks) so I don't see why we'd want to make it an 
> error.
> 
>> For all four cases (A)-(D), the apply backend will ignore whatever
>> --[no-]reapply-cherry-picks flag is provided.
> 
> For (D) the flag is respected, (C) errors out, the other cases 
> correspond to the default so it's like saying
> 
>      git rebase --merge --no-reapply-cherry-picks
> 
> ignores the flag.

On reflection that is only true for (B). I agree that we should error 
out on (A) which we don't at the moment.

I'd support a change that errors out on (A) and (C) but continues to 
allow (B) and (D). I think we can do that with the diff below

Best Wishes

Phillip

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 1481c5b6a5..66aef356b8 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1230,12 +1230,6 @@ int cmd_rebase(int argc, const char **argv, const 
char *prefix)
                  if (options.fork_point < 0)
                          options.fork_point = 0;
          }
-        /*
-         * --keep-base defaults to --reapply-cherry-picks to avoid losing
-         * commits when using this option.
-         */
-        if (options.reapply_cherry_picks < 0)
-                options.reapply_cherry_picks = keep_base;

          if (options.root && options.fork_point > 0)
                  die(_("options '%s' and '%s' cannot be used 
together"), "--root", "--fork-point");
@@ -1412,11 +1406,17 @@ int cmd_rebase(int argc, const char **argv, 
const char *prefix)
          if (options.empty != EMPTY_UNSPECIFIED)
                  imply_merge(&options, "--empty");

-        /*
-         * --keep-base implements --reapply-cherry-picks by altering 
upstream so
-         * it works with both backends.
-         */
-        if (options.reapply_cherry_picks && !keep_base)
+        if (options.reapply_cherry_picks < 0)
+                /*
+                 * --keep-base defaults to --reapply-cherry-picks to
+                 * avoid losing commits when using this option.
+                 */
+                options.reapply_cherry_picks = keep_base;
+        else if (!keep_base)
+                /*
+                 * --keep-base implements --reapply-cherry-picks by
+                 * altering upstream so it works with both backends.
+                 */
                  imply_merge(&options, "--reapply-cherry-picks");

          if (gpg_sign)

^ permalink raw reply related

* [PATCH v2.5 01/11] bundle: test unbundling with incomplete history
From: Derrick Stolee @ 2023-01-24 14:14 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, me, vdye, avarab, steadmon,
	chooglen
In-Reply-To: <771a2993-85bd-0831-0977-24204f84e206@github.com>

On 1/24/2023 7:27 AM, Derrick Stolee wrote:

> I'll focus on this area today and see what I can learn and how I
> can approach this problem in a different way.

The first thing I did was try to figure out how things work today,
so I created this test case. It appears we were not testing this
at all previously.

This is just a candidate replacement for v3, so don't worry about
applying it until I re-roll.

Thanks,
-Stolee

--- >8 ---

From f9b0cc872ac44892fe6b1c973f16b35edfdc5b20 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Tue, 24 Jan 2023 08:47:19 -0500
Subject: [PATCH v2.5 01/11] bundle: test unbundling with incomplete history

When verifying a bundle, Git checks first that all prerequisite commits
exist in the object store, then adds an additional check: those
prerequisite commits must be reachable from references in the
repository.

This check is stronger than what is checked for refs being added during
'git fetch', which simply guarantees that the new refs have a complete
history up to the point where it intersects with the current reachable
history.

However, we also do not have any tests that check the behavior under
this condition. Create a test that demonstrates its behavior.

In order to construct a broken history, perform a shallow clone of a
repository with a linear history, but whose default branch ('base') has
a single commit, so dropping the shallow markers leaves a complete
history from that reference. However, the 'tip' reference adds a
shallow commit whose parent is missing in the cloned repository. Trying
to unbundle a bundle with the 'tip' as a prerequisite will succeed past
the object store check and move into the reachability check.

The two errors that are reported are of this form:

  error: Could not read <missing-commit>
  fatal: Failed to traverse parents of commit <present-commit>

These messages are not particularly helpful for the person running the
unbundle command, but they do prevent the command from succeeding.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 t/t6020-bundle-misc.sh | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
index 3a1cf30b1d7..38dbbf89155 100755
--- a/t/t6020-bundle-misc.sh
+++ b/t/t6020-bundle-misc.sh
@@ -566,4 +566,44 @@ test_expect_success 'cloning from filtered bundle has useful error' '
 	grep "cannot clone from filtered bundle" err
 '

+test_expect_success 'verify catches unreachable, broken prerequisites' '
+	test_when_finished rm -rf clone-from clone-to &&
+	git init clone-from &&
+	(
+		cd clone-from &&
+		git checkout -b base &&
+		test_commit A &&
+		git checkout -b tip &&
+		git commit --allow-empty -m "will drop by shallow" &&
+		git commit --allow-empty -m "will keep by shallow" &&
+		git commit --allow-empty -m "for bundle, not clone" &&
+		git bundle create tip.bundle tip~1..tip &&
+		git reset --hard HEAD~1 &&
+		git checkout base
+	) &&
+	BAD_OID=$(git -C clone-from rev-parse tip~1) &&
+	TIP_OID=$(git -C clone-from rev-parse tip) &&
+	git clone --depth=1 --no-single-branch \
+		"file://$(pwd)/clone-from" clone-to &&
+	(
+		cd clone-to &&
+
+		# Set up broken history by removing shallow markers
+		git update-ref -d refs/remotes/origin/tip &&
+		rm .git/shallow &&
+
+		# Verify should fail
+		test_must_fail git bundle verify \
+			../clone-from/tip.bundle 2>err &&
+		grep "Could not read $BAD_OID" err &&
+		grep "Failed to traverse parents of commit $TIP_OID" err &&
+
+		# Unbundling should fail
+		test_must_fail git bundle unbundle \
+			../clone-from/tip.bundle 2>err &&
+		grep "Could not read $BAD_OID" err &&
+		grep "Failed to traverse parents of commit $TIP_OID" err
+	)
+'
+
 test_done
--
2.39.1.vfs.0.0



^ permalink raw reply related

* [PATCH v2.5 02/11] bundle: verify using connected()
From: Derrick Stolee @ 2023-01-24 14:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Derrick Stolee via GitGitGadget, git, me, vdye, avarab, steadmon,
	chooglen
In-Reply-To: <771a2993-85bd-0831-0977-24204f84e206@github.com>

On 1/24/2023 7:27 AM, Derrick Stolee wrote:

> I'll focus on this area today and see what I can learn and how I
> can approach this problem in a different way.

>  2. Find out how to modify verify_bundle() so it can do the more
>     relaxed connectivity check.

And here is the modification to verify_bundle() to depend on
check_connected() instead. This also improves (in my opinion) the
error reporting from this situation, as seen in the test edits.

Again, this is a placeholder before I re-roll this series into
an inevitable v3, so don't bother applying this patch until then.

Thanks,
-Stolee

--- >8 ---

From 6a3d64761e9691994f9310b9ce2338f49aa72d48 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Tue, 24 Jan 2023 08:47:00 -0500
Subject: [PATCH v2.5 02/11] bundle: verify using connected()

When Git verifies a bundle to see if it is safe for unbundling, it first
looks to see if the prerequisite commits are in the object store. This
is usually a sufficient filter, and those missing commits are indicated
clearly in the error messages. However, if the commits are present in
the object store, then there could still be issues if those commits are
not reachable from the repository's references. The repository only has
guarantees that its object store is closed under reachability for the
objects that are reachable from references.

Thus, the code in verify_bundle() has previously had the additional
check that all prerequisite commits are reachable from repository
references. This is done via a revision walk from all references,
stopping only if all prerequisite commits are discovered or all commits
are walked. This uses a custom walk to verify_bundle().

This check is more strict than what Git applies even to fetched
pack-files. In the fetch case, Git guarantees that the new references
are closed under reachability by walking from the new references until
walking commits that are reachable from repository refs. This is done
through the well-used check_connected() method.

To better align with the restrictions required by 'git fetch',
reimplement this check in verify_bundle() to use check_connected(). This
also simplifies the code significantly.

The previous change added a test that verified the behavior of 'git
bundle verify' and 'git bundle unbundle' in this case, and the error
messages looked like this:

  error: Could not read <missing-commit>
  fatal: Failed to traverse parents of commit <extant-commit>

However, by changing the revision walk slightly within check_connected()
and using its quiet mode, we can omit those messages. Instead, we get
only this message, tailored to describing the current state of the
repository:

  error: some prerequisite commits exist in the object store,
         but are not connected to the repository's history

(Line break added here for the commit message formatting, only.)

While this message does not include any object IDs, there is no
guarantee that those object IDs would help the user diagnose what is
going on, as they could be separated from the prerequisite commits by
some distance. At minimum, this situation describes the situation in a
more informative way than the previous error messages.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle.c               | 75 ++++++++++++++++--------------------------
 t/t6020-bundle-misc.sh |  8 ++---
 2 files changed, 33 insertions(+), 50 deletions(-)

diff --git a/bundle.c b/bundle.c
index 4ef7256aa11..76c3a904898 100644
--- a/bundle.c
+++ b/bundle.c
@@ -12,6 +12,7 @@
 #include "refs.h"
 #include "strvec.h"
 #include "list-objects-filter-options.h"
+#include "connected.h"

 static const char v2_bundle_signature[] = "# v2 git bundle\n";
 static const char v3_bundle_signature[] = "# v3 git bundle\n";
@@ -187,6 +188,21 @@ static int list_refs(struct string_list *r, int argc, const char **argv)
 /* Remember to update object flag allocation in object.h */
 #define PREREQ_MARK (1u<<16)

+struct string_list_iterator {
+	struct string_list *list;
+	size_t cur;
+};
+
+static const struct object_id *iterate_ref_map(void *cb_data)
+{
+	struct string_list_iterator *iter = cb_data;
+
+	if (iter->cur >= iter->list->nr)
+		return NULL;
+
+	return iter->list->items[iter->cur++].util;
+}
+
 int verify_bundle(struct repository *r,
 		  struct bundle_header *header,
 		  enum verify_bundle_flags flags)
@@ -196,26 +212,25 @@ int verify_bundle(struct repository *r,
 	 * to be verbose about the errors
 	 */
 	struct string_list *p = &header->prerequisites;
-	struct rev_info revs = REV_INFO_INIT;
-	const char *argv[] = {NULL, "--all", NULL};
-	struct commit *commit;
-	int i, ret = 0, req_nr;
+	int i, ret = 0;
 	const char *message = _("Repository lacks these prerequisite commits:");
+	struct string_list_iterator iter = {
+		.list = p,
+	};
+	struct check_connected_options opts = {
+		.quiet = 1,
+	};

 	if (!r || !r->objects || !r->objects->odb)
 		return error(_("need a repository to verify a bundle"));

-	repo_init_revisions(r, &revs, NULL);
 	for (i = 0; i < p->nr; i++) {
 		struct string_list_item *e = p->items + i;
 		const char *name = e->string;
 		struct object_id *oid = e->util;
 		struct object *o = parse_object(r, oid);
-		if (o) {
-			o->flags |= PREREQ_MARK;
-			add_pending_object(&revs, o, name);
+		if (o)
 			continue;
-		}
 		ret++;
 		if (flags & VERIFY_BUNDLE_QUIET)
 			continue;
@@ -223,37 +238,14 @@ int verify_bundle(struct repository *r,
 			error("%s", message);
 		error("%s %s", oid_to_hex(oid), name);
 	}
-	if (revs.pending.nr != p->nr)
+	if (ret)
 		goto cleanup;
-	req_nr = revs.pending.nr;
-	setup_revisions(2, argv, &revs, NULL);
-
-	list_objects_filter_copy(&revs.filter, &header->filter);
-
-	if (prepare_revision_walk(&revs))
-		die(_("revision walk setup failed"));

-	i = req_nr;
-	while (i && (commit = get_revision(&revs)))
-		if (commit->object.flags & PREREQ_MARK)
-			i--;
-
-	for (i = 0; i < p->nr; i++) {
-		struct string_list_item *e = p->items + i;
-		const char *name = e->string;
-		const struct object_id *oid = e->util;
-		struct object *o = parse_object(r, oid);
-		assert(o); /* otherwise we'd have returned early */
-		if (o->flags & SHOWN)
-			continue;
-		ret++;
-		if (flags & VERIFY_BUNDLE_QUIET)
-			continue;
-		if (ret == 1)
-			error("%s", message);
-		error("%s %s", oid_to_hex(oid), name);
-	}
+	if ((ret = check_connected(iterate_ref_map, &iter, &opts)))
+		error(_("some prerequisite commits exist in the object store, "
+			"but are not connected to the repository's history"));

+	/* TODO: preserve this verbose language. */
 	if (flags & VERIFY_BUNDLE_VERBOSE) {
 		struct string_list *r;

@@ -282,15 +274,6 @@ int verify_bundle(struct repository *r,
 				  list_objects_filter_spec(&header->filter));
 	}
 cleanup:
-	/* Clean up objects used, as they will be reused. */
-	for (i = 0; i < p->nr; i++) {
-		struct string_list_item *e = p->items + i;
-		struct object_id *oid = e->util;
-		commit = lookup_commit_reference_gently(r, oid, 1);
-		if (commit)
-			clear_commit_marks(commit, ALL_REV_FLAGS | PREREQ_MARK);
-	}
-	release_revisions(&revs);
 	return ret;
 }

diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
index 38dbbf89155..7d40994991e 100755
--- a/t/t6020-bundle-misc.sh
+++ b/t/t6020-bundle-misc.sh
@@ -595,14 +595,14 @@ test_expect_success 'verify catches unreachable, broken prerequisites' '
 		# Verify should fail
 		test_must_fail git bundle verify \
 			../clone-from/tip.bundle 2>err &&
-		grep "Could not read $BAD_OID" err &&
-		grep "Failed to traverse parents of commit $TIP_OID" err &&
+		grep "some prerequisite commits .* are not connected" err &&
+		test_line_count = 1 err &&

 		# Unbundling should fail
 		test_must_fail git bundle unbundle \
 			../clone-from/tip.bundle 2>err &&
-		grep "Could not read $BAD_OID" err &&
-		grep "Failed to traverse parents of commit $TIP_OID" err
+		grep "some prerequisite commits .* are not connected" err &&
+		test_line_count = 1 err
 	)
 '

--
2.39.1.vfs.0.0



^ permalink raw reply related

* Re: [PATCH v3 3/8] rebase & sequencer API: fix get_replay_opts() leak in "rebase"
From: Phillip Wood @ 2023-01-24 14:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Johannes Schindelin, Junio C Hamano, Taylor Blau,
	René Scharfe
In-Reply-To: <patch-v3-3.8-e4a96898a68-20230118T160600Z-avarab@gmail.com>

Hi Ævar

On 18/01/2023 16:09, Ævar Arnfjörð Bjarmason wrote:
> Make the recently added replay_opts_release() function non-static and
> use it for freeing the "struct replay_opts" constructed by the
> get_replay_opts() function in "builtin/rebase.c". See [1] for the
> initial addition of get_replay_opts().

This paragraph only talks about rebase but we're also modifying 
cherry-pick and revert here. I'm not sure the commit reference really 
adds anything to the discussion.

> To safely call our new replay_opts_release() we'll need to stop
> calling it in sequencer_remove_state(), and instead call it where we
> allocate the "struct replay_opts" itself.

This paragraph is good, I'm not sure we really need most of the next 
three. It is much cleaner for the owner of a struct to free it and 
that's what we should be doing wherever possible.

> This is because in e.g. do_interactive_rebase() we construct a "struct
> replay_opts" with "get_replay_opts()", and then call
> "complete_action()". If we get far enough in that function without
> encountering errors we'll call "pick_commits()" which (indirectly)
> calls sequencer_remove_state() at the end.
> 
> But if we encounter errors anywhere along the way we'd punt out early,
> and not free() the memory we allocated. Remembering whether we
> previously called sequencer_remove_state() would be a hassle.
> 
> Using a FREE_AND_NULL() pattern would also work, as it would be safe
> replay_opts_release() repeatedly, but let's fix this properly instead,
> by having the owner of the data free() it.
> > See [2] for the initial implementation of "sequencer_remove_state()",
> which assumed that it should be removing the full (including on-disk)
> rebase state as a one-off.

That commit has nothing to do with rebase, it is cleaning up after 
cherry-pick/revert. rebase was still scripted back then.

The code changes look good

Best Wishes

Phillip

> 1. 73fdc535d26 (rebase -i: use struct rebase_options to parse args,
>     2019-04-17)
> 2. 26ae337be11 (revert: Introduce --reset to remove sequencer state,
>     2011-08-04)
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   builtin/rebase.c                       | 4 ++++
>   builtin/revert.c                       | 2 ++
>   sequencer.c                            | 4 +---
>   sequencer.h                            | 1 +
>   t/t3405-rebase-malformed.sh            | 1 +
>   t/t3412-rebase-root.sh                 | 1 +
>   t/t3419-rebase-patch-id.sh             | 1 +
>   t/t3423-rebase-reword.sh               | 1 +
>   t/t3425-rebase-topology-merges.sh      | 2 ++
>   t/t3437-rebase-fixup-options.sh        | 1 +
>   t/t3438-rebase-broken-files.sh         | 2 ++
>   t/t3501-revert-cherry-pick.sh          | 1 +
>   t/t3502-cherry-pick-merge.sh           | 1 +
>   t/t3503-cherry-pick-root.sh            | 1 +
>   t/t3506-cherry-pick-ff.sh              | 1 +
>   t/t3511-cherry-pick-x.sh               | 1 +
>   t/t7402-submodule-rebase.sh            | 1 +
>   t/t9106-git-svn-commit-diff-clobber.sh | 1 -
>   t/t9164-git-svn-dcommit-concurrent.sh  | 1 -
>   19 files changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 7141fd5e0c1..5859a5387d8 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -301,6 +301,7 @@ static int do_interactive_rebase(struct rebase_options *opts, unsigned flags)
>   	}
>   
>   cleanup:
> +	replay_opts_release(&replay);
>   	string_list_clear(&commands, 0);
>   	free(revisions);
>   	free(shortrevisions);
> @@ -343,6 +344,7 @@ static int run_sequencer_rebase(struct rebase_options *opts)
>   		struct replay_opts replay_opts = get_replay_opts(opts);
>   
>   		ret = sequencer_continue(the_repository, &replay_opts);
> +		replay_opts_release(&replay_opts);
>   		break;
>   	}
>   	case ACTION_EDIT_TODO:
> @@ -558,6 +560,7 @@ static int finish_rebase(struct rebase_options *opts)
>   
>   		replay.action = REPLAY_INTERACTIVE_REBASE;
>   		ret = sequencer_remove_state(&replay);
> +		replay_opts_release(&replay);
>   	} else {
>   		strbuf_addstr(&dir, opts->state_dir);
>   		if (remove_dir_recursively(&dir, 0))
> @@ -1331,6 +1334,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   
>   			replay.action = REPLAY_INTERACTIVE_REBASE;
>   			ret = sequencer_remove_state(&replay);
> +			replay_opts_release(&replay);
>   		} else {
>   			strbuf_reset(&buf);
>   			strbuf_addstr(&buf, options.state_dir);
> diff --git a/builtin/revert.c b/builtin/revert.c
> index f2d86d2a8f9..1cab16bf3ed 100644
> --- a/builtin/revert.c
> +++ b/builtin/revert.c
> @@ -251,6 +251,7 @@ int cmd_revert(int argc, const char **argv, const char *prefix)
>   	if (opts.revs)
>   		release_revisions(opts.revs);
>   	free(opts.revs);
> +	replay_opts_release(&opts);
>   	return res;
>   }
>   
> @@ -267,5 +268,6 @@ int cmd_cherry_pick(int argc, const char **argv, const char *prefix)
>   	free(opts.revs);
>   	if (res < 0)
>   		die(_("cherry-pick failed"));
> +	replay_opts_release(&opts);
>   	return res;
>   }
> diff --git a/sequencer.c b/sequencer.c
> index d385bea2bed..8ff29262c1e 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -351,7 +351,7 @@ static const char *gpg_sign_opt_quoted(struct replay_opts *opts)
>   	return buf.buf;
>   }
>   
> -static void replay_opts_release(struct replay_opts *opts)
> +void replay_opts_release(struct replay_opts *opts)
>   {
>   	free(opts->gpg_sign);
>   	free(opts->reflog_action);
> @@ -385,8 +385,6 @@ int sequencer_remove_state(struct replay_opts *opts)
>   		}
>   	}
>   
> -	replay_opts_release(opts);
> -
>   	strbuf_reset(&buf);
>   	strbuf_addstr(&buf, get_dir(opts));
>   	if (remove_dir_recursively(&buf, 0))
> diff --git a/sequencer.h b/sequencer.h
> index 888c18aad71..3bcdfa1b586 100644
> --- a/sequencer.h
> +++ b/sequencer.h
> @@ -158,6 +158,7 @@ int sequencer_pick_revisions(struct repository *repo,
>   int sequencer_continue(struct repository *repo, struct replay_opts *opts);
>   int sequencer_rollback(struct repository *repo, struct replay_opts *opts);
>   int sequencer_skip(struct repository *repo, struct replay_opts *opts);
> +void replay_opts_release(struct replay_opts *opts);
>   int sequencer_remove_state(struct replay_opts *opts);
>   
>   #define TODO_LIST_KEEP_EMPTY (1U << 0)
> diff --git a/t/t3405-rebase-malformed.sh b/t/t3405-rebase-malformed.sh
> index 25243318618..8979bc34073 100755
> --- a/t/t3405-rebase-malformed.sh
> +++ b/t/t3405-rebase-malformed.sh
> @@ -5,6 +5,7 @@ test_description='rebase should handle arbitrary git message'
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   . "$TEST_DIRECTORY"/lib-rebase.sh
>   
> diff --git a/t/t3412-rebase-root.sh b/t/t3412-rebase-root.sh
> index 58371d8a547..e75b3d0e07c 100755
> --- a/t/t3412-rebase-root.sh
> +++ b/t/t3412-rebase-root.sh
> @@ -7,6 +7,7 @@ Tests if git rebase --root --onto <newparent> can rebase the root commit.
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   log_with_names () {
> diff --git a/t/t3419-rebase-patch-id.sh b/t/t3419-rebase-patch-id.sh
> index 7181f176b81..6c61f240cf9 100755
> --- a/t/t3419-rebase-patch-id.sh
> +++ b/t/t3419-rebase-patch-id.sh
> @@ -5,6 +5,7 @@ test_description='git rebase - test patch id computation'
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   scramble () {
> diff --git a/t/t3423-rebase-reword.sh b/t/t3423-rebase-reword.sh
> index 4859bb8f722..2fab703d615 100755
> --- a/t/t3423-rebase-reword.sh
> +++ b/t/t3423-rebase-reword.sh
> @@ -2,6 +2,7 @@
>   
>   test_description='git rebase interactive with rewording'
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   . "$TEST_DIRECTORY"/lib-rebase.sh
> diff --git a/t/t3425-rebase-topology-merges.sh b/t/t3425-rebase-topology-merges.sh
> index 63acc1ea4da..a16428bdf54 100755
> --- a/t/t3425-rebase-topology-merges.sh
> +++ b/t/t3425-rebase-topology-merges.sh
> @@ -1,6 +1,8 @@
>   #!/bin/sh
>   
>   test_description='rebase topology tests with merges'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   . "$TEST_DIRECTORY"/lib-rebase.sh
>   
> diff --git a/t/t3437-rebase-fixup-options.sh b/t/t3437-rebase-fixup-options.sh
> index c023fefd681..274699dadb8 100755
> --- a/t/t3437-rebase-fixup-options.sh
> +++ b/t/t3437-rebase-fixup-options.sh
> @@ -14,6 +14,7 @@ to the "fixup" command that works with "fixup!", "fixup -C" works with
>   "amend!" upon --autosquash.
>   '
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   . "$TEST_DIRECTORY"/lib-rebase.sh
> diff --git a/t/t3438-rebase-broken-files.sh b/t/t3438-rebase-broken-files.sh
> index b92a3ce46b8..c614c4f2e4b 100755
> --- a/t/t3438-rebase-broken-files.sh
> +++ b/t/t3438-rebase-broken-files.sh
> @@ -1,6 +1,8 @@
>   #!/bin/sh
>   
>   test_description='rebase behavior when on-disk files are broken'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   test_expect_success 'set up conflicting branches' '
> diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
> index 1f4cfc37449..2f3e3e24169 100755
> --- a/t/t3501-revert-cherry-pick.sh
> +++ b/t/t3501-revert-cherry-pick.sh
> @@ -13,6 +13,7 @@ test_description='test cherry-pick and revert with renames
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   test_expect_success setup '
> diff --git a/t/t3502-cherry-pick-merge.sh b/t/t3502-cherry-pick-merge.sh
> index 5495eacfec1..1b2c0d6aca6 100755
> --- a/t/t3502-cherry-pick-merge.sh
> +++ b/t/t3502-cherry-pick-merge.sh
> @@ -11,6 +11,7 @@ test_description='cherry picking and reverting a merge
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   test_expect_success setup '
> diff --git a/t/t3503-cherry-pick-root.sh b/t/t3503-cherry-pick-root.sh
> index 95fe4feaeee..76d393dc8a3 100755
> --- a/t/t3503-cherry-pick-root.sh
> +++ b/t/t3503-cherry-pick-root.sh
> @@ -5,6 +5,7 @@ test_description='test cherry-picking (and reverting) a root commit'
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   test_expect_success setup '
> diff --git a/t/t3506-cherry-pick-ff.sh b/t/t3506-cherry-pick-ff.sh
> index 7e11bd4a4c5..b71bad17b85 100755
> --- a/t/t3506-cherry-pick-ff.sh
> +++ b/t/t3506-cherry-pick-ff.sh
> @@ -5,6 +5,7 @@ test_description='test cherry-picking with --ff option'
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   test_expect_success setup '
> diff --git a/t/t3511-cherry-pick-x.sh b/t/t3511-cherry-pick-x.sh
> index 84a587daf3a..dd5d92ef302 100755
> --- a/t/t3511-cherry-pick-x.sh
> +++ b/t/t3511-cherry-pick-x.sh
> @@ -2,6 +2,7 @@
>   
>   test_description='Test cherry-pick -x and -s'
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   pristine_detach () {
> diff --git a/t/t7402-submodule-rebase.sh b/t/t7402-submodule-rebase.sh
> index ebeca12a711..b19792b3269 100755
> --- a/t/t7402-submodule-rebase.sh
> +++ b/t/t7402-submodule-rebase.sh
> @@ -5,6 +5,7 @@
>   
>   test_description='Test rebasing, stashing, etc. with submodules'
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   
>   test_expect_success setup '
> diff --git a/t/t9106-git-svn-commit-diff-clobber.sh b/t/t9106-git-svn-commit-diff-clobber.sh
> index 3cab0b9720a..bca496c40e0 100755
> --- a/t/t9106-git-svn-commit-diff-clobber.sh
> +++ b/t/t9106-git-svn-commit-diff-clobber.sh
> @@ -3,7 +3,6 @@
>   # Copyright (c) 2006 Eric Wong
>   test_description='git svn commit-diff clobber'
>   
> -TEST_FAILS_SANITIZE_LEAK=true
>   . ./lib-git-svn.sh
>   
>   test_expect_success 'initialize repo' '
> diff --git a/t/t9164-git-svn-dcommit-concurrent.sh b/t/t9164-git-svn-dcommit-concurrent.sh
> index 1465156072e..c8e6c0733f4 100755
> --- a/t/t9164-git-svn-dcommit-concurrent.sh
> +++ b/t/t9164-git-svn-dcommit-concurrent.sh
> @@ -5,7 +5,6 @@
>   
>   test_description='concurrent git svn dcommit'
>   
> -TEST_FAILS_SANITIZE_LEAK=true
>   . ./lib-git-svn.sh
>   
>   

^ permalink raw reply

* Re: [PATCH v3 5/8] builtin/rebase.c: fix "options.onto_name" leak
From: Phillip Wood @ 2023-01-24 14:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Johannes Schindelin, Junio C Hamano, Taylor Blau,
	René Scharfe
In-Reply-To: <patch-v3-5.8-3d5c3152f69-20230118T160600Z-avarab@gmail.com>

Hi Ævar

On 18/01/2023 16:09, Ævar Arnfjörð Bjarmason wrote:
> Similar to the existing "squash_onto_name" added in [1] we need to
> free() the xstrdup()'d "options.onto.name" added for "--keep-base" in
> [2]..

Apart from the unnecessary free() I was quite happy with the previous 
implementation that renamed squash_onto_name and used that but this 
patch looks fine.

Best Wishes

Phillip

> 1. 9dba809a69a (builtin rebase: support --root, 2018-09-04)
> 2. 414d924beb4 (rebase: teach rebase --keep-base, 2019-08-27)
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   builtin/rebase.c                 | 4 +++-
>   t/t3416-rebase-onto-threedots.sh | 1 +
>   2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 5859a5387d8..5c474fb6edd 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -1037,6 +1037,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   	struct string_list strategy_options = STRING_LIST_INIT_NODUP;
>   	struct object_id squash_onto;
>   	char *squash_onto_name = NULL;
> +	char *keep_base_onto_name = NULL;
>   	int reschedule_failed_exec = -1;
>   	int allow_preemptive_ff = 1;
>   	int preserve_merges_selected = 0;
> @@ -1660,7 +1661,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   		strbuf_addstr(&buf, options.upstream_name);
>   		strbuf_addstr(&buf, "...");
>   		strbuf_addstr(&buf, branch_name);
> -		options.onto_name = xstrdup(buf.buf);
> +		options.onto_name = keep_base_onto_name = xstrdup(buf.buf);
>   	} else if (!options.onto_name)
>   		options.onto_name = options.upstream_name;
>   	if (strstr(options.onto_name, "...")) {
> @@ -1836,6 +1837,7 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>   	free(options.strategy);
>   	strbuf_release(&options.git_format_patch_opt);
>   	free(squash_onto_name);
> +	free(keep_base_onto_name);
>   	string_list_clear(&exec, 0);
>   	string_list_clear(&strategy_options, 0);
>   	return !!ret;
> diff --git a/t/t3416-rebase-onto-threedots.sh b/t/t3416-rebase-onto-threedots.sh
> index ea501f2b42b..f8c4ed78c9e 100755
> --- a/t/t3416-rebase-onto-threedots.sh
> +++ b/t/t3416-rebase-onto-threedots.sh
> @@ -5,6 +5,7 @@ test_description='git rebase --onto A...B'
>   GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>   
> +TEST_PASSES_SANITIZE_LEAK=true
>   . ./test-lib.sh
>   . "$TEST_DIRECTORY/lib-rebase.sh"
>   

^ permalink raw reply

* Re: [PATCH v3 0/8] sequencer API & users: fix widespread leaks
From: Phillip Wood @ 2023-01-24 14:41 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Johannes Schindelin, Junio C Hamano, Taylor Blau,
	René Scharfe
In-Reply-To: <cover-v3-0.8-00000000000-20230118T160600Z-avarab@gmail.com>

Hi Ævar

On 18/01/2023 16:09, Ævar Arnfjörð Bjarmason wrote:
> This series fixes various widespread leaks in the sequencer and its
> users (rebase, revert, cherry-pick). As a result 18 tests become
> leak-free in their entirety.
> 
> See the v1 for a longer general summary:
> https://lore.kernel.org/git/cover-00.10-00000000000-20221230T071741Z-avarab@gmail.com/
> 
> Changes since v2:
> 
>   * Reword/amend commit messages for some of the functional changes in
>     v1..v2.
>   * Remove leftover "opts->xopts_nr = 0" now that we don't
>     FREE_AND_NULL() anymore.
>   * Drop the "squash_onto_name" refactoring to a "to_free"
>   * Instead add a new "keep_base_onto_name" in the next commit. I'd
>     missed that if both options were provided we'd die(), so that
>     free() wasn't needed before re-assignment, as Phillip pointed
>     out...

This version looks good. The commit message for patch 3 is a bit 
rambling and the commit citation is not entirely accurate but I'm not 
sure it is worth a re-roll just for that.

Thanks for working on this

Phillip


> CI & branch for this available at:
> https://github.com/avar/git/tree/avar/leak-fixes-sequencer-rebase-3
> 
> Ævar Arnfjörð Bjarmason (8):
>    rebase: use "cleanup" pattern in do_interactive_rebase()
>    sequencer.c: split up sequencer_remove_state()
>    rebase & sequencer API: fix get_replay_opts() leak in "rebase"
>    builtin/revert.c: move free-ing of "revs" to replay_opts_release()
>    builtin/rebase.c: fix "options.onto_name" leak
>    sequencer.c: always free() the "msgbuf" in do_pick_commit()
>    builtin/rebase.c: free() "options.strategy_opts"
>    commit.c: free() revs.commit in get_fork_point()
> 
>   builtin/rebase.c                       | 22 ++++++++------
>   builtin/revert.c                       |  8 ++---
>   commit.c                               |  1 +
>   sequencer.c                            | 42 ++++++++++++++++----------
>   sequencer.h                            |  1 +
>   t/t3405-rebase-malformed.sh            |  1 +
>   t/t3412-rebase-root.sh                 |  1 +
>   t/t3416-rebase-onto-threedots.sh       |  1 +
>   t/t3419-rebase-patch-id.sh             |  1 +
>   t/t3423-rebase-reword.sh               |  1 +
>   t/t3425-rebase-topology-merges.sh      |  2 ++
>   t/t3431-rebase-fork-point.sh           |  1 +
>   t/t3432-rebase-fast-forward.sh         |  1 +
>   t/t3437-rebase-fixup-options.sh        |  1 +
>   t/t3438-rebase-broken-files.sh         |  2 ++
>   t/t3501-revert-cherry-pick.sh          |  1 +
>   t/t3502-cherry-pick-merge.sh           |  1 +
>   t/t3503-cherry-pick-root.sh            |  1 +
>   t/t3506-cherry-pick-ff.sh              |  1 +
>   t/t3511-cherry-pick-x.sh               |  1 +
>   t/t7402-submodule-rebase.sh            |  1 +
>   t/t9106-git-svn-commit-diff-clobber.sh |  1 -
>   t/t9164-git-svn-dcommit-concurrent.sh  |  1 -
>   23 files changed, 61 insertions(+), 33 deletions(-)
> 
> Range-diff against v2:
>   1:  d0a0524f3d4 =  1:  b223429df33 rebase: use "cleanup" pattern in do_interactive_rebase()
>   2:  c4eaa8dfef4 =  2:  00c7f04363f sequencer.c: split up sequencer_remove_state()
>   3:  f06f565ceaf !  3:  e4a96898a68 rebase & sequencer API: fix get_replay_opts() leak in "rebase"
>      @@ Commit message
>           get_replay_opts() function in "builtin/rebase.c". See [1] for the
>           initial addition of get_replay_opts().
>       
>      -    To safely call our new replay_opts_release() we'll need to change all
>      -    the free() to a FREE_AND_NULL(), and set "xopts_nr" to "0" after we
>      -    loop over it and free() it (the free() in the loop doesn't need to be
>      -    a FREE_AND_NULL()).
>      +    To safely call our new replay_opts_release() we'll need to stop
>      +    calling it in sequencer_remove_state(), and instead call it where we
>      +    allocate the "struct replay_opts" itself.
>       
>           This is because in e.g. do_interactive_rebase() we construct a "struct
>           replay_opts" with "get_replay_opts()", and then call
>      @@ Commit message
>       
>           But if we encounter errors anywhere along the way we'd punt out early,
>           and not free() the memory we allocated. Remembering whether we
>      -    previously called sequencer_remove_state() would be a hassle, so let's
>      -    make it safe to re-invoke replay_opts_release() instead.
>      +    previously called sequencer_remove_state() would be a hassle.
>       
>      -    I experimented with a change to be more paranoid instead, i.e. to
>      -    exhaustively check our state via an enum. We could make sure that we:
>      -
>      -    - Only allow calling "replay_opts_release()" after
>      -      "sequencer_remove_state()", but not the other way around.
>      -
>      -    - Forbid invoking either function twice in a row.
>      -
>      -    But such paranoia isn't warranted here, let's instead take the easy
>      -    way out and FREE_AND_NULL() this.
>      +    Using a FREE_AND_NULL() pattern would also work, as it would be safe
>      +    replay_opts_release() repeatedly, but let's fix this properly instead,
>      +    by having the owner of the data free() it.
>       
>           See [2] for the initial implementation of "sequencer_remove_state()",
>           which assumed that it should be removing the full (including on-disk)
>      @@ sequencer.c: static const char *gpg_sign_opt_quoted(struct replay_opts *opts)
>        {
>        	free(opts->gpg_sign);
>        	free(opts->reflog_action);
>      -@@ sequencer.c: static void replay_opts_release(struct replay_opts *opts)
>      - 	free(opts->strategy);
>      - 	for (size_t i = 0; i < opts->xopts_nr; i++)
>      - 		free(opts->xopts[i]);
>      -+	opts->xopts_nr = 0;
>      - 	free(opts->xopts);
>      - 	strbuf_release(&opts->current_fixups);
>      - }
>       @@ sequencer.c: int sequencer_remove_state(struct replay_opts *opts)
>        		}
>        	}
>   4:  e83bdfab046 !  4:  9f72cc6e46b builtin/revert.c: move free-ing of "revs" to replay_opts_release()
>      @@ Commit message
>           rather than having these users reach into the struct to free its
>           individual members.
>       
>      -    As explained in earlier change we should be using FREE_AND_NULL() in
>      -    replay_opts_release() rather than free().
>      -
>           1. d1ec656d68f (cherry-pick: free "struct replay_opts" members,
>              2022-11-08)
>           2. fd74ac95ac3 (revert: free "struct replay_opts" members, 2022-07-01)
>      @@ builtin/revert.c: int cmd_cherry_pick(int argc, const char **argv, const char *p
>       
>        ## sequencer.c ##
>       @@ sequencer.c: void replay_opts_release(struct replay_opts *opts)
>      - 	opts->xopts_nr = 0;
>      + 		free(opts->xopts[i]);
>        	free(opts->xopts);
>        	strbuf_release(&opts->current_fixups);
>       +	if (opts->revs)
>   5:  4fea2b77c6d <  -:  ----------- builtin/rebase.c: rename "squash_onto_name" to "to_free"
>   6:  898bb7698fc !  5:  3d5c3152f69 builtin/rebase.c: fix "options.onto_name" leak
>      @@ Metadata
>        ## Commit message ##
>           builtin/rebase.c: fix "options.onto_name" leak
>       
>      -    In [1] we started saving away the earlier xstrdup()'d
>      -    "options.onto_name" assignment to free() it, but when [2] added this
>      -    "keep_base" branch it didn't free() the already assigned value before
>      -    re-assigning to "options.onto_name". Let's do that, and fix the memory
>      -    leak.
>      +    Similar to the existing "squash_onto_name" added in [1] we need to
>      +    free() the xstrdup()'d "options.onto.name" added for "--keep-base" in
>      +    [2]..
>       
>           1. 9dba809a69a (builtin rebase: support --root, 2018-09-04)
>           2. 414d924beb4 (rebase: teach rebase --keep-base, 2019-08-27)
>      @@ Commit message
>           Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>       
>        ## builtin/rebase.c ##
>      +@@ builtin/rebase.c: int cmd_rebase(int argc, const char **argv, const char *prefix)
>      + 	struct string_list strategy_options = STRING_LIST_INIT_NODUP;
>      + 	struct object_id squash_onto;
>      + 	char *squash_onto_name = NULL;
>      ++	char *keep_base_onto_name = NULL;
>      + 	int reschedule_failed_exec = -1;
>      + 	int allow_preemptive_ff = 1;
>      + 	int preserve_merges_selected = 0;
>       @@ builtin/rebase.c: int cmd_rebase(int argc, const char **argv, const char *prefix)
>        		strbuf_addstr(&buf, options.upstream_name);
>        		strbuf_addstr(&buf, "...");
>        		strbuf_addstr(&buf, branch_name);
>       -		options.onto_name = xstrdup(buf.buf);
>      -+		free(to_free);
>      -+		options.onto_name = to_free = xstrdup(buf.buf);
>      ++		options.onto_name = keep_base_onto_name = xstrdup(buf.buf);
>        	} else if (!options.onto_name)
>        		options.onto_name = options.upstream_name;
>        	if (strstr(options.onto_name, "...")) {
>      +@@ builtin/rebase.c: int cmd_rebase(int argc, const char **argv, const char *prefix)
>      + 	free(options.strategy);
>      + 	strbuf_release(&options.git_format_patch_opt);
>      + 	free(squash_onto_name);
>      ++	free(keep_base_onto_name);
>      + 	string_list_clear(&exec, 0);
>      + 	string_list_clear(&strategy_options, 0);
>      + 	return !!ret;
>       
>        ## t/t3416-rebase-onto-threedots.sh ##
>       @@ t/t3416-rebase-onto-threedots.sh: test_description='git rebase --onto A...B'
>   7:  fb38dc873f9 =  6:  c07dc006c6d sequencer.c: always free() the "msgbuf" in do_pick_commit()
>   8:  d4b0e2a5c83 !  7:  ee8262ab22a builtin/rebase.c: free() "options.strategy_opts"
>      @@ builtin/rebase.c: int cmd_rebase(int argc, const char **argv, const char *prefix
>        	free(options.strategy);
>       +	free(options.strategy_opts);
>        	strbuf_release(&options.git_format_patch_opt);
>      - 	free(to_free);
>      - 	string_list_clear(&exec, 0);
>      + 	free(squash_onto_name);
>      + 	free(keep_base_onto_name);
>   9:  fd9c7a5547c =  8:  84343ea6bf6 commit.c: free() revs.commit in get_fork_point()

^ permalink raw reply

* Re: [PATCH 4/5] sequencer: use the new hook API for the simpler "post-rewrite" call
From: Phillip Wood @ 2023-01-24 14:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Emily Shaffer, Junio C Hamano, Eric Sunshine, Felipe Contreras,
	Taylor Blau, Michael Strawbridge
In-Reply-To: <patch-4.5-7a55c95f60f-20230123T170551Z-avarab@gmail.com>

Hi Ævar

On 23/01/2023 17:15, Ævar Arnfjörð Bjarmason wrote:
> From: Emily Shaffer <emilyshaffer@google.com>
> 
> Change the invocation of the "post-rewrite" hook added in
> 795160457db (sequencer (rebase -i): run the post-rewrite hook, if
> needed, 2017-01-02) to use the new hook API.
> 
> This leaves the more complex "post-rewrite" invocation added in
> a87a6f3c98e (commit: move post-rewrite code to libgit, 2017-11-17)
> here in sequencer.c unconverted. That'll be done in a subsequent
> commit.

As a reader I'd find it more helpful to explain why the conversion isn't 
done here rather than leaving be to run "git show" to figure it out. If 
you re-roll perhaps we could replace the commit citation with something like

sequencer.c also contains an invocation of the "post-rewrite" hook in 
run_rewrite_hook() that is not converted as the hook API does not allow 
us to pass the hook input as a string yet.

Best Wishes

Phillip

> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   sequencer.c | 18 ++++--------------
>   1 file changed, 4 insertions(+), 14 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 3e4a1972897..d8d59d05dd4 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4834,8 +4834,7 @@ static int pick_commits(struct repository *r,
>   		if (!stat(rebase_path_rewritten_list(), &st) &&
>   				st.st_size > 0) {
>   			struct child_process child = CHILD_PROCESS_INIT;
> -			const char *post_rewrite_hook =
> -				find_hook("post-rewrite");
> +			struct run_hooks_opt hook_opt = RUN_HOOKS_OPT_INIT;
>   
>   			child.in = open(rebase_path_rewritten_list(), O_RDONLY);
>   			child.git_cmd = 1;
> @@ -4845,18 +4844,9 @@ static int pick_commits(struct repository *r,
>   			/* we don't care if this copying failed */
>   			run_command(&child);
>   
> -			if (post_rewrite_hook) {
> -				struct child_process hook = CHILD_PROCESS_INIT;
> -
> -				hook.in = open(rebase_path_rewritten_list(),
> -					O_RDONLY);
> -				hook.stdout_to_stderr = 1;
> -				hook.trace2_hook_name = "post-rewrite";
> -				strvec_push(&hook.args, post_rewrite_hook);
> -				strvec_push(&hook.args, "rebase");
> -				/* we don't care if this hook failed */
> -				run_command(&hook);
> -			}
> +			hook_opt.path_to_stdin = rebase_path_rewritten_list();
> +			strvec_push(&hook_opt.args, "rebase");
> +			run_hooks_opt("post-rewrite", &hook_opt);
>   		}
>   		apply_autostash(rebase_path_autostash());
>   

^ permalink raw reply

* Re: [PATCH v4 5/9] rebase: add coverage of other incompatible options
From: Junio C Hamano @ 2023-01-24 14:48 UTC (permalink / raw)
  To: Phillip Wood
  Cc: Elijah Newren, Elijah Newren via GitGitGadget, git,
	Derrick Stolee, Eric Sunshine, Martin Ågren
In-Reply-To: <7b9ee972-2680-2e1b-bef3-201d8a1e4bdd@dunelm.org.uk>

Phillip Wood <phillip.wood123@gmail.com> writes:

> On reflection that is only true for (B). I agree that we should error
> out on (A) which we don't at the moment.
>
> I'd support a change that errors out on (A) and (C) but continues to
> allow (B) and (D). I think we can do that with the diff below

Thanks, both of you, for well reasoned design work and respectful
communication.

^ permalink raw reply

* Re: nested submodules detection w/o .gitmodules file
From: Andry @ 2023-01-24 14:54 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git
In-Reply-To: <CAJoAoZkjfOQwkeQqzQY5qDo7Md5QWSz1pOTBQKHL5KwNu2VoDg@mail.gmail.com>

Hello Emily,



Monday, January 23, 2023, 11:12:32 PM, you wrote:

ES> There is a little nuance here. Git can have nested repositories in a few different ways; submodules are just one of them.
ES> A submodule is the combination of a gitlink object in the object graph *and* a corresponding entry in the .gitmodules file.
ES> It's certainly possible to embed a nested repository in other ways, such as by ignoring the .gitmodules file, but then your
ES> nested repository is no longer a submodule, and operations which recurse over submodules will not consider that nested repository.

But it does consider. See the output of the `git submodule summary` command:
https://github.com/gitextensions/gitextensions/discussions/10644#discussioncomment-4688533

ES> The reason is that we need to understand where to clone the submodule from - that information isn't contained in the superproject's
ES> repository URL

Yes, it does clone "externally", but after a clone it is contained in the `.git/config` of a subproject:

[submodule]
        active = .
[remote "origin"]
        url = ...

ES> , and it can't be contained in the gitlink directly (which is in essence just a commit object, but one that exists
ES> in the nested repository, not in the superproject repository). If we didn't have a way of writing down the submodule's remote URL
ES> and version controlling it, we wouldn't have a way to populate the submodules when a user is cloning.

The `git submodule summary` nevertheless can populate submodules after that, so this is not the matter.

ES> The .gitmodules exists to help at clone time; it's possible, as I
ES> think you're pointing out, to have some intermediate state locally.
ES> But this file is what needs to be the source of truth for putting
ES> together the repository on a new machine for the first time.

Yes, the original clone command is superseded in those projects.


^ permalink raw reply

* Inconsistency between git-log documentation and behavior regarding default date format.
From: Rafael Dulfer @ 2023-01-24 15:09 UTC (permalink / raw)
  To: git
In-Reply-To: <793c8116-f7ea-eef2-6979-231c3e94639a@dulfer.be>

The Git documentation 
<https://www.git-scm.com/docs/git-log#Documentation/git-log.txt---dateltformatgt> states 
that if no date format is given (or --date=default is given), the date 
format is similar to rfc2822 except of a few exceptions, of which those 
listed are:

   * There is no comma after the day-of-week
   * The time zone is omitted when the local time zone is used

However, if we were to compare the two date formats, you can see another 
difference:

git log --default  ->  Tue Jan 24 11:03:47 2023 +0100
git log --rfc          ->  Tue, 24 Jan 2023 11:03:47 +0100

With the default, the month and day-of-month are switched around. From 
my own quick investigation, this behavior occurs because of the 
statement found at date.c#L266 
<https://github.com/git/git/blob/56c8fb1e95377900ec9d53c07886022af0a5d3c2/date.c#L266> 
wherein the month is inserted before the day-of-month. I am unsure which 
behavior is exactly intended and whether this discrepancy was known, but 
it would probably be a good idea to have a note of it in the documentation.

^ permalink raw reply

* Re: [PATCH v2 01/10] bundle: optionally skip reachability walk
From: Junio C Hamano @ 2023-01-24 15:22 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, me, vdye, avarab, steadmon,
	chooglen
In-Reply-To: <771a2993-85bd-0831-0977-24204f84e206@github.com>

Derrick Stolee <derrickstolee@github.com> writes:

> I'll focus on this area today and see what I can learn and how I
> can approach this problem in a different way. The current options
> that I see are:
>
>  1. Leave verify_bundle() as-is and figure out how to refresh the
>     refs. (This would remain a stricter check than necessary.)

Even if we switch to "assume everything is OK, remember a few key
facts (like prerequisites and tips) about each bundle as we unpack
them, and sanity check the results at the end" approach, doesn't
that last step require us to be able to see the final state of the
refs?  If so, wouldn't we need to figure out how to refresh the refs
no matter what?

>  2. Find out how to modify verify_bundle() so it can do the more
>     relaxed connectivity check.

I am not sure what kind of relaxing you have in mind, but as long as
we can guarantee the connectedness of the end result

>  3. Take the connectivity check that fetch uses before updating
>     refs and add that check before updating refs in the bundle URI
>     code.

This is optional at much lower priority, isn't it?  In the second
example in the message you are responding to, I do not think it is
too bad to reject the bundle based on '8' that has been rewound away
(in other words, a bundle publisher ought to be basing their bundles
on well publicized and commonly available commits).  Only when we
try to be overly helpful to such a use case, it becomes necessary to
loosen the rule from "all prerequisites must be reachable from
existing refs" to "or prerequisites that are not reachable from any
refs are also OK if they pass check_connected()".

The current check to require that prerequisites are reachable from
refs does not have to check trees and blobs, because any commit that
is reachable from an existing ref is complete[*] by definition.

    Let's define a term: a commit is "complete" iff it is not
    missing any objects that it (recursively) references to.

The check done by check_connected() is more expensive because it has
to prove that a commit, which is found in the object store and may
or may not be reachable from any refs, is complete.  The tranversal
still can take advantage of the fact that commits _reachable_ from
refs are guaranteed to be complete, but until the traversal reaches
a commit that is reachable from refs (e.g. when inspecting commits
'8' and then '7' until it reaches '6', in the second example in the
message you are responding to) we need to look at trees and blobs.

> There could also be a combination of (2) and (3), or others I have
> not considered until I go poking around in the code.
>
> I'll let you know what I find.

Thanks.  Unlike areas that allow glitches as long as workarounds are
available (e.g. UI), the object store + refs layer is where it is
absolutely required to be correct.  I am happy to see capable minds
are on it.

^ permalink raw reply

* [PATCH v2] ssh signing: better error message when key not in agent
From: Adam Szkoda via GitGitGadget @ 2023-01-24 15:26 UTC (permalink / raw)
  To: git; +Cc: Phillip Wood, Adam Szkoda, Fabian Stelzer, Adam Szkoda,
	Adam Szkoda
In-Reply-To: <pull.1270.git.git.1674029874363.gitgitgadget@gmail.com>

From: Adam Szkoda <adaszko@gmail.com>

When signing a commit with a SSH key, with the private key missing from
ssh-agent, a confusing error message is produced:

    error: Load key
    "/var/folders/t5/cscwwl_n3n1_8_5j_00x_3t40000gn/T//.git_signing_key_tmpkArSj7":
    invalid format? fatal: failed to write commit object

The temporary file .git_signing_key_tmpkArSj7 created by git contains a
valid *public* key.  The error message comes from `ssh-keygen -Y sign' and
is caused by a fallback mechanism in ssh-keygen whereby it tries to
interpret .git_signing_key_tmpkArSj7 as a *private* key if it can't find in
the agent [1].  A fix is scheduled to be released in OpenSSH 9.1. All that
needs to be done is to pass an additional backward-compatible option -U to
'ssh-keygen -Y sign' call.  With '-U', ssh-keygen always interprets the file
as public key and expects to find the private key in the agent.

As a result, when the private key is missing from the agent, a more accurate
error message gets produced:

    error: Couldn't find key in agent

[1] https://bugzilla.mindrot.org/show_bug.cgi?id=3429

Signed-off-by: Adam Szkoda <adaszko@gmail.com>
---
    ssh signing: better error message when key not in agent
    
    When signing a commit with a SSH key, with the private key missing from
    ssh-agent, a confusing error message is produced:
    
    error: Load key "/var/folders/t5/cscwwl_n3n1_8_5j_00x_3t40000gn/T//.git_signing_key_tmpkArSj7": invalid format?
    fatal: failed to write commit object
    
    
    The temporary file .git_signing_key_tmpkArSj7 created by git contains a
    valid public key. The error message comes from `ssh-keygen -Y sign' and
    is caused by a fallback mechanism in ssh-keygen whereby it tries to
    interpret .git_signing_key_tmpkArSj7 as a private key if it can't find
    in the agent [1]. A fix is scheduled to be released in OpenSSH 9.1. All
    that needs to be done is to pass an additional backward-compatible
    option -U to 'ssh-keygen -Y sign' call. With '-U', ssh-keygen always
    interprets the file as public key and expects to find the private key in
    the agent.
    
    As a result, when the private key is missing from the agent, a more
    accurate error message gets produced:
    
    error: Couldn't find key in agent
    
    
    [1] https://bugzilla.mindrot.org/show_bug.cgi?id=3429

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1270%2Fradicle-dev%2Fmaint-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1270/radicle-dev/maint-v2
Pull-Request: https://github.com/git/git/pull/1270

Range-diff vs v1:

 1:  0ce06076242 < -:  ----------- ssh signing: better error message when key not in agent
 -:  ----------- > 1:  03dfca79387 ssh signing: better error message when key not in agent


 gpg-interface.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index f877a1ea564..33899a450eb 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -998,6 +998,7 @@ static int sign_buffer_ssh(struct strbuf *buffer, struct strbuf *signature,
 	char *ssh_signing_key_file = NULL;
 	struct strbuf ssh_signature_filename = STRBUF_INIT;
 	const char *literal_key = NULL;
+	int literal_ssh_key = 0;
 
 	if (!signing_key || signing_key[0] == '\0')
 		return error(
@@ -1005,6 +1006,7 @@ static int sign_buffer_ssh(struct strbuf *buffer, struct strbuf *signature,
 
 	if (is_literal_ssh_key(signing_key, &literal_key)) {
 		/* A literal ssh key */
+		literal_ssh_key = 1;
 		key_file = mks_tempfile_t(".git_signing_key_tmpXXXXXX");
 		if (!key_file)
 			return error_errno(
@@ -1036,11 +1038,14 @@ static int sign_buffer_ssh(struct strbuf *buffer, struct strbuf *signature,
 	}
 
 	strvec_pushl(&signer.args, use_format->program,
-		     "-Y", "sign",
-		     "-n", "git",
-		     "-f", ssh_signing_key_file,
-		     buffer_file->filename.buf,
-		     NULL);
+			"-Y", "sign",
+			"-n", "git",
+			"-f", ssh_signing_key_file,
+			NULL);
+	if (literal_ssh_key) {
+		strvec_push(&signer.args, "-U");
+	}
+	strvec_push(&signer.args, buffer_file->filename.buf);
 
 	sigchain_push(SIGPIPE, SIG_IGN);
 	ret = pipe_command(&signer, NULL, 0, NULL, 0, &signer_stderr, 0);

base-commit: 844ede312b4e988881b6e27e352f469d8ab80b2a
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH v2] ls-tree: add --sparse-filter-oid argument
From: William Sprent @ 2023-01-24 15:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	William Sprent via GitGitGadget
  Cc: git, Eric Sunshine, Derrick Stolee, Victoria Dye, Elijah Newren
In-Reply-To: <230123.865ycxtn7i.gmgdl@evledraar.gmail.com>

On 23/01/2023 14.00, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Jan 23 2023, William Sprent via GitGitGadget wrote:
> 
>> From: William Sprent <williams@unity3d.com>
>>
>> There is currently no way to ask git the question "which files would be
>> part of a sparse checkout of commit X with sparse checkout patterns Y".
>> One use-case would be that tooling may want know whether sparse checkouts
>> of two commits contain the same content even if the full trees differ.
>> Another interesting use-case would be for tooling to use in conjunction
>> with 'git update-index --index-info'.
> 
> Rather than commenting on individual points, I checked this out and
> produced something squashable on-top, it fixes various issues (some
> aspects of which still remain):

Thanks. The changes all make sense to me. I'll apply them for v3 and
shorten the remaining long lines.

> 
>   * You need to wrap your code at 79 chars (and some of that could be
>     helped by picking less verbose identifiers & variable names,
>     e.g. just use "context" rather than "read_tree_context" in
>     cmd_ls_tree(), it has no other "context"...)>   * You're making the memory management aroind init_sparse_filter_data()
>     needlessly hard, just put it on the stack instead. That also allows
>     for init-ing most of it right away, or via a macro in the assignment.

Fair point. I do think that having all the initialisation of the struct
in one place makes the codes a bit more readable. But I guess it doesn't
make a big difference in this case.

>   * You have a stray refactoring of dir.c in your proposed change, this
>     changes it back, let's leave that sort of thing out.
>   * You're adding a function that returns an enum, but you declare it as
>     returning "int", let's retain that type in the header & declaration.

These were oversights. Thanks for catching them.

> 
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 46a815fc7dc..68d6ef675f2 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -372,36 +372,37 @@ struct sparse_filter_data {
>   	struct ls_tree_options *options;
>   };
>   
> -static void init_sparse_filter_data(struct sparse_filter_data **d, struct ls_tree_options *options,
> -	const char *sparse_oid_name, read_tree_fn_t fn)
> +static void init_sparse_filter_data(struct sparse_filter_data *d,
> +				    const char *sparse_oid_name)
>   {
>   	struct object_id sparse_oid;
>   	struct object_context oc;
>   
> -	(*d) = xcalloc(1, sizeof(**d));
> -	(*d)->fn = fn;
> -	(*d)->pl.use_cone_patterns = core_sparse_checkout_cone;
> -	(*d)->options = options;
> -	strbuf_init(&(*d)->full_path_buf, 0);
> -
>   	if (get_oid_with_context(the_repository, sparse_oid_name, GET_OID_BLOB,
>   			&sparse_oid, &oc))
> -		die(_("unable to access sparse blob in '%s'"), sparse_oid_name);
> -	if (add_patterns_from_blob_to_list(&sparse_oid, "", 0, &(*d)->pl) < 0)
> +		die(_("unable to access sparse blob in '%s'"),
> +		    sparse_oid_name);
> +	if (add_patterns_from_blob_to_list(&sparse_oid, "", 0, &d->pl) < 0)
>   		die(_("unable to parse sparse filter data in %s"),
>   		    oid_to_hex(&sparse_oid));
>   }
>   
> -static void free_sparse_filter_data(struct sparse_filter_data *d)
> +static void release_sparse_filter_data(struct sparse_filter_data *d)
>   {
>   	clear_pattern_list(&d->pl);
>   	strbuf_release(&d->full_path_buf);
> -	free(d);
>   }
>   
> -static int path_matches_sparse_checkout_patterns(struct strbuf *full_path, struct pattern_list *pl, int dtype)
> +static int path_matches_sparse_checkout_patterns(struct strbuf *full_path,
> +						 struct pattern_list *pl,
> +						 int dtype)
>   {
> -	enum pattern_match_result match = recursively_match_path_with_sparse_patterns(full_path->buf, the_repository->index, dtype, pl);
> +	enum pattern_match_result match;
> +
> +	match = recursively_match_path_with_sparse_patterns(full_path->buf,
> +							    the_repository->index,
> +							    dtype, pl);
> +
>   	return match > 0;
>   }
>   
> @@ -436,7 +437,10 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>   	struct ls_tree_options options = { 0 };
>   	char *sparse_oid_name = NULL;
>   	void *read_tree_context = NULL;
> -	struct sparse_filter_data *sparse_filter_data = NULL;
> +	struct sparse_filter_data sparse_filter_data = {
> +		.options = &options,
> +		.full_path_buf = STRBUF_INIT,
> +	};
>   	const struct option ls_tree_options[] = {
>   		OPT_BIT('d', NULL, &options.ls_options, N_("only show trees"),
>   			LS_TREE_ONLY),
> @@ -542,16 +546,17 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>   	}
>   
>   	if (sparse_oid_name) {
> -		init_sparse_filter_data(&sparse_filter_data, &options, sparse_oid_name, fn);
> -		read_tree_context = sparse_filter_data;
> +		sparse_filter_data.fn = fn;
> +		init_sparse_filter_data(&sparse_filter_data, sparse_oid_name);
> +		read_tree_context = &sparse_filter_data;
>   		fn = filter_sparse;
>   	} else
>   		read_tree_context = &options;
>   
> -	ret = !!read_tree(the_repository, tree, &options.pathspec, fn, read_tree_context);
> +	ret = !!read_tree(the_repository, tree, &options.pathspec, fn,
> +			  read_tree_context);
>   	clear_pathspec(&options.pathspec);
> -	if (sparse_filter_data)
> -		free_sparse_filter_data(sparse_filter_data);
> +	release_sparse_filter_data(&sparse_filter_data);
>   
>   	return ret;
>   }
> diff --git a/dir.c b/dir.c
> index 122ebced08e..57ec6404901 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1457,10 +1457,10 @@ int init_sparse_checkout_patterns(struct index_state *istate)
>   	return 0;
>   }
>   
> -int recursively_match_path_with_sparse_patterns(const char *path,
> -						struct index_state *istate,
> -						int dtype,
> -						struct pattern_list *pl)
> +enum pattern_match_result recursively_match_path_with_sparse_patterns(const char *path,
> +								      struct index_state *istate,
> +								      int dtype,
> +								      struct pattern_list *pl)
>   {
>   	enum pattern_match_result match = UNDECIDED;
>   	const char *end, *slash;
> @@ -1470,7 +1470,8 @@ int recursively_match_path_with_sparse_patterns(const char *path,
>   	 * never returns UNDECIDED, so we will execute only one iteration in
>   	 * this case.
>   	 */
> -	for (end = path + strlen(path); end > path && match == UNDECIDED;
> +	for (end = path + strlen(path);
> +	     end > path && match == UNDECIDED;
>   	     end = slash) {
>   		for (slash = end - 1; slash > path && *slash != '/'; slash--)
>   			; /* do nothing */
> diff --git a/dir.h b/dir.h
> index d186d271606..3c65e68ca40 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -403,10 +403,10 @@ int path_in_sparse_checkout(const char *path,
>   int path_in_cone_mode_sparse_checkout(const char *path,
>   				      struct index_state *istate);
>   
> -int recursively_match_path_with_sparse_patterns(const char *path,
> -						struct index_state *istate,
> -						int dtype,
> -						struct pattern_list *pl);
> +enum pattern_match_result recursively_match_path_with_sparse_patterns(const char *path,
> +								      struct index_state *istate,
> +								      int dtype,
> +								      struct pattern_list *pl);
>   
>   struct dir_entry *dir_add_ignored(struct dir_struct *dir,
>   				  struct index_state *istate,


^ permalink raw reply

* Re: [PATCH v2] ls-tree: add --sparse-filter-oid argument
From: William Sprent @ 2023-01-24 15:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason,
	William Sprent via GitGitGadget
  Cc: git, Eric Sunshine, Derrick Stolee, Victoria Dye, Elijah Newren
In-Reply-To: <230123.861qnltmbp.gmgdl@evledraar.gmail.com>

On 23/01/2023 14.06, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, Jan 23 2023, William Sprent via GitGitGadget wrote:
> 
>> From: William Sprent <williams@unity3d.com>
> 
> ...some further inline commentary, in addition to my proposed squash-in...
> 

Thanks for taking the time. :)

>> diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
>> index 0240adb8eec..724bb1f9dbd 100644
>> --- a/Documentation/git-ls-tree.txt
>> +++ b/Documentation/git-ls-tree.txt
>> @@ -11,6 +11,7 @@ SYNOPSIS
>>   [verse]
>>   'git ls-tree' [-d] [-r] [-t] [-l] [-z]
>>   	    [--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
>> +	    [--sparse-filter-oid=<blob-ish>]
>>   	    <tree-ish> [<path>...]
>>   
>>   DESCRIPTION
>> @@ -48,6 +49,11 @@ OPTIONS
>>   	Show tree entries even when going to recurse them. Has no effect
>>   	if `-r` was not passed. `-d` implies `-t`.
>>   
>> +--sparse-filter-oid=<blob-ish>::
>> +	Omit showing tree objects and paths that do not match the
>> +	sparse-checkout specification contained in the blob
>> +	(or blob-expression) <blob-ish>.
>> +
>>   -l::
>>   --long::
>>   	Show object size of blob (file) entries.
>> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
>> index 72eb70823d5..46a815fc7dc 100644
>> --- a/builtin/ls-tree.c
>> +++ b/builtin/ls-tree.c
>> @@ -13,6 +13,7 @@
>>   #include "builtin.h"
>>   #include "parse-options.h"
>>   #include "pathspec.h"
>> +#include "dir.h"
>>   
>>   static const char * const ls_tree_usage[] = {
>>   	N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
>> @@ -364,6 +365,65 @@ static struct ls_tree_cmdmode_to_fmt ls_tree_cmdmode_format[] = {
>>   	},
>>   };
> 
> Okey, here yo uupdate the *.txt, but not the "-h", but it's one of those
> where way say <options>.
> 
> I for one wouldn't mind some preceding change like my 423be1f83c5 (doc
> txt & -h consistency: make "commit" consistent, 2022-10-13), which in
> turn would make t/t0450-txt-doc-vs-help.sh pass for this command, but
> maybe it's too much of a digression...
> 

I don't mind doing that.

>> +	(*d) = xcalloc(1, sizeof(**d));
>> +	(*d)->fn = fn;
>> +	(*d)->pl.use_cone_patterns = core_sparse_checkout_cone;
> 
> I forgot to note in my proposed fix-up that I omitted this, but your
> tests still pass, which suggests it's either not needed, or your tests
> are lacking....
> 

Well.. The line exists to enable the cone mode optimisations. The only 
observable differences are performance and that a warning is emitted 
when cone mode is enabled in the config but the patterns given aren't in 
the cone mode pattern set.

I can look into writing a performance test that captures the performance 
differences.

I can also test for the latter behaviour. But that is testing for the 
side effect of a main behaviour to ensure that the main behaviour is 
there. Which isn't the nicest thing I can think of, but it might be 
better than nothing.

>> +	(*d)->options = options;
>> +	strbuf_init(&(*d)->full_path_buf, 0);
>> +
>> +	if (get_oid_with_context(the_repository, sparse_oid_name, GET_OID_BLOB,
>> +			&sparse_oid, &oc))
>> +		die(_("unable to access sparse blob in '%s'"), sparse_oid_name);
>> +	if (add_patterns_from_blob_to_list(&sparse_oid, "", 0, &(*d)->pl) < 0)
> 
> As noted you can avoid the malloc here, which also sidesteps this (IMO
> at last) ugly &(*v)->m syntax.
> 

Right.

>> +		die(_("unable to parse sparse filter data in %s"),
>> +		    oid_to_hex(&sparse_oid));
>> +}
>> +
>> +static void free_sparse_filter_data(struct sparse_filter_data *d)
>> +{
>> +	clear_pattern_list(&d->pl);
>> +	strbuf_release(&d->full_path_buf);
>> +	free(d);
>> +}
>> +
>> +static int path_matches_sparse_checkout_patterns(struct strbuf *full_path, struct pattern_list *pl, int dtype)
>> +{
>> +	enum pattern_match_result match = recursively_match_path_with_sparse_patterns(full_path->buf, the_repository->index, dtype, pl);
> 
> I for one would welcome e.g. abbreviating "sparse_checkout_patterns" as
> "scp" or whatever throughout, with resulting shortening of very long
> lines & identifiers. E.g. for this one "patch_match_scp" or whatever.
> 
> 

I don't mind either way, but a quick search doesn't reveal any of uses 
of an abbreviation like, though.

Either way, I could also drop the '_checkout_' from 
'sparse_checkout_patterns'. I don't think that would make it less clear 
what we are talking about.

>> +	struct sparse_filter_data *data = context;
>> +
> 
> Style: Don't add a \n\n between variable decls,
> 
>> +	int do_recurse = show_recursive(data->options, base->buf, base->len, pathname);
> 
> Instead add it here, before the code proper.
> 
> 

Alright. I'll apply that.

>> +	if (object_type(mode) == OBJ_TREE)
>> +		return do_recurse;
>> +
>> +	strbuf_reset(&data->full_path_buf);
> 
> I wondered if we need this after, but we don't, I for one would welcome a fix-up like:
> 	
> 	diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> 	index 68d6ef675f2..74dfa9a4580 100644
> 	--- a/builtin/ls-tree.c
> 	+++ b/builtin/ls-tree.c
> 	@@ -410,19 +410,21 @@ static int filter_sparse(const struct object_id *oid, struct strbuf *base,
> 	 			 const char *pathname, unsigned mode, void *context)
> 	 {
> 	 	struct sparse_filter_data *data = context;
> 	-
> 	 	int do_recurse = show_recursive(data->options, base->buf, base->len, pathname);
> 	+	int ret = 0;
> 	+
> 	 	if (object_type(mode) == OBJ_TREE)
> 	 		return do_recurse;
> 	
> 	-	strbuf_reset(&data->full_path_buf);
> 	 	strbuf_addbuf(&data->full_path_buf, base);
> 	 	strbuf_addstr(&data->full_path_buf, pathname);
> 	
> 	 	if (!path_matches_sparse_checkout_patterns(&data->full_path_buf, &data->pl, DT_REG))
> 	-			return 0;
> 	-
> 	-	return data->fn(oid, base, pathname, mode, data->options);
> 	+		goto cleanup;
> 	+	ret = data->fn(oid, base, pathname, mode, data->options);
> 	+cleanup:
> 	+	strbuf_reset(&data->full_path_buf);
> 	+	return ret;
> 	 }
> 	
> 	 int cmd_ls_tree(int argc, const char **argv, const char *prefix)
> 	
> It's slightly more verbose, and we do end up needlessly reset-ing the
> "last" one, but makes it clear that we don't have need for this after
> this.
> 
> Which to me, also raises the question of why you have this
> "full_path_buf" at all. It looks like a memory optimization, as you're
> trying to avoid a malloc()/free() in the loop.
> 
> That's fair, but you could also do that with:
> 
> 	const size_t oldlen = base->len;
> 	strbuf_addstr(base, pathname);
>          /* do the path_matches_sparse_checkout_patterns() call here */
> 	/* before "cleanup" */
> 	strbuf_setlen(base, oldlen);
> 
> Or whatever, those snippets are untested, but we use similar tricks
> elsewhere, and as it's contained to a few lines here I think it's better
> than carrying another buffer.
> 

Okay. That makes sense. And if it is a common trick then I guess it 
isn't too mysterious.

>> +	strbuf_addbuf(&data->full_path_buf, base);
>> +	strbuf_addstr(&data->full_path_buf, pathname);
> 
> Nit: Shorter as: strbuf_addf(&sb, "%s%s", base.buf, pathname) (or equivalent);
> 
>> +
>> +	if (!path_matches_sparse_checkout_patterns(&data->full_path_buf, &data->pl, DT_REG))
>> +			return 0;
>> +
>> +	return data->fn(oid, base, pathname, mode, data->options);
>> +}
>> +
>>   int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>>   {
>>   	struct object_id oid;
>> @@ -372,7 +432,11 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>>   	read_tree_fn_t fn = NULL;
>>   	enum ls_tree_cmdmode cmdmode = MODE_DEFAULT;
>>   	int null_termination = 0;
>> +
> 
> Don't add this stray \n.
> 

Ok.

>>   	struct ls_tree_options options = { 0 };
>> +	char *sparse_oid_name = NULL;
>> +	void *read_tree_context = NULL;
>> +	struct sparse_filter_data *sparse_filter_data = NULL;
>>   	const struct option ls_tree_options[] = {
>>   		OPT_BIT('d', NULL, &options.ls_options, N_("only show trees"),
>>   			LS_TREE_ONLY),
>> @@ -399,6 +463,9 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>>   					 N_("format to use for the output"),
>>   					 PARSE_OPT_NONEG),
>>   		OPT__ABBREV(&options.abbrev),
>> +		OPT_STRING_F(0, "filter-sparse-oid", &sparse_oid_name, N_("filter-sparse-oid"),
> 
> Make that N_(...) be "object-id", "oid", "hash" or something?
> 
> I.e. the RHS with the <type> should be <thingy>, not
> <thingy-for-some-purpose>.
> 

Right. I've used the "blob-ish" wording in the .txt file, I think it 
makes sense to reuse it here.

>> +			   N_("filter output with sparse-checkout specification contained in the blob"),
>> +			     PARSE_OPT_NONEG),
>>   		OPT_END()
>>   	};
>>   	struct ls_tree_cmdmode_to_fmt *m2f = ls_tree_cmdmode_format;
>> @@ -474,7 +541,17 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
>>   		break;
>>   	}
>>   
>> -	ret = !!read_tree(the_repository, tree, &options.pathspec, fn, &options);
>> +	if (sparse_oid_name) {
>> +		init_sparse_filter_data(&sparse_filter_data, &options, sparse_oid_name, fn);
>> +		read_tree_context = sparse_filter_data;
>> +		fn = filter_sparse;
>> +	} else
>> +		read_tree_context = &options;
> 
> You're missing a {} here for the "else".
> 
> Better yet, delete that "else" and change the decl above to be:
> 
> 	void *context = &options;
> 
> Then just keep the "if" here, where you might clobber the "context".
> 

I'll delete the else as you suggest.

>> +
>> +	ret = !!read_tree(the_repository, tree, &options.pathspec, fn, read_tree_context);
>>   	clear_pathspec(&options.pathspec);
>> +	if (sparse_filter_data)
>> +		free_sparse_filter_data(sparse_filter_data);
> 
> I suggested a fixup for this, but as an aside don't make a free_*()
> function that's not happy to accept NULL, it should work like the free()
> itself.
> 

Ah. That's a fair point. Thanks.

>> +
>>   	return ret;
>>   }
>> diff --git a/dir.c b/dir.c
>> index 4e99f0c868f..122ebced08e 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -1457,45 +1457,50 @@ int init_sparse_checkout_patterns(struct index_state *istate)
>>   	return 0;
>>   }
>>   
>> -static int path_in_sparse_checkout_1(const char *path,
>> -				     struct index_state *istate,
>> -				     int require_cone_mode)
>> +int recursively_match_path_with_sparse_patterns(const char *path,
>> +						struct index_state *istate,
>> +						int dtype,
>> +						struct pattern_list *pl)
>>   {
>> -	int dtype = DT_REG;
>>   	enum pattern_match_result match = UNDECIDED;
>>   	const char *end, *slash;
>> -
>> -	/*
>> -	 * We default to accepting a path if the path is empty, there are no
>> -	 * patterns, or the patterns are of the wrong type.
>> -	 */
>> -	if (!*path ||
>> -	    init_sparse_checkout_patterns(istate) ||
>> -	    (require_cone_mode &&
>> -	     !istate->sparse_checkout_patterns->use_cone_patterns))
>> -		return 1;
>> -
>>   	/*
>>   	 * If UNDECIDED, use the match from the parent dir (recursively), or
>>   	 * fall back to NOT_MATCHED at the topmost level. Note that cone mode
>>   	 * never returns UNDECIDED, so we will execute only one iteration in
>>   	 * this case.
>>   	 */
>> -	for (end = path + strlen(path);
>> -	     end > path && match == UNDECIDED;
>> +	for (end = path + strlen(path); end > path && match == UNDECIDED;
> 
> All good, aside from this whitespace refactoring, as noted.
> 

Ok.

>>   	     end = slash) {
>> -
>>   		for (slash = end - 1; slash > path && *slash != '/'; slash--)
>>   			; /* do nothing */
>>   
>>   		match = path_matches_pattern_list(path, end - path,
>>   				slash > path ? slash + 1 : path, &dtype,
>> -				istate->sparse_checkout_patterns, istate);
>> +				pl, istate);
>>   
>>   		/* We are going to match the parent dir now */
>>   		dtype = DT_DIR;
>>   	}
>> -	return match > 0;
>> +
>> +	return match;
>> +}
>> +
>> +static int path_in_sparse_checkout_1(const char *path,
>> +				     struct index_state *istate,
>> +				     int require_cone_mode)
>> +{
>> +	/*
>> +	 * We default to accepting a path if the path is empty, there are no
>> +	 * patterns, or the patterns are of the wrong type.
>> +	 */
>> +	if (!*path ||
>> +	    init_sparse_checkout_patterns(istate) ||
>> +	    (require_cone_mode &&
>> +	     !istate->sparse_checkout_patterns->use_cone_patterns))
>> +		return 1;
>> +
>> +	return recursively_match_path_with_sparse_patterns(path, istate, DT_REG, istate->sparse_checkout_patterns) > 0;
> 
> All good, except for the really long line, aside from the previous
> suggestion, maybe pull "istate->sparse_checkout_patterns" into a
> variable, as it's now used twice here in this function?
> 

I think that makes sense.

>> +check_agrees_with_ls_files () {
>> +	REPO=repo  &&
>> +	git -C $REPO submodule deinit -f --all &&
>> +	git -C $REPO cat-file -p $filter_oid >"$REPO/.git/info/sparse-checkout" &&
>> +	git -C $REPO sparse-checkout init --cone &&
>> +	git -C $REPO submodule init &&
>> +	git -C $REPO ls-files -t >out &&
>> +	sed -n "/^S /!s/^. //p" out >expect &&
>> +	test_cmp expect actual
> 
> Instead of this last "sed" munging, why not use the "--format" option to
> "ls-tree" to just make the output the same?

I hadn't thought about changing that side of the output. I would have to 
call ls-tree once more instead of relying on the 'expect' file to be 
there when the function is called. But that might be nicer anyway.


^ permalink raw reply

* Re: [PATCH 1/2] gitk: prevent overly long command lines
From: Junio C Hamano @ 2023-01-24 15:38 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Paul Mackerras, Johannes Schindelin
In-Reply-To: <89fe0380cd3a373e7e23d663b506466fd6cb5fb6.1674559397.git.gitgitgadget@gmail.com>

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> To avoid running into command line limitations, some of Git's commands
> support the `--stdin` option.
>
> Let's use exactly this option in the three rev-list/log invocations in
> gitk that would otherwise possibly run the danger of trying to invoke a
> too-long command line.

Makes perfect sense.  I do not know the point of saying exactly
here, though.

> While it is easy to redirect either stdin or stdout in Tcl/Tk scripts,
> what we need here is both. We need to capture the output, yet we also
> need to pipe in the revs/files arguments via stdin (because stdin does
> not have any limit, unlike the command line). To help this, we use the
> neat Tcl feature where you can capture stdout and at the same time feed
> a fixed string as stdin to the spawned process.

Nice, so this is not about "we may have too many args to fit in
our memory", but about "we may have too many args for system to
spawn the subprocess with".

> One non-obvious aspect about this change is that the `--stdin` option
> allows to specify revs, the double-dash, and files, but *no* other
> options such as `--not`.

It sounds like a design mistake, which may want to be fixed, but of
course gitk cannot depend on Git it runs with having such a fix, and
use of "^" prefix is a good alternative (after all, "--not" was
invented to save us writing ^ in front of many revs).

Good.


^ permalink raw reply

* Re: [PATCH v4 5/9] rebase: add coverage of other incompatible options
From: Elijah Newren @ 2023-01-24 15:41 UTC (permalink / raw)
  To: phillip.wood
  Cc: Elijah Newren via GitGitGadget, git, Derrick Stolee,
	Eric Sunshine, Martin Ågren, Phillip Wood
In-Reply-To: <7b9ee972-2680-2e1b-bef3-201d8a1e4bdd@dunelm.org.uk>

Hi Phillip,

On Tue, Jan 24, 2023 at 5:16 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> >>>> Signed-off-by: Elijah Newren <newren@gmail.com>
> >>>> --- a/builtin/rebase.c
> >>>> +++ b/builtin/rebase.c
> >>>> @@ -1224,6 +1224,26 @@ int cmd_rebase(int argc, const char **argv,
> >>>> const char *prefix)
> >>>>                if (options.fork_point < 0)
> >>>>                        options.fork_point = 0;
> >>>>        }
> >>>> +     /*
> >>>> +      * The apply backend does not support
> >>>> --[no-]reapply-cherry-picks.
> >>>> +      * The behavior it implements by default is equivalent to
> >>>> +      * --no-reapply-cherry-picks (due to passing --cherry-picks to
> >>>> +      * format-patch), but --keep-base alters the upstream such
> >>>> that no
> >>>> +      * cherry-picks can be found (effectively making it act like
> >>>> +      * --reapply-cherry-picks).
> >>>> +      *
> >>>> +      * Now, if the user does specify --[no-]reapply-cherry-picks, but
> >>>> +      * does so in such a way that options.reapply_cherry_picks ==
> >>>> +      * keep_base, then the behavior they get will match what they
> >>>> +      * expect despite options.reapply_cherry_picks being ignored.  We
> >>>> +      * could just allow the flag in that case, but it seems better to
> >>>> +      * just alert the user that they've specified a flag that the
> >>>> +      * backend ignores.
> >>>> +      */
> >>>
> >>> I'm a bit confused by this. --keep-base works with either
> >>> --reapply-cherry-picks (which is the default if --keep-base is given) or
> >>> --no-reapply-cherry-picks. Just below this hunk we have
> >>>
> >>>          if (options.reapply_cherry_picks < 0)
> >>>                  options.reapply_cherry_picks = keep_base;
> >>>
> >>> So we only set options.reapply_cherry_picks to match keep_base if the
> >>> user did not specify -[-no]-reapply-cherry-picks on the commandline.
> >>
> >> options.reapply_cherry_picks is totally ignored by the apply backend,
> >> regardless of whether it's set by the user or the setup code in
> >> builtin/rebase.c.  And if we have an option which is ignored, isn't it
> >> nicer to provide an error message to the user if they tried to set it?
> >>
> >> Said another way, while users could start with these command lines:
> >>
> >>      (Y) git rebase --whitespace=fix
> >>      (Z) git rebase --whitespace=fix --keep-base
> >>
> >> and modify them to include flags that would be ignored, we could allow:
> >>
> >>      (A) git rebase --whitespace=fix --no-reapply-cherry-picks
> >>      (B) git rebase --whitespace=fix --keep-base --reapply-cherry-picks
> >>
> >> But we could not allow commands like
> >>
> >>      (C) git rebase --whitespace=fix --reapply-cherry-picks
> >>      (D) git rebase --whitespace=fix --keep-base
> >> --no-reapply-cherry-picks
> >
> > (C) is already an error
> > (D) is currently allowed and I think works as expected (--keep-base only
> > implies --reapply-cherry-picks, the user is free to override that with
> > --no-reapply-cherry-picks) so I don't see why we'd want to make it an
> > error.

Ah, despite looking over the code multiple times to check my
statements, I somehow kept missing this:

    if (keep_base && options.reapply_cherry_picks)
        options.upstream = options.onto;

which is how --[no-]reapply-cherry-picks is supported in conjunction
with --keep-base.  Thanks.

> >> For all four cases (A)-(D), the apply backend will ignore whatever
> >> --[no-]reapply-cherry-picks flag is provided.
> >
> > For (D) the flag is respected, (C) errors out, the other cases
> > correspond to the default so it's like saying
> >
> >      git rebase --merge --no-reapply-cherry-picks
> >
> > ignores the flag.
>
> On reflection that is only true for (B). I agree that we should error
> out on (A) which we don't at the moment.
>
> I'd support a change that errors out on (A) and (C) but continues to
> allow (B) and (D). I think we can do that with the diff below
>
>
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 1481c5b6a5..66aef356b8 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -1230,12 +1230,6 @@ int cmd_rebase(int argc, const char **argv, const
> char *prefix)
>                   if (options.fork_point < 0)
>                           options.fork_point = 0;
>           }
> -        /*
> -         * --keep-base defaults to --reapply-cherry-picks to avoid losing
> -         * commits when using this option.
> -         */
> -        if (options.reapply_cherry_picks < 0)
> -                options.reapply_cherry_picks = keep_base;
>
>           if (options.root && options.fork_point > 0)
>                   die(_("options '%s' and '%s' cannot be used
> together"), "--root", "--fork-point");
> @@ -1412,11 +1406,17 @@ int cmd_rebase(int argc, const char **argv,
> const char *prefix)
>           if (options.empty != EMPTY_UNSPECIFIED)
>                   imply_merge(&options, "--empty");
>
> -        /*
> -         * --keep-base implements --reapply-cherry-picks by altering
> upstream so
> -         * it works with both backends.
> -         */
> -        if (options.reapply_cherry_picks && !keep_base)
> +        if (options.reapply_cherry_picks < 0)
> +                /*
> +                 * --keep-base defaults to --reapply-cherry-picks to
> +                 * avoid losing commits when using this option.
> +                 */

I know you were copying the previous comment, but this comment is
really confusing to me.  Shouldn't it read "--reapply-cherry-picks
defaults to --keep-base" instead of vice-versa?

> +                options.reapply_cherry_picks = keep_base;
> +        else if (!keep_base)
> +                /*
> +                 * --keep-base implements --reapply-cherry-picks by

Should this be --[no-]reapply-cherry-picks, to clarify that it handles
both cases?  Especially given how many times I missed it?

> +                 * altering upstream so it works with both backends.
> +                 */
>                   imply_merge(&options, "--reapply-cherry-picks");

And perhaps this should be

    imply_merge(&options, options.reapply_cherry_picks ?
"--reapply-cherry-picks" :
         "--no-reapply-cherry-picks");

Also, the comment in git-rebase.txt about incompatibilities would become

     * --[no-]reapply-cherry-picks, when --keep-base is not provided

^ permalink raw reply

* git not allowing 744 as permissions for a file
From: Auriane Reverdell @ 2023-01-24 15:48 UTC (permalink / raw)
  To: git

Hi,

git doesn't allow to add the execution permission on a file only for
the user. A chmod 744 on a file will transform into 755 when added to
git. This can potentially lead to security problems on certain
systems. Is there a way to fix that? I'll be happy to do so if
somebody shows me where to do it.

Auriane R.

^ permalink raw reply

* Re: GSoC 2023
From: Christian Couder @ 2023-01-24 16:38 UTC (permalink / raw)
  To: Kaartic Sivaraam; +Cc: git, Derrick Stolee, Victoria Dye, Hariom verma
In-Reply-To: <d8ce0159-c9dc-25c2-4180-70518bb31bfc@gmail.com>

Hi,

On Mon, Jan 23, 2023 at 6:41 PM Kaartic Sivaraam
<kaartic.sivaraam@gmail.com> wrote:
> On 17/01/23 15:04, Christian Couder wrote:
> >
> > GSoC Org Applications open next week on Monday, January 23rd at 1800
> > UTC and close on Tuesday, February 7th at 1800 UTC.
> >
> > I am interested in mentoring and being an org admin for Git again this
> > year, so I plan to apply for Git soon.
>
> I would be glad to help as an Org admin this year. I don't suppose I
> have the capacity to mentor / co-mentor this year, though. I could very
> well help as much as I can passively where needed.

You are welcome to be an Org Admin, thanks!

Actually you were already an Org Admin last year and it looks like
they didn't remove people from the roles they had last year, so you
are still an Org Admin.

> One thing to note about the Org application. As per Google's claim we
> should be able to complete this year's application quickly since the new
> webapp allows us to reuse last year's application details.

Yeah, I just did it and reusing last year's application details was
quite quick. One needs to read, agree on and accept 3 long documents,
and they still ask a few questions about last year's GSoC though.

They also want an URL with an idea list, so I quickly copied and
edited last year's idea list into this one:

https://git.github.io/SoC-2023-Ideas/

I removed the "Reachability bitmap improvements" idea but left the 2 others:

  - More Sparse Index Integrations (I removed `git mv` in the list of
commands that need to be improved though)
  - Unify ref-filter formats with other pretty formats

On both of them I removed all possible mentors except me though. They
are welcome to tell me that they should be added back.

> We certainly are in need of ideas and mentors for this year. Do chime in
> with your thoughts. :-)

Yeah, sure.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox