Git development
 help / color / mirror / Atom feed
* [PATCH 1/3] diff --no-index: die on error reading stdin
From: Phillip Wood @ 2023-06-27 14:10 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood
In-Reply-To: <cover.1687874975.git.phillip.wood@dunelm.org.uk>

From: Phillip Wood <phillip.wood@dunelm.org.uk>

If there is an error when reading from stdin then "diff --no-index"
prints an error message but continues to try and diff a file named "-"
resulting in an error message that looks like

    error: error while reading from stdin: Invalid argument
    fatal: stat '-': No such file or directory

assuming that no file named "-" exists. If such a file exists it prints
the first error message and generates the diff from that file which is
not what the user wanted. Instead just die() straight away if we cannot
read from stdin.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 diff-no-index.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 4296940f90..7b9327f8f3 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -60,20 +60,19 @@ static int get_mode(const char *path, int *mode)
 	return 0;
 }
 
-static int populate_from_stdin(struct diff_filespec *s)
+static void populate_from_stdin(struct diff_filespec *s)
 {
 	struct strbuf buf = STRBUF_INIT;
 	size_t size = 0;
 
 	if (strbuf_read(&buf, 0, 0) < 0)
-		return error_errno("error while reading from stdin");
+		die_errno("error while reading from stdin");
 
 	s->should_munmap = 0;
 	s->data = strbuf_detach(&buf, &size);
 	s->size = size;
 	s->should_free = 1;
 	s->is_stdin = 1;
-	return 0;
 }
 
 static struct diff_filespec *noindex_filespec(const char *name, int mode)
-- 
2.40.1.852.g22d29fd9ba


^ permalink raw reply related

* [PATCH 0/3] diff --no-index: support reading from named pipes
From: Phillip Wood @ 2023-06-27 14:10 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In some shells, such as bash and zsh, it's possible to use a command
substitution to provide the output of a command as a file argument to
another process, like so:

  diff -u <(printf "a\nb\n") <(printf "a\nc\n")

However, this syntax does not produce useful results with git diff
--no-index. On macOS, the arguments to the command are named pipes
under /dev/fd, and git diff doesn't know how to handle a named
pipe. On Linux, the arguments are symlinks to pipes, so git diff
"helpfully" diffs these symlinks, comparing their targets like
"pipe:[1234]" and "pipe:[5678]".

There have been at least three previous attempts [1-3] to address this
issue. They all seem to have received broad support for the aim of
supporting process substitutions but have foundered on details of the
implementation. In an effort to avoid the same fate this series is
narrowly focussed on making command substitutions work with "diff
--no-index" and does not attempt to add a general facility for
de-referencing symbolic links or reading from pipes to the diff
machinery.

The only functional change is that if a path given on the commandline
is a named pipe or a symbolic link that resolves to a named pipe then
we read the data to diff from that pipe. The first two patches improve
the error handling when reading from stdin and add a test. The third
patch implements support for reading from pipes.

This cover letter and the commit message for the third patch are
largely copied from brian’s patch[2] - do we have a standard commit
trailer for "I stole the commit message from someone else's patch"?

I've cc’d the participants of the discussion of the last attempt[1] to
fix this.

[1] https://lore.kernel.org/git/20200918113256.8699-3-tguyot@gmail.com/
[2] https://lore.kernel.org/git/20181220002610.43832-1-sandals@crustytoothpaste.net/
[3] https://public-inbox.org/git/20170113102021.6054-1-dennis@kaarsemaker.net/


Base-Commit: 94486b6763c29144c60932829a65fec0597e17b3
Published-As: https://github.com/phillipwood/git/releases/tag/diff-no-index-pipes%2Fv1
View-Changes-At: https://github.com/phillipwood/git/compare/94486b676...990e71882
Fetch-It-Via: git fetch https://github.com/phillipwood/git diff-no-index-pipes/v1

Phillip Wood (3):
  diff --no-index: die on error reading stdin
  t4054: test diff --no-index with stdin
  diff --no-index: support reading from named pipes

 diff-no-index.c          | 83 ++++++++++++++++++++++++----------------
 t/t4053-diff-no-index.sh | 43 +++++++++++++++++++++
 2 files changed, 92 insertions(+), 34 deletions(-)

-- 
2.40.1.852.g22d29fd9ba


^ permalink raw reply

* Re: [PATCH] ll-merge: killing the external merge driver aborts the merge
From: Joshua Hudson @ 2023-06-27 13:43 UTC (permalink / raw)
  To: Johannes Schindelin, Junio C Hamano; +Cc: git, Elijah Newren
In-Reply-To: <59b7a582-be68-3f7b-a06f-3bd662582a1d@gmx.de>


On 6/27/2023 5:02 AM, Johannes Schindelin wrote:
> Hi,
>
>
> On Fri, 23 Jun 2023, Junio C Hamano wrote:
>
>> Junio C Hamano <gitster@pobox.com> writes:
>>
>>> Elijah Newren <newren@gmail.com> writes:
>>>
>>>> Reviewed-by: Elijah Newren <newren@gmail.com>
>>>
>>> Thanks for a quick review.
>> Unfortunately Windows does not seem to correctly detect the aborting
>> merge driver.  Does run_command() there report process death due to
>> signals differently, I wonder?
>>
>> https://github.com/git/git/actions/runs/5360400800/jobs/9725341775#step:6:285
>>
>> shows that on Windows, aborted external merge driver is not noticed
>> and we happily take the auto-merged result, ouch.
> Hmm. I tried to verify this, but failed. With this patch:
>
> ```diff
> diff --git a/git.c b/git.c
> index 2f42da20f4e0..3c513e3f2cb1 100644
> --- a/git.c
> +++ b/git.c
> @@ -330,6 +330,8 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
>   			setenv(GIT_ATTR_SOURCE_ENVIRONMENT, cmd, 1);
>   			if (envchanged)
>   				*envchanged = 1;
> +		} else if (!strcmp(cmd, "--abort")) {
> +			abort();
>   		} else {
>   			fprintf(stderr, _("unknown option: %s\n"), cmd);
>   			usage(git_usage_string);
> ```
>
> I get this:
>
>
> ```console
> $ ./git.exe --abort
>
> $ echo $?
> 3
> ```
>
> For that reason, I am somehow doubtful that the `abort()` is actually
> called?!?
>
> Ciao,
> Johannes

abort(); does _exit(3); on Windows because there are no signals. This is 
easily changed by providing abort like so:

void abort() { _exit(131 /* or whatever else you think goes here */); }


^ permalink raw reply

* [RFC PATCH v2] khash_test.c : add unit tests
From: Siddharth Singh @ 2023-06-27 13:20 UTC (permalink / raw)
  To: git
In-Reply-To: <20230531155142.3359886-1-siddhartth@google.com>

Signed-off-by: Siddharth Singh <siddhartth@google.com>
---
Since v1
- rewrote the code keeping coding style guidelines in mind.
- refactored tests to improve their implementation..
- added a few more tests that cause collisions in the hashmap.
In the v1 patch, there were concerns that new tests were unnecessary because of upstream tests. However, I believe it would be beneficial to have tests based on the khash implementation in the git codebase because the existing tests[1] for khash do not use the same code for khash as the git codebase. 
E.g. in khash.h file of “attractivechaos/klib/khash.h” [2]. There's an implementation for `KHASH_MAP_INIT_INT64` which isn’t defined in “git/git/khash.h”[3]. So, there’s a possibility that the khash.h in a different repository might move forward and end up being different from out implementation, so writing tests for “git/git/khash.h” would ensure that our tests are relevant to the actual usage of the library.
While my colleagues are currently investigating whether C Tap harness is the best way to test libified code, this RFC is intended to discuss the content of unit tests and what other tests would be useful for khash.h. I am unsure of what tests will be valuable in the future, so I would appreciate your thoughts on the matter. Many test cases are easier to construct in C TAP harness than in test-tool. For example, In C TAP harness, non-string values can be used directly in hash maps. However, in test-tool, non-string values must be passed as an argument to a shell executable, parsed into the correct type, and then operated on.
I'm also very new to writing unit tests in C and git codebase in general, so I'll appreciate constructive feedback and discussion..
With this RFC, I hope you can help me answer these questions.
What are the unit test cases to include in khash.h?
What are the other types of tests that would be useful for khash.h?
[1] https://github.com/attractivechaos/klib/blob/master/test/khash_test.c
[2] https://github.com/attractivechaos/klib/blob/master/khash.h
[3] https://github.com/git/git/blob/master/khash.h

 Makefile       |   2 +
 t/khash_test.c | 214 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 216 insertions(+)
 create mode 100644 t/khash_test.c

diff --git a/Makefile b/Makefile
index 660c72d72e..d3ad2fa23c 100644
--- a/Makefile
+++ b/Makefile
@@ -3847,11 +3847,13 @@ $(UNIT_TEST_RUNNER): t/runtests.c
 
 UNIT_TEST_PROGS += t/unit-tests/unit-test-t
 UNIT_TEST_PROGS += t/unit-tests/strbuf-t
+UNIT_TEST_PROGS += t/unit-tests/khash-t
 
 $(UNIT_TEST_PROGS): $(CTAP_OBJS) $(LIBS)
 	$(QUIET)mkdir -p t/unit-tests
 	$(QUIET_CC)$(CC) -It -o t/unit-tests/unit-test-t t/unit-test.c $(CTAP_OBJS)
 	$(QUIET_CC)$(CC) -It -o t/unit-tests/strbuf-t t/strbuf-test.c -DSKIP_COMMON_MAIN common-main.c $(CTAP_OBJS) $(LIBS)
+	$(QUIET_CC)$(CC) -It -o t/unit-tests/khash-t t/khash_test.c -DSKIP_COMMON_MAIN common-main.c $(CTAP_OBJS) $(LIBS)
 
 .PHONY: unit-tests
 unit-tests: $(UNIT_TEST_PROGS) $(UNIT_TEST_RUNNER)
diff --git a/t/khash_test.c b/t/khash_test.c
new file mode 100644
index 0000000000..d1a43add13
--- /dev/null
+++ b/t/khash_test.c
@@ -0,0 +1,214 @@
+#include "../git-compat-util.h"
+#include "../khash.h"
+#include <tap/basic.h>
+
+/*
+  These tests are for base assumptions; if they fails, library is broken
+*/
+int hash_string_succeeds() {
+  const char *str = "test_string";
+  khint_t value = __ac_X31_hash_string(str);
+  return value == 0xf1a6305e;
+}
+
+int hash_string_with_non_alphabets_succeeds() {
+  const char *str = "test_string #2";
+  khint_t value = __ac_X31_hash_string(str);
+  return value == 0xfa970771;
+}
+
+KHASH_INIT(str, const char *, int *, 1, kh_str_hash_func, kh_str_hash_equal);
+
+int initialized_hashmap_pointer_not_null() {
+  kh_str_t *hashmap = kh_init_str();
+
+  if(hashmap != NULL){
+    free(hashmap);
+    return 1;
+  }
+  return 0;
+}
+
+int put_key_into_hashmap_once_succeeds() {
+  int ret, value;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  pos = kh_put_str(hashmap, "test_key", &ret);
+  if(!ret)
+    return 0;
+  return pos != 0;
+}
+
+int put_same_key_into_hashmap_twice_fails() {
+  int first_return_value, second_return_value, value;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  kh_put_str(hashmap, "test_key", &first_return_value);
+  kh_put_str(hashmap, "test_key", &second_return_value);
+  return !second_return_value;
+}
+
+int put_value_into_hashmap_once_succeeds() {
+  int ret, value = 14;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  pos = kh_put_str(hashmap, "test_key", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos) = xstrdup("test_key");
+  kh_value(hashmap, pos) = &value;
+  return *kh_value(hashmap, pos) == value;
+}
+
+int put_same_value_into_hashmap_twice_fails() {
+  int first_return_value, second_return_value, value = 14;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  pos = kh_put_str(hashmap, "test_key", &first_return_value);
+  if (!first_return_value)
+    return 0;
+  kh_key(hashmap, pos) = xstrdup("test_key");
+  kh_value(hashmap, pos) = &value;
+  kh_put_str(hashmap, "test_key", &second_return_value);
+  return !second_return_value;
+}
+
+int get_existing_kv_from_hashmap_succeeds() {
+  int ret, value =14;
+  int expected = value;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  pos = kh_put_str(hashmap, "test_key", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos) = xstrdup("test_key");
+  kh_value(hashmap, pos) = &value;
+  return *kh_value(hashmap, kh_get_str(hashmap, "test_key")) == expected;
+}
+
+int get_nonexisting_kv_from_hashmap_fails() {
+  int value = 14;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  return !kh_get_str(hashmap, "test_key");
+}
+
+int deletion_from_hashmap_sets_flag() {
+  int ret, value = 14;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  pos = kh_put_str(hashmap, "test_key", &ret);
+  if (!ret)
+    return 0;
+  if(!kh_exist(hashmap, pos))
+    return 0;
+  kh_key(hashmap, pos) = xstrdup("test_key");
+  kh_value(hashmap, pos) = &value;
+  kh_del_str(hashmap, pos);
+  return !kh_exist(hashmap, pos);
+}
+
+int deletion_from_hashmap_updates_size() {
+  int ret, value = 14, current_size;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  pos = kh_put_str(hashmap, "test_key", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos) = xstrdup("test_key");
+  kh_value(hashmap, pos) = &value;
+  current_size = hashmap->size;
+  kh_del_str(hashmap, pos);
+  return hashmap->size + 1 == current_size;
+}
+
+int deletion_from_hashmap_does_not_update_kv() {
+  int ret, value = 14, current_size;
+  khint_t pos;
+  kh_str_t *hashmap = kh_init_str();
+  pos = kh_put_str(hashmap, "test_key", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos) = xstrdup("test_key");
+  kh_value(hashmap, pos) = &value;
+  kh_del_str(hashmap, pos);
+  return !strcmp(kh_key(hashmap, pos),"test_key");
+}
+
+int update_value_after_deletion_succeeds() {
+  int ret, value1 = 14, value2 = 15;
+  khint_t pos1, pos2;
+  kh_str_t *hashmap = kh_init_str();
+  // adding the kv to the hashmap
+  pos1 = kh_put_str(hashmap, "test_key", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos1) = xstrdup("test_key");
+  kh_value(hashmap, pos1) = &value1;
+  // delete the kv from the hashmap
+  kh_del_str(hashmap, pos1);
+  // adding the same key with different value
+  pos2 = kh_put_str(hashmap, "test_key", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos2) = xstrdup("test_key");
+  kh_value(hashmap, pos2) = &value2;
+  // check if the value is different
+  return *kh_value(hashmap, kh_get_str(hashmap, "test_key")) == value2;
+}
+
+int hashmap_size_matches_number_of_added_elements() {
+  int ret, value1 = 14, value2 = 15, value3 = 16;
+  khint_t pos1, pos2, pos3;
+  kh_str_t *hashmap = kh_init_str();
+  pos1 = kh_put_str(hashmap, "test_key1", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos1) = xstrdup("test_key1");
+  kh_value(hashmap, pos1) = &value1;
+  pos2 = kh_put_str(hashmap, "test_key2", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos2) = xstrdup("test_key2");
+  kh_value(hashmap, pos2) = &value2;
+  pos3 = kh_put_str(hashmap, "test_key3", &ret);
+  if (!ret)
+    return 0;
+  kh_key(hashmap, pos3) = xstrdup("test_key3");
+  kh_value(hashmap, pos3) = &value3;
+  return kh_size(hashmap) == 3;
+}
+
+int main(void) {
+  plan(14);
+
+  ok(hash_string_succeeds(), "X31 hash value of a string is correct");
+  ok(hash_string_with_non_alphabets_succeeds(),
+     "Get X31 hash value for a string with non-alphabets in it.");
+  ok(initialized_hashmap_pointer_not_null(),
+     "Successfully initialize a hashmap and it's not NULL");
+  ok(put_key_into_hashmap_once_succeeds(),
+     "Put a key into hashmap that doesn't exist.");
+  ok(put_same_key_into_hashmap_twice_fails(),
+     "Put a key into hashmap twice that causes collision.");
+  ok(put_value_into_hashmap_once_succeeds(),
+     "Put a unique key-value pair into hashmap.");
+  ok(put_same_value_into_hashmap_twice_fails(),
+     "Put a key-value pair into hashmap twice that causes collision.");
+  ok(get_existing_kv_from_hashmap_succeeds(),
+     "Get the value from a key that exists in hashmap");
+  ok(get_nonexisting_kv_from_hashmap_fails(),
+     "Get the value from a key that doesn't exist in hashmap");
+  ok(deletion_from_hashmap_sets_flag(),
+     "Delete the key-value from hashmap and the flag is set to false");
+  ok(deletion_from_hashmap_updates_size(),
+     "Delete the key-value from hashmap and the size changes");
+  ok(deletion_from_hashmap_does_not_update_kv(),
+     "Delete the key-value from hashmap but it can be fetched using iterator");
+  ok(update_value_after_deletion_succeeds(),
+     "Updates the value of a key after deleting it");
+  ok(hashmap_size_matches_number_of_added_elements(),
+     "size of hashmap is correct after adding 3 elements");
+
+  return 0;
+}
-- 
2.40.GIT


^ permalink raw reply related

* Re: [PATCH] fix cherry-pick/revert status when doing multiple commits
From: Phillip Wood @ 2023-06-27 13:13 UTC (permalink / raw)
  To: Jacob Keller, git, Phillip Wood; +Cc: Jacob Keller
In-Reply-To: <20230621220754.126704-1-jacob.e.keller@intel.com>

Hi Jacob

On 21/06/2023 23:07, Jacob Keller wrote:
> From: Jacob Keller <jacob.keller@gmail.com>
> 
> The status report for an in-progress cherry-pick does not show the
> current commit if the cherry-pick happens as part of a series of
> multiple commits:
> 
>   $ git cherry-pick <commit1> <commit2>
>   < one of the cherry-picks fails to merge clean >
>   Cherry-pick currently in progress.
>    (run "git cherry-pick --continue" to continue)
>    (use "git cherry-pick --skip" to skip this patch)
>    (use "git cherry-pick --abort" to cancel the cherry-pick operation)
> 
>   $ git status
>   On branch <branch>
>   Your branch is ahead of '<upstream>' by 1 commit.
>     (use "git push" to publish your local commits)
> 
>   Cherry-pick currently in progress.
>     (run "git cherry-pick --continue" to continue)
>     (use "git cherry-pick --skip" to skip this patch)
>     (use "git cherry-pick --abort" to cancel the cherry-pick operation)
> 
> The show_cherry_pick_in_progress() function prints "Cherry-pick
> currently in progress". That function does have a more verbose print
> based on whether the cherry_pick_head_oid is null or not. If it is not
> null, then a more helpful message including which commit is actually
> being picked is displayed.
> 
> The introduction of the "Cherry-pick currently in progress" message
> comes from 4a72486de97b ("fix cherry-pick/revert status after commit",
> 2019-04-17). This commit modified wt_status_get_state() in order to
> detect that a cherry-pick was in progress even if the user has used `git
> commit` in the middle of the sequence.
> 
> The check used to detect this is the call to sequencer_get_last_command.
> If the sequencer indicates that the lass command was a REPLAY_PICK, then
> the state->cherry_pick_in_progress is set to 1 and the
> cherry_pick_head_oid is initialized to the null_oid. Similar behavior is
> done for the case of REPLAY_REVERT.
> 
> It happens that this call of sequencer_get_last_command will always
> report the action even if the user hasn't interrupted anything. Thus,
> during a range of cherry-picks or reverts, the cherry_pick_head_oid and
> revert_head_oid will always be overwritten and initialized to the null
> oid.
> 
> This results in status always displaying the terse message which does
> not include commit information.
> 
> Fix this by adding an additional check so that we do not re-initialize
> the cherry_pick_head_oid or revert_head_oid if we have already set the
> cherry_pick_in_progress or revert_in_progress bits. This ensures that
> git status will display the more helpful information when its available.
> Add a test case covering this behavior.

Thanks for the detailed explanation, I agree with your diagnosis and 
fix. The test case you mention seems to be missing though. I've left one 
small comment below

> Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
> ---
>   wt-status.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/wt-status.c b/wt-status.c
> index 068b76ef6d96..1e2daca73024 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -1790,10 +1790,11 @@ void wt_status_get_state(struct repository *r,
>   		oidcpy(&state->revert_head_oid, &oid);
>   	}
>   	if (!sequencer_get_last_command(r, &action)) {
> -		if (action == REPLAY_PICK) {
> +		if (action == REPLAY_PICK && !state->cherry_pick_in_progress) {
>   			state->cherry_pick_in_progress = 1;
>   			oidcpy(&state->cherry_pick_head_oid, null_oid());
> -		} else {
> +		}
> +		if (action == REPLAY_REVERT && !state->revert_in_progress) {

I think this would be clearer as

-		} else {
+		} else if ((action == REPLAY_PICK && !state->cherry_pick_in_progress) {

not worth a re-roll on its own, but I think it is worth considering when 
you re-roll with the missing test.

Thanks for working on this,

Phillip

>   			state->revert_in_progress = 1;
>   			oidcpy(&state->revert_head_oid, null_oid());
>   		}


^ permalink raw reply

* Re: [PATCH] ll-merge: killing the external merge driver aborts the merge
From: Johannes Schindelin @ 2023-06-27 12:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Elijah Newren, Joshua Hudson
In-Reply-To: <xmqqjzvt92nw.fsf@gitster.g>

Hi,


On Fri, 23 Jun 2023, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
> > Elijah Newren <newren@gmail.com> writes:
> >
> >> Reviewed-by: Elijah Newren <newren@gmail.com>
> >
> >
> > Thanks for a quick review.
>
> Unfortunately Windows does not seem to correctly detect the aborting
> merge driver.  Does run_command() there report process death due to
> signals differently, I wonder?
>
> https://github.com/git/git/actions/runs/5360400800/jobs/9725341775#step:6:285
>
> shows that on Windows, aborted external merge driver is not noticed
> and we happily take the auto-merged result, ouch.

Hmm. I tried to verify this, but failed. With this patch:

```diff
diff --git a/git.c b/git.c
index 2f42da20f4e0..3c513e3f2cb1 100644
--- a/git.c
+++ b/git.c
@@ -330,6 +330,8 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 			setenv(GIT_ATTR_SOURCE_ENVIRONMENT, cmd, 1);
 			if (envchanged)
 				*envchanged = 1;
+		} else if (!strcmp(cmd, "--abort")) {
+			abort();
 		} else {
 			fprintf(stderr, _("unknown option: %s\n"), cmd);
 			usage(git_usage_string);
```

I get this:


```console
$ ./git.exe --abort

$ echo $?
3
```

For that reason, I am somehow doubtful that the `abort()` is actually
called?!?

Ciao,
Johannes

^ permalink raw reply related

* Re: [PATCH v1 0/4] Remove obsolete Cygwin support from git-gui
From: Johannes Schindelin @ 2023-06-27 11:51 UTC (permalink / raw)
  To: Mark Levedahl; +Cc: mdl123, git, adam, me, gitster, sunshine
In-Reply-To: <20230626165305.37488-1-mlevedahl@gmail.com>

Hi Mark,

On Mon, 26 Jun 2023, Mark Levedahl wrote:

> === This is an update, incorporating responses to Junio's and Eric's
> comments:
>   -- clarified what the "upstream" git-gui branch is
>   -- Removed some changes from patch 2 as requested by Junio, reducing
>      changes in patch 3 and patch 4
>        All code is fixed only after applying patch 4
>        Differences in patch 3 and 4 are minimimized
>    -- updated comments to clarify G4w dedicated code.
>    -- updated all comments to (hopefully) clarify points of confusion
> ===

And here is the range-diff:

    1:  00000000000 ! 1:  00000000000     git gui Makefile - remove Cygwin modifications
        @@ Metadata
         Author: Mark Levedahl <mlevedahl@gmail.com>

          ## Commit message ##
        -    git gui Makefile - remove Cygwin modiifications
        +    git gui Makefile - remove Cygwin modifications

             git-gui's Makefile hardcodes the absolute Windows path of git-gui's libraries
             into git-gui, destroying the ability to package git-gui on one machine and
        @@ Commit message
             since 2012. Also, Cygwin does not support a non-Cygwin Tcl/Tk.

             The Cygwin git maintainer disables this code, so this code is definitely
        -    not in use in the Cygwin distribution, and targets an untested /
        -    unsupportable configuration.
        +    not in use in the Cygwin distribution.

        -    The simplest approach is to just delete the Cygwin specific code as
        -    stock Cygwin needs no special handling. Do so.
        +    The simplest fix is to just delete the Cygwin specific code,
        +    allowing the Makefile to work out of the box on Cygwin. Do so.

             Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>

    2:  00000000000 ! 2:  00000000000     git-gui - remove obsolete Cygwin specific code
        @@ Commit message
             8.4.1 Windows Tcl/Tk code.  In March, 2012, that 8.4.1 package was
             replaced with a full port based upon the upstream unix/X11 code,
             since maintained up to date. The two Tcl/Tk packages are completely
        -    incompatible, and have different sygnatures.
        +    incompatible, and have different signatures.

             When Cygwin's Tcl/Tk signature changed in 2012, git-gui no longer
             detected Cygwin, so did not enable Cygwin specific code, and the POSIX
        @@ Commit message
             unix. Thus, no-one apparently noticed the existence of incompatible
             Cygwin specific code.

        -    However, since commit c5766eae6f2b002396b6cd4f85b62317b707174e in
        -    upstream git-gui, the is_Cygwin funcion does detect current Cygwin.  The
        -    Cygwin specific code is enabled, causing use of Windows rather than unix
        -    pathnames, and enabling incorrect warnings about environment variables
        -    that are not relevant for the fully functional unix/X11 Tcl/Tk. The end
        -    result is that git-gui is now incommpatible with Cygwin.
        +    However, since commit c5766eae6f in the git-gui source tree
        +    (https://github.com/prati0100/git-gui, master at a5005ded), and not yet
        +    pulled into the git repository, the is_Cygwin function does detect
        +    Cygwin using the unix/X11 Tcl/Tk.  The Cygwin specific code is enabled,
        +    causing use of Windows rather than unix pathnames, and enabling
        +    incorrect warnings about environment variables that were relevant only
        +    to the old Tcl/Tk.  The end result is that (upstream) git-gui is now
        +    incompatible with Cygwin.

        -    So, delete all Cygwin specific code (code protected by "if is_Cygwin"),
        -    thus restoring the post-2012 behaviour. Note that Cygwin specific code
        -    is required to enable file browsing and shortcut creation (supported
        -    before 2012), but is not addressed in this patch.
        +    So, delete Cygwin specific code (code protected by "if is_Cygwin") that
        +    is not needed in any form to work with the unix/X11 Tcl/Tk.
        +
        +    Cygwin specific code required to enable file browsing and shortcut
        +    creation is not addressed in this patch, does not currently work, and
        +    invocation of those items may leave git-gui in a confused state.

             Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>

        @@ git-gui.sh: proc rescan {after {honor_trustmtime 1}} {
          }

          proc rescan_stage2 {fd after} {
        -@@ git-gui.sh: proc do_git_gui {} {
        -
        - # Get the system-specific explorer app/command.
        - proc get_explorer {} {
        --	if {[is_Cygwin] || [is_Windows]} {
        -+	if {[is_Windows]} {
        - 		set explorer "explorer.exe"
        - 	} elseif {[is_MacOSX]} {
        - 		set explorer "open"
        -@@ git-gui.sh: if {[is_enabled multicommit]} {
        -
        - 	.mbar.repository add separator
        -
        --	if {[is_Cygwin]} {
        --		.mbar.repository add command \
        --			-label [mc "Create Desktop Icon"] \
        --			-command do_cygwin_shortcut
        --	} elseif {[is_Windows]} {
        -+	if {[is_Windows]} {
        - 		.mbar.repository add command \
        - 			-label [mc "Create Desktop Icon"] \
        - 			-command do_windows_shortcut
         @@ git-gui.sh: if {[is_MacOSX]} {
          set doc_path [githtmldir]
          if {$doc_path ne {}} {
        @@ lib/choose_repository.tcl: method _do_clone2 {} {
          				}
          				close $f_in
          				close $f_cp
        -
        - ## lib/shortcut.tcl ##
        -@@ lib/shortcut.tcl: proc do_windows_shortcut {} {
        - 	}
        - }
        -
        --proc do_cygwin_shortcut {} {
        --	global argv0 _gitworktree
        --
        --	if {[catch {
        --		set desktop [exec cygpath \
        --			--windows \
        --			--absolute \
        --			--long-name \
        --			--desktop]
        --		}]} {
        --			set desktop .
        --	}
        --	set fn [tk_getSaveFile \
        --		-parent . \
        --		-title [mc "%s (%s): Create Desktop Icon" [appname] [reponame]] \
        --		-initialdir $desktop \
        --		-initialfile "Git [reponame].lnk"]
        --	if {$fn != {}} {
        --		if {[file extension $fn] ne {.lnk}} {
        --			set fn ${fn}.lnk
        --		}
        --		if {[catch {
        --				set sh [exec cygpath \
        --					--windows \
        --					--absolute \
        --					/bin/sh.exe]
        --				set me [exec cygpath \
        --					--unix \
        --					--absolute \
        --					$argv0]
        --				win32_create_lnk $fn [list \
        --					$sh -c \
        --					"CHERE_INVOKING=1 source /etc/profile;[sq $me] &" \
        --					] \
        --					[file normalize $_gitworktree]
        --			} err]} {
        --			error_popup [strcat [mc "Cannot write shortcut:"] "\n\n$err"]
        --		}
        --	}
        --}
        --
        - proc do_macosx_app {} {
        - 	global argv0 env
        -
    3:  00000000000 ! 3:  00000000000     git-gui - use cygstart to browse on Cygwin
        @@ Metadata
          ## Commit message ##
             git-gui - use cygstart to browse on Cygwin

        -    Pre-2012, git-gui enabled the "Repository->Explore Working Copy" menu on
        -    Cygwin, offering open a Windows graphical file browser at the root
        -    working directory. The old code relied upon internal use of Windows
        -    pathnames, while git-gui must use unix pathnames on Cygwin since 2012,
        -    so was removed in a previous patch.
        +    git-gui enables the "Repository->Explore Working Copy" menu on Cygwin,
        +    offering to open a Windows graphical file browser at the root of the
        +    working directory. This code, shared with Git For Windows support,
        +    depends upon use of Windows pathnames. However, git gui on Cygwin uses
        +    unix pathnames, so this shared code will not work on Cygwin.

             A base install of Cygwin provides the /bin/cygstart utility that runs
        -    arbtitrary Windows applications after translating unix pathnames to
        -    Windows.  Adding the --explore option guarantees that the Windows file
        -    explorer is opened, regardless of the supplied pathname's file type and
        -    avoiding possibility of some other action being taken.
        +    a registered Windows application based upon the file type, after
        +    translating unix pathnames to Windows.  Adding the --explore option
        +    guarantees that the Windows file explorer is opened, regardless of the
        +    supplied pathname's file type and avoiding possibility of some other
        +    action being taken.

             So, teach git-gui to use cygstart --explore on Cygwin, restoring the
        -    pre-2012 behavior of opening a Windows file explorer for browsing.
        +    pre-2012 behavior of opening a Windows file explorer for browsing. This
        +    separates the Git For Windows and Cygwin code paths. Note that
        +    is_Windows is never true on Cygwin, and is_Cygwin is never true on Git
        +    for Windows, though this is not obvious by examining the code for those
        +    independent functions.

             Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>

          ## git-gui.sh ##
         @@ git-gui.sh: proc do_git_gui {} {
        +
        + # Get the system-specific explorer app/command.
          proc get_explorer {} {
        - 	if {[is_Windows]} {
        - 		set explorer "explorer.exe"
        -+	} elseif {[is_Cygwin]} {
        +-	if {[is_Cygwin] || [is_Windows]} {
        ++	if {[is_Cygwin]} {
         +		set explorer "/bin/cygstart.exe --explore"
        ++	} elseif {[is_Windows]} {
        + 		set explorer "explorer.exe"
          	} elseif {[is_MacOSX]} {
          		set explorer "open"
        - 	} else {
    4:  00000000000 ! 4:  00000000000     git-gui - use mkshortcut on Cygwin
        @@ Metadata
          ## Commit message ##
             git-gui - use mkshortcut on Cygwin

        -    Prior to 2012, git-gui enabled the "Repository->Create Desktop Icon"
        -    item on Cygwin, offering to create a shortcut that starts git-gui on a
        -    particular repository. The original code for this in lib/win32.tcl,
        -    shared with Git for Windows support, requires Windows pathnames, while
        -    git-gui must use unix pathnames with the unix/X11 Tcl/Tk since 2012. The
        -    ability to use this from Cygwin was removed in a previous patch.
        +    git-gui enables the "Repository->Create Desktop Icon" item on Cygwin,
        +    offering to create a shortcut that starts git-gui on the current
        +    repository. The code in do_cygwin_shortcut invokes function
        +    win32_create_lnk to create the shortcut. This latter function is shared
        +    between Cygwin and Git For Windows and expects Windows rather than unix
        +    pathnames, though do_cygwin_shortcut provides unix pathnames. Also, this
        +    function tries to invoke the Windows Script Host to run a javascript
        +    snippet, but this fails under Cygwin's Tcl. So, win32_create_lnk just
        +    does not support Cygwin.

        -    Cygwin's default installation provides /bin/mkshortcut for creating
        -    desktop shortuts, this is compatible with exec under tcl, and understands
        -    Cygwin's unix pathnames. So, teach git-gui to use mkshortcut on Cygwin,
        -    leaving lib/win32.tcl as Git for Windows specific support.
        +    However, Cygwin's default installation provides /bin/mkshortcut for
        +    creating desktop shortcuts. This is compatible with exec under Cygwin's
        +    Tcl, understands Cygwin's unix pathnames, and avoids the need for shell
        +    escapes to encode troublesome paths. So, teach git-gui to use mkshortcut
        +    on Cygwin, leaving win32_create_lnk unchanged and for exclusive use by
        +    Git For Windows.

             Notes: "CHERE_INVOKING=1" is recognized by Cygwin's /etc/profile and
             prevents a "chdir $HOME", leaving the shell in the working directory
             specified by the shortcut. That directory is written directly by
             mkshortcut eliminating any problems with shell escapes and quoting.

        -    The pre-2012 code includes the full pathname of the git-gui creating the
        -    shortcut (rather than using the system git-gui), but that git-gui might
        -    not be compatible with the git found after /etc/profile sets the path,
        -    and might have a pathname that defies encoding using shell escapes that
        -    can survive the multiple incompatible interpreters involved in this
        -    chain. Instead, use "git gui", thus defaulting to the system git and
        +    The code being replaced includes the full pathname of the git-gui
        +    creating the shortcut, but that git-gui might not be compatible with the
        +    git found after /etc/profile sets the path, and might have a pathname
        +    that defies encoding using shell escapes that can survive the multiple
        +    incompatible interpreters involved in the chain of creating and using
        +    this shortcut.  The new code uses bare "git gui" as the command to
        +    execute, thus using the system git to launch the system git-gui, and
             avoiding both issues.

             Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>

        - ## git-gui.sh ##
        -@@ git-gui.sh: if {[is_enabled multicommit]} {
        - 		.mbar.repository add command \
        - 			-label [mc "Create Desktop Icon"] \
        - 			-command do_windows_shortcut
        -+	} elseif {[is_Cygwin]} {
        -+		.mbar.repository add command \
        -+			-label [mc "Create Desktop Icon"] \
        -+			-command do_cygwin_shortcut
        - 	} elseif {[is_MacOSX]} {
        - 		.mbar.repository add command \
        - 			-label [mc "Create Desktop Icon"] \
        -
          ## lib/shortcut.tcl ##
         @@ lib/shortcut.tcl: proc do_windows_shortcut {} {
        - 	}
          }

        -+proc do_cygwin_shortcut {} {
        + proc do_cygwin_shortcut {} {
        +-	global argv0 _gitworktree
         +	global argv0 _gitworktree oguilib
        -+
        -+	if {[catch {
        -+		set desktop [exec cygpath \
        -+			--desktop]
        -+		}]} {
        -+			set desktop .
        -+	}
        -+	set fn [tk_getSaveFile \
        -+		-parent . \
        -+		-title [mc "%s (%s): Create Desktop Icon" [appname] [reponame]] \
        -+		-initialdir $desktop \
        -+		-initialfile "Git [reponame].lnk"]
        -+	if {$fn != {}} {
        -+		if {[file extension $fn] ne {.lnk}} {
        -+			set fn ${fn}.lnk
        -+		}
        -+		if {[catch {
        +
        + 	if {[catch {
        + 		set desktop [exec cygpath \
        +-			--windows \
        +-			--absolute \
        +-			--long-name \
        + 			--desktop]
        + 		}]} {
        + 			set desktop .
        +@@ lib/shortcut.tcl: proc do_cygwin_shortcut {} {
        + 			set fn ${fn}.lnk
        + 		}
        + 		if {[catch {
        +-				set sh [exec cygpath \
        +-					--windows \
        +-					--absolute \
        +-					/bin/sh.exe]
        +-				set me [exec cygpath \
        +-					--unix \
        +-					--absolute \
        +-					$argv0]
        +-				win32_create_lnk $fn [list \
        +-					$sh -c \
        +-					"CHERE_INVOKING=1 source /etc/profile;[sq $me] &" \
        +-					] \
        +-					[file normalize $_gitworktree]
         +				set repodir [file normalize $_gitworktree]
         +				set shargs {-c \
         +					"CHERE_INVOKING=1 \
         +					source /etc/profile; \
         +					git gui"}
         +				exec /bin/mkshortcut.exe \
        -+					-a $shargs \
        -+					-d "git-gui on $repodir" \
        -+					-i $oguilib/git-gui.ico \
        -+					-n $fn \
        -+					-s min \
        -+					-w $repodir \
        ++					--arguments $shargs \
        ++					--desc "git-gui on $repodir" \
        ++					--icon $oguilib/git-gui.ico \
        ++					--name $fn \
        ++					--show min \
        ++					--workingdir $repodir \
         +					/bin/sh.exe
        -+			} err]} {
        -+			error_popup [strcat [mc "Cannot write shortcut:"] "\n\n$err"]
        -+		}
        -+	}
        -+}
        -+
        - proc do_macosx_app {} {
        - 	global argv0 env
        -
        + 			} err]} {
        + 			error_popup [strcat [mc "Cannot write shortcut:"] "\n\n$err"]
        + 		}

FWIW even v1 looked good to me (I don't care about typos as long as they
don't change meaning).

I tested the changes on top of Git for Windows and everything works as
expected (if you want to cross check my work, look no further than the
`git-gui-cygwin` branch at https://github.com/dscho/git).

If you want, please feel free to add

	Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>

Thank you!
Johannes

^ permalink raw reply

* Re: [PATCH 2/2] for-each-ref: add --count-matches option
From: Phillip Wood @ 2023-06-27 10:05 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: Junio C Hamano, git, vdye, me, mjcheetham, Derrick Stolee
In-Reply-To: <20230627073007.GD1226768@coredump.intra.peff.net>

On 27/06/2023 08:30, Jeff King wrote:
> On Mon, Jun 26, 2023 at 03:09:57PM +0000, Derrick Stolee via GitGitGadget wrote:
> 
>> +for pattern in "refs/heads/" "refs/tags/" "refs/remotes"
>> +do
>> +	test_perf "count $pattern: git for-each-ref | wc -l" "
>> +		git for-each-ref $pattern | wc -l
>> +	"
>> +
>> +	test_perf "count $pattern: git for-each-ref --count-match" "
>> +		git for-each-ref --count-matches $pattern
>> +	"
>> +done
> 
> I don't think this is a very realistic perf test, because for-each-ref
> is doing a bunch of work to generate its default format, only to have
> "wc" throw most of it away. Doing:
> 
>    git for-each-ref --format='%(refname)' | wc -l

That's a good point. I wondered if using a short fixed format string was 
even better so I tried

git init test
cd test
git commit --allow-empty -m initial
seq 0 100000 | sed "s:\(.*\):create refs/heads/some-prefix/\1 $(git 
rev-parse HEAD):" | git update-ref --stdin
git pack-refs --all
hyperfine -L fmt "","--format=%\(refname\)","--format=x" 'git 
for-each-ref {fmt} refs/heads/ | wc -l'

Which gives

Benchmark 1: git for-each-ref  refs/heads/ | wc -l
   Time (mean ± σ):      1.150 s ±  0.010 s    [User: 0.494 s, System: 
0.637 s]
   Range (min … max):    1.140 s …  1.170 s    10 runs

Benchmark 2: git for-each-ref --format=%\(refname\) refs/heads/ | wc -l
   Time (mean ± σ):      66.0 ms ±   0.3 ms    [User: 58.9 ms, System: 
9.5 ms]
   Range (min … max):    65.2 ms …  67.1 ms    43 runs

Benchmark 3: git for-each-ref --format=x refs/heads/ | wc -l
   Time (mean ± σ):      63.0 ms ±   0.5 ms    [User: 54.3 ms, System: 
9.6 ms]
   Range (min … max):    62.3 ms …  65.4 ms    44 runs

Summary
   git for-each-ref --format=x refs/heads/ | wc -l ran
     1.05 ± 0.01 times faster than git for-each-ref 
--format=%\(refname\) refs/heads/ | wc -l
    18.25 ± 0.20 times faster than git for-each-ref  refs/heads/ | wc -l

So on my somewhat slower machine the default format is over an order of 
magnitude slower than using either --format=%(refname) or --format=x and 
the short fixed format is marginally faster. I haven't applied stolee's 
patch but the 3 or 4 times improvement mentioned in the commit message 
seems likely to be from not processing the default format. One thing to 
note is that we're not comparing like-with-like when more than one 
pattern is given as --count-matches gives a separate count for each pattern.

I'm a bit suspicious of the massive speed up I'm seeing by avoiding the 
default format but it appears to be repeatable.

Best Wishes

Phillip

> is much better (obviously you have to remember to do that if you care
> about optimizing your command, but that's true of --count-matches, too).
> 
> Running hyperfine with three variants shows that the command above is
> competitive with --count-matches, though slightly slower (hyperfine
> complains about short commands and outliers because these runtimes are
> so tiny in the first place; I omitted those warnings from the output
> below for readability):
> 
>    Benchmark 1: ./git-for-each-ref refs/remotes/ | wc -l
>      Time (mean ± σ):       6.1 ms ±   0.2 ms    [User: 3.0 ms, System: 3.6 ms]
>      Range (min … max):     5.6 ms …   7.1 ms    397 runs
>    
>    Benchmark 2: ./git-for-each-ref --format="%(refname)" refs/remotes/ | wc -l
>      Time (mean ± σ):       3.3 ms ±   0.2 ms    [User: 2.2 ms, System: 1.5 ms]
>      Range (min … max):     3.0 ms …   4.0 ms    774 runs
>    
>    Benchmark 3: ./git-for-each-ref --count-matches refs/remotes/
>      Time (mean ± σ):       2.4 ms ±   0.1 ms    [User: 1.5 ms, System: 0.9 ms]
>      Range (min … max):     2.2 ms …   3.4 ms    1018 runs
>    
>    Summary
>      './git-for-each-ref --count-matches refs/remotes/' ran
>        1.33 ± 0.10 times faster than './git-for-each-ref --format="%(refname)" refs/remotes/ | wc -l'
>        2.48 ± 0.17 times faster than './git-for-each-ref refs/remotes/ | wc -l'
> 
> I will note this is an unloaded multi-core system, which gives the piped
> version a slight edge. Total CPU is probably more interesting than
> wall-clock time, but all of these are so short that I think the results
> should be taken with a pretty big grain of salt (I had to switch from
> the "powersave" to "performance" CPU governor just to get more
> consistent results).
> 
> -Peff

^ permalink raw reply

* Re: [PATCH 5/5] strbuf: simplify strbuf_expand_literal_cb()
From: Jeff King @ 2023-06-27  8:57 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List
In-Reply-To: <ded69969-158d-b05f-fdd4-91b26e9b502b@web.de>

On Sat, Jun 17, 2023 at 10:44:00PM +0200, René Scharfe wrote:

> Now that strbuf_expand_literal_cb() is no longer used as a callback,
> drop its "_cb" name suffix and unused context parameter.

Makes sense.

Since most callers just call "format += len", it kind of feels like the
appropriate interface might be more like:

  strbuf_expand_literal(&sb, &format);

to auto-advance the format. But I guess that gets weird with this
caller:

> @@ -1395,7 +1395,7 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
>  	char **slot;
> 
>  	/* these are independent of the commit */
> -	res = strbuf_expand_literal_cb(sb, placeholder, NULL);
> +	res = strbuf_expand_literal(sb, placeholder);
>  	if (res)
>  		return res;

which is still in the "return the length" mentality (OTOH, if it
advanced the local copy of the placeholder pointer nobody would mind).

I dunno. It is a little thing, and I am OK with it either way (I would
not even think of changing it if you were not already touching the
function).

-Peff

^ permalink raw reply

* Re: [PATCH 4/5] replace strbuf_expand() with strbuf_expand_step()
From: Jeff King @ 2023-06-27  8:54 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List
In-Reply-To: <150df0b0-0548-f291-2b68-960841dd1c97@web.de>

On Sat, Jun 17, 2023 at 10:43:17PM +0200, René Scharfe wrote:

> Avoid the overhead of passing context to a callback function of
> strbuf_expand() by using strbuf_expand_step() in a loop instead.  It
> requires explicit handling of %% and unrecognized placeholders, but is
> simpler, more direct and avoids void pointers.

I like this. I don't care that much about the runtime overhead of
passing the context around, but if you meant by "overhead" the
programmer time and confusion in stuffing everything into context
structs, then I agree this is much better. :)

It does still feel like we should be handling "%%" on behalf of the
callers.

>  builtin/cat-file.c |  35 +++++++--------
>  builtin/ls-files.c | 109 +++++++++++++++++++--------------------------
>  builtin/ls-tree.c  | 107 +++++++++++++++++---------------------------
>  daemon.c           |  61 ++++++++-----------------
>  pretty.c           |  72 ++++++++++++++++++------------
>  strbuf.c           |  20 ---------
>  strbuf.h           |  37 ++-------------
>  7 files changed, 169 insertions(+), 272 deletions(-)

The changes mostly looked OK to me (and the diffstat is certainly
pleasing). The old callbacks returned a "consumed" length, and we need
each "step" caller to now do "format += consumed" themselves. At first
glance I thought there were cases where you didn't, but then I realized
that you are relying on skip_prefix() to do that incrementing. Which
makes sense and the resulting code looks nice, but it took me a minute
to realize what was going on.

> @@ -1894,7 +1880,26 @@ void userformat_find_requirements(const char *fmt, struct userformat_want *w)
>  			return;
>  		fmt = user_format;
>  	}
> -	strbuf_expand(&dummy, fmt, userformat_want_item, w);
> +	while (strbuf_expand_step(&dummy, &fmt)) {
> +		if (skip_prefix(fmt, "%", &fmt))
> +			continue;
> +
> +		if (*fmt == '+' || *fmt == '-' || *fmt == ' ')
> +			fmt++;
> +
> +		switch (*fmt) {
> +		case 'N':
> +			w->notes = 1;
> +			break;
> +		case 'S':
> +			w->source = 1;
> +			break;
> +		case 'd':
> +		case 'D':
> +			w->decorate = 1;
> +			break;
> +		}
> +	}
>  	strbuf_release(&dummy);
>  }

This one actually doesn't increment the format (so we restart the
expansion on "N" or whatever). But neither did the original! It always
returned 0:

> -static size_t userformat_want_item(struct strbuf *sb UNUSED,
> -				   const char *placeholder,
> -				   void *context)
> -{
> -	struct userformat_want *w = context;
> -
> -	if (*placeholder == '+' || *placeholder == '-' || *placeholder == ' ')
> -		placeholder++;
> -
> -	switch (*placeholder) {
> -	case 'N':
> -		w->notes = 1;
> -		break;
> -	case 'S':
> -		w->source = 1;
> -		break;
> -	case 'd':
> -	case 'D':
> -		w->decorate = 1;
> -		break;
> -	}
> -	return 0;
> -}

So probably OK, though a little funny.

It also feels like this whole function would be just as happy using
"strchr()", since it throws away the expanded result anyway. But that
can be for another time. :)

> @@ -1912,7 +1917,16 @@ void repo_format_commit_message(struct repository *r,
>  	const char *output_enc = pretty_ctx->output_encoding;
>  	const char *utf8 = "UTF-8";
> 
> -	strbuf_expand(sb, format, format_commit_item, &context);
> +	while (strbuf_expand_step(sb, &format)) {
> +		size_t len;
> +
> +		if (skip_prefix(format, "%", &format))
> +			strbuf_addch(sb, '%');
> +		else if ((len = format_commit_item(sb, format, &context)))
> +			format += len;
> +		else
> +			strbuf_addch(sb, '%');
> +	}

This one doesn't advance the format for a not-understood placeholder.
But that's OK, because we know it isn't "%", so starting the search from
there again is correct.

> @@ -1842,7 +1852,7 @@ static size_t format_commit_item(struct strbuf *sb, /* in UTF-8 */
>  	}
> 
>  	orig_len = sb->len;
> -	if (((struct format_commit_context *)context)->flush_type != no_flush)
> +	if ((context)->flush_type != no_flush)
>  		consumed = format_and_pad_commit(sb, placeholder, context);
>  	else
>  		consumed = format_commit_one(sb, placeholder, context);

Since we're no longer casting, the extra parentheses seem redundant now.
I.e., this can just be context->flush_type.

-Peff

^ permalink raw reply

* Re: [PATCH 3/5] replace strbuf_expand_dict_cb() with strbuf_expand_step()
From: Jeff King @ 2023-06-27  8:35 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List
In-Reply-To: <20230627082902.GI1226768@coredump.intra.peff.net>

On Tue, Jun 27, 2023 at 04:29:02AM -0400, Jeff King wrote:

> Your comment above does make me wonder if strbuf_expand_step() should be
> quietly handling "%%" itself. But I guess strbuf_expand() doesn't, and
> your branch.c quote-literal example probably would not want that
> behavior.

Er, nope, strbuf_expand() does handle "%%" itself. It really feels like
we'd want strbuf_expand_step() to do so, too, then. Even if we had two
variants (a "raw" one which doesn't handle "%%" so that branch.c could
use it, and then another that wrapped it to handle "%%").

-Peff

^ permalink raw reply

* Re: [PATCH 3/5] replace strbuf_expand_dict_cb() with strbuf_expand_step()
From: Jeff King @ 2023-06-27  8:29 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List
In-Reply-To: <5ce9513b-d463-6f62-db4e-f9f60a7c3e9f@web.de>

On Sat, Jun 17, 2023 at 10:42:26PM +0200, René Scharfe wrote:

> Avoid the overhead of setting up a dictionary and passing it via
> strbuf_expand() to strbuf_expand_dict_cb() by using strbuf_expand_step()
> in a loop instead.  It requires explicit handling of %% and unrecognized
> placeholders, but is more direct and simpler overall, and expands only
> on demand.

Great. I think even replacing the dictionary with regular
strbuf_expand() would be an improvement in terms of the on-demand
loading. But I am happy to see it go all the way to the iterative
version. :)

Your comment above does make me wonder if strbuf_expand_step() should be
quietly handling "%%" itself. But I guess strbuf_expand() doesn't, and
your branch.c quote-literal example probably would not want that
behavior.

-Peff

^ permalink raw reply

* Re: [PATCH 2/5] strbuf: factor out strbuf_expand_step()
From: Jeff King @ 2023-06-27  8:26 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List
In-Reply-To: <caf8e100-1308-cb4e-d61a-4e15ee3f47f7@web.de>

On Sat, Jun 17, 2023 at 10:41:44PM +0200, René Scharfe wrote:

> Extract the part of strbuf_expand that finds the next placeholder into a
> new function.  It allows to build parsers without callback functions and
> the overhead imposed by them.
> 
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  builtin/branch.c | 13 ++-----------
>  strbuf.c         | 28 ++++++++++++++--------------
>  strbuf.h         |  8 ++++++++
>  3 files changed, 24 insertions(+), 25 deletions(-)

I was a little surprised to see the change in branch.c here (as well as
the one in strbuf_addftime), since the commit message just says
"extract..." and doesn't mention them.

They do serve as examples of how it can be used, so I think it's OK. But
it made me wonder: is this all of the sites? Or are these just examples?

-Peff

^ permalink raw reply

* Re: [PATCH v2] doc: gitcredentials: link to helper list
From: Jeff King @ 2023-06-27  8:21 UTC (permalink / raw)
  To: M Hickford via GitGitGadget
  Cc: git, msuchanek, sandals, lessleydennington, me, mjcheetham,
	M Hickford
In-Reply-To: <pull.1538.v2.git.1687332624780.gitgitgadget@gmail.com>

On Wed, Jun 21, 2023 at 07:30:24AM +0000, M Hickford via GitGitGadget wrote:

> From: M Hickford <mirth.hickford@gmail.com>
> 
> Link to community list of credential helpers. This is useful information
> for users.
> 
> Describe how OAuth credential helpers work. OAuth is a user-friendly
> alternative to personal access tokens and SSH keys. Reduced setup cost
> makes it easier for users to contribute to projects across multiple
> forges.

Kind of seems like two topics in one patch, but OK.

I don't have much of an opinion on either topic, but...

> diff --git a/Documentation/gitcredentials.txt b/Documentation/gitcredentials.txt
> index 100f045bb1a..a266870a042 100644
> --- a/Documentation/gitcredentials.txt
> +++ b/Documentation/gitcredentials.txt
> @@ -104,6 +104,18 @@ $ git help credential-foo
>  $ git config --global credential.helper foo
>  -------------------------------------------
>  
> +=== Available helpers
> +
> +The community maintains a comprehensive
> +https://git-scm.com/doc/credential-helpers[list of Git credential helpers]
> +available.

I'd note that full hyperlinks like this are kind of lousy in the manpage
builds. You get:

     Available helpers
	The community maintains a comprehensive list of Git credential
	helpers[1] available.

in the text, and then way down at the bottom of the manpage:

  NOTES
	1. list of Git credential helpers
	   https://git-scm.com/doc/credential-helpers

Something like:

diff --git a/Documentation/gitcredentials.txt b/Documentation/gitcredentials.txt
index fd5ecede13..1c7d302f18 100644
--- a/Documentation/gitcredentials.txt
+++ b/Documentation/gitcredentials.txt
@@ -106,9 +106,8 @@ $ git config --global credential.helper foo
 
 === Available helpers
 
-The community maintains a comprehensive
-https://git-scm.com/doc/credential-helpers[list of Git credential helpers]
-available.
+The community maintains a comprehensive list of Git credential helpers
+at https://git-scm.com/doc/credential-helpers.
 
 === OAuth
 

yields nicer text in the manpage, and asciidoc is smart enough to turn
it into a hyperlink in the html version.

-Peff

^ permalink raw reply related

* Re: Determining whether you have a commit locally, in a partial clone?
From: Jeff King @ 2023-06-27  8:09 UTC (permalink / raw)
  To: Tao Klerks; +Cc: git
In-Reply-To: <CAPMMpoha6rBA-T-7cn3DQT_nbNfknigLTky55x0TEmt4Ay2GRA@mail.gmail.com>

On Wed, Jun 21, 2023 at 12:10:33PM +0200, Tao Klerks wrote:

> > This is not very efficient, but:
> >
> >   git cat-file --batch-check='%(objectname)' --batch-all-objects --unordered |
> >   grep $some_sha1
> >
> > will tell you whether we have the object locally.
> >
> 
> Thanks so much for your help!
> 
> in Windows (msys or git bash) this is still very slow in my repo with
> 6,500,000 local objects - around 60s - but in linux on the same repo
> it's quite a lot faster, at 5s. A large proportion of my users are on
> Windows though, so I don't think this will be "good enough" for my
> purposes, when I often need to check for the existence of dozens or
> even hundreds of commits.

Yeah, it's just a lot of object names to print, most of which you don't
care about. :)

The more efficient thing would be to open the actual pack .idx files and
look for the names via binary search. I don't think you can convince git
to do that, though I suspect you could write a trivial libgit2 program
that does.

> > I don't work with partial clones often, but it feels like being able to
> > say:
> >
> >   git --no-partial-fetch cat-file ...
> >
> > would be a useful primitive to have.
> 
> It feels that way to me, yes!
> 
> On the other hand, I find very little demand for it when I search "the
> internet" - or I don't know how to search for it.

I think partial clones are still new enough that not many people are
using them heavily. And when they do, not managing the partial state at
a very advanced level; I think tools for pruning locally cached objects
(which you could refetch) is only just being worked on now.

> > It does seem like you might be able to bend it to
> > your will here, though. I think without any patches that:
> >
> >   git rev-list --objects --exclude-promisor-objects $oid
> >
> > will tell you whether we have the object or not (since it turns off
> > fetch_if_missing, and thus will either succeed, printing nothing, or
> > bail if the object can't be found).
> 
> This behaves in a way that I don't understand:
> 
> In the repo that I'm working in, this command runs successfully
> *without fetching*, but it takes a *very* long time - 300+ seconds -
> much longer than even the "inefficient" 'cat-file'-based printing of
> all (6.5M) local object ids that you proposed above. I haven't
> attempted to understand what's going on in there (besides running with
> GIT_TRACE2_PERF, which showed nothing interesting), but the idea that
> git would have to work super-hard to find an object by its ID seems
> counter to everything I know about it. Would there be value in my
> trying to understand & reproduce this in a shareable repo, or is there
> already an explanation as to why this command could/should ever do
> non-trivial work, even in the largest partial repos?

I think it's actually doing the gigantic traversal (and just limiting it
when it sees objects that are not available). You probably want
"--no-walk" at least, but really you don't even want to walk the trees
of any commits you specify (so you'd want to omit "--objects" if you are
asking about a commit, and otherwise include it, which is slightly
awkward).

> > It feels like --missing=error should
> > function similarly, but it seems to still lazy-fetch (I guess since it's
> > the default, the point is to just find truly unavailable objects). Using
> > --missing=print disables the lazy-fetch, but it seems to bail
> > immediately if you ask it about a missing object (I didn't dig, but my
> > guess is that --missing is mostly about objects we traverse, not the
> > initial tips).
> 
> Woah, "--missing=print" seems to work!!!
> 
> The following gives me the commit hash if I have it locally, and an
> error otherwise - consistently across linux and windows, git versions
> 2.41, 2.39, 2.38, and 2.36 - without fetching, and without crazy
> CPU-churning:
> 
> git rev-list --missing=print -1 $oid
> 
> Thank you thank you thank you!

Hmph, I thought I tried that before and it didn't work, but it seems to
work for me now. I guess I was hoping to have it print the missing
object rather than exiting with an error, but if you do one object at a
time then the error is sufficient signal. :)

You might want "--objects" if you're going to ask about non-commits.
Though it might not be necessary. I suspect Git would bail trying to
look up the object in the first place if we don't have it, and if we do
have it then it just becomes a silent noop.

> I feel like I should try to work something into the doc about this,
> but I'm not sure how to express this: "--missing=error is the default,
> but it doesn't actually error out when you're explicitly asking about
> a missing commit, it fetches it instead - but --missing=print actually
> *does* error out if you explicitly ask about a missing commit" seems
> like a strange thing to be saying.

I think we are relying on the side effect that everything except
--missing=error will turn off auto-fetching. I don't know if that's
something we'd want to document. It seems reasonable to me that we might
later change the implementation so that we kick in the --missing
behavior only after parsing the initial list of traversal tips (I mean,
I don't know why we would do that in particular, but it seems like the
kind of thing we'd want to reserve as an implementation detail subject
to change).

I do think in the long run that a big "--do-not-lazy-fetch" flag would
be the right solution to let the user tell us what they want.

> Thanks again for finding me an efficient working strategy here!

I'm glad it worked. I was mostly just thinking out loud. ;)

-Peff

^ permalink raw reply

* Re: [PATCH] repack: only repack .packs that exist
From: Jeff King @ 2023-06-27  7:54 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, gitster, me, Derrick Stolee
In-Reply-To: <pull.1546.git.1687287782439.gitgitgadget@gmail.com>

On Tue, Jun 20, 2023 at 07:03:02PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
> 
> In 73320e49add (builtin/repack.c: only collect fully-formed packs,
> 2023-06-07), we switched the check for which packs to collect by
> starting at the .idx files and looking for matching .pack files. This
> avoids trying to repack pack-files that have not had their pack-indexes
> installed yet.
> 
> However, it does cause maintenance to halt if we find the (problematic,
> but not insurmountable) case of a .idx file without a corresponding
> .pack file. In an environment where packfile maintenance is a critical
> function, such a hard stop is costly and requires human intervention to
> resolve (by deleting the .idx file).
> 
> This was not the case before. We successfully repacked through this
> scenario until the recent change to scan for .idx files.
> 
> Further, if we are actually in a case where objects are missing, we
> detect this at a different point during the reachability walk.

I'd worry less about missing objects here, and more that we could be in
a race in which git-repack sees some partial state and decides to take
some action that causes harm. For example, in a "git repack -ad", could
we later mistakenly delete a .idx file because we thought it had no
matching pack? The repack itself would be happy, but it might be causing
headaches for later processes.

I _think_ the answer is "no", because any file we skip via
collect_pack_filenames() won't make it into the list of either the
existing nonkept or kept packs, and we use "nonkept" list to do the
deletion.

So it seems like missing a pack here (whether racily or not) could at
most result in cruft being left in place, or a bad --geometric decision
being made, etc. Which makes the stakes reasonably low, I think.

> In other cases, Git prepares its list of packfiles by scanning .idx
> files and then only adds it to the packfile list if the corresponding
> .pack file exists. It even does so without a warning! (See
> add_packed_git() in packfile.c for details.)

Interesting. I'd have expected a warning. ;)

I also kind of wonder if this repack code should simply be loading and
iterating the packed_git list, but that is a much bigger change.

> This case is much less likely to occur than the failures seen before
> 73320e49add. Packfiles are "installed" by writing the .pack file before
> the .idx and that process can be interrupted. Packfiles _should_ be
> deleted by deleting the .idx first, followed by the .pack file, but
> unlink_pack_path() does not do this: it deletes the .pack _first_,
> allowing a window where this process could be interrupted. We leave the
> consideration of changing this order as a separate concern. Knowing that
> this condition is possible from interrupted Git processes and not other
> tools lends some weight that Git should be more flexible around this
> scenario.

Hmm. So it really sounds to me like unlink_pack_path() is doing the
wrong thing. And once fixed, we would (assuming no other similar spots)
cease to see any races with this code. At which point would we still
want the looseness provided by this patch?

I.e., is the concern concurrent races, or is it leftover cruft (possibly
from a killed process, crash, etc)?

If the latter, then we might still want to quietly skip them. But it
makes me wonder how such .idx files should get cleaned up in the longer
term.

I'm kind of guessing that what you've seen is actually leftover cruft,
if only because I wouldn't expect concurrent unlinking to be very common
(it implies two repacks running at the same time, which is generally a
bad idea). Whereas new packs being written would be common (via
index-pack, etc from pushes).

> Add a check to see if the .pack file exists before adding it to the list
> for repacking. This will stop a number of maintenance failures seen in
> production but fixed by deleting the .idx files.

OK. The fact that this brings things in line with add_packed_git() is
compelling to me, and I think the stakes are pretty low for this causing
us to do something confusing.

It does make me feel like we should be doing something similar to
git-prune's prune_cruft(): any non-.pack file in objects/pack with an
old mtime and no matching .pack file should be a candidate for deletion.
I don't _think_ we have anything like that now, but it would clean these
in the longer term. I'm OK with that being a separate topic, though.

-Peff

^ permalink raw reply

* Re: [PATCH 2/2] for-each-ref: add --count-matches option
From: Jeff King @ 2023-06-27  7:30 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Junio C Hamano, git, vdye, me, mjcheetham, Derrick Stolee
In-Reply-To: <9121e027fb9f157878a9624ce6c834b69cd38472.1687792197.git.gitgitgadget@gmail.com>

On Mon, Jun 26, 2023 at 03:09:57PM +0000, Derrick Stolee via GitGitGadget wrote:

> +for pattern in "refs/heads/" "refs/tags/" "refs/remotes"
> +do
> +	test_perf "count $pattern: git for-each-ref | wc -l" "
> +		git for-each-ref $pattern | wc -l
> +	"
> +
> +	test_perf "count $pattern: git for-each-ref --count-match" "
> +		git for-each-ref --count-matches $pattern
> +	"
> +done

I don't think this is a very realistic perf test, because for-each-ref
is doing a bunch of work to generate its default format, only to have
"wc" throw most of it away. Doing:

  git for-each-ref --format='%(refname)' | wc -l

is much better (obviously you have to remember to do that if you care
about optimizing your command, but that's true of --count-matches, too).

Running hyperfine with three variants shows that the command above is
competitive with --count-matches, though slightly slower (hyperfine
complains about short commands and outliers because these runtimes are
so tiny in the first place; I omitted those warnings from the output
below for readability):

  Benchmark 1: ./git-for-each-ref refs/remotes/ | wc -l
    Time (mean ± σ):       6.1 ms ±   0.2 ms    [User: 3.0 ms, System: 3.6 ms]
    Range (min … max):     5.6 ms …   7.1 ms    397 runs
  
  Benchmark 2: ./git-for-each-ref --format="%(refname)" refs/remotes/ | wc -l
    Time (mean ± σ):       3.3 ms ±   0.2 ms    [User: 2.2 ms, System: 1.5 ms]
    Range (min … max):     3.0 ms …   4.0 ms    774 runs
  
  Benchmark 3: ./git-for-each-ref --count-matches refs/remotes/
    Time (mean ± σ):       2.4 ms ±   0.1 ms    [User: 1.5 ms, System: 0.9 ms]
    Range (min … max):     2.2 ms …   3.4 ms    1018 runs
  
  Summary
    './git-for-each-ref --count-matches refs/remotes/' ran
      1.33 ± 0.10 times faster than './git-for-each-ref --format="%(refname)" refs/remotes/ | wc -l'
      2.48 ± 0.17 times faster than './git-for-each-ref refs/remotes/ | wc -l'

I will note this is an unloaded multi-core system, which gives the piped
version a slight edge. Total CPU is probably more interesting than
wall-clock time, but all of these are so short that I think the results
should be taken with a pretty big grain of salt (I had to switch from
the "powersave" to "performance" CPU governor just to get more
consistent results).

-Peff

^ permalink raw reply

* Re: [PATCH v2] t7701: make annotated tag unreachable
From: Jeff King @ 2023-06-27  7:09 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Junio C Hamano
In-Reply-To: <259b1b559114ab1a9a0bd7f1ad29a4cba2612ae0.1687617197.git.me@ttaylorr.com>

On Sat, Jun 24, 2023 at 10:33:47AM -0400, Taylor Blau wrote:

> In 4dc16e2cb0 (gc: introduce `gc.recentObjectsHook`, 2023-06-07), we
> added tests to ensure that prune-able (i.e. unreachable and with mtime
> older than the cutoff) objects which are marked as recent via the new
> `gc.recentObjectsHook` configuration are unpacked as loose with
> `--unpack-unreachable`.
> 
> In that test, we also ensure that objects which are reachable from other
> unreachable objects which were *not* pruned are kept as well, regardless
> of their mtimes. For this, we use an annotated tag pointing at a blob
> ($obj2) which would otherwise be pruned.
> 
> But after pruning, that object is kept around for two reasons. One, the
> tag object's mtime wasn't adjusted to be beyond the 1-hour cutoff, so it
> would be kept as due to its recency regardless. The other reason is
> because the tag itself is reachable.
> 
> Use mktag to write the tag object directly without pointing a reference
> at it, and adjust the mtime of the tag object to be older than the
> cutoff to ensure that our `gc.recentObjectsHook` configuration is
> working as intended.
> 
> Noticed-by: Jeff King <peff@peff.net>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> Fixes a trivial oversight from an earlier version of this patch noticed
> by Peff.

Thanks, this looks great to me.

-Peff

^ permalink raw reply

* Re: [PATCH v2 6/7] var: add attributes files locations
From: Jeff King @ 2023-06-27  7:05 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git, Junio C Hamano, Eric Sunshine, Derrick Stolee
In-Reply-To: <20230626190008.644769-7-sandals@crustytoothpaste.net>

On Mon, Jun 26, 2023 at 07:00:07PM +0000, brian m. carlson wrote:

> @@ -51,6 +52,26 @@ static char *shell_path(int flag)
>  	return xstrdup(SHELL_PATH);
>  }
>  
> +static char *git_attr_val_system(int flag)
> +{
> +	if (git_attr_system_is_enabled()) {
> +		char *file = xstrdup(git_attr_system_file());
> +		normalize_path_copy(file, file);
> +		return file;
> +	}
> +	return NULL;
> +}

These new ones would ideally mark the "flag" variable with the UNUSED
attribute (in preparation for building with -Wunused-parameter).

I can also come through later and fix them up in a separate patch. It's
slightly awkward, just because I was about to post a patch that fixed
the existing functions in that file, and I'd have to either rebase on
top, or make a second pass once this is merged.

That said, I also renamed the "flag" variable in my patch because it's
super confusing (see my patch below for reference). So adjusting your
new callers to match (without my changes) would be a little weird. The
least-weird thing would be sticking my patch at the front of your
series, but I don't want to make you do extra work.

So I dunno. I'm mostly giving a heads-up, and seeing if you (or other
reviewers in the thread) have an idea to make this "flag" thing less
awful. I'm also happy to just do my topic separately, and then
eventually circle back after yours is merged.

-- >8 --
Subject: [PATCH] var: mark unused parameters in git_var callbacks

We abstract the set of variables into a table, with a "read" callback to
provide the value of each. Each callback takes a "flag" argument, but
most callbacks don't make use of it.

This flag is a bit odd. It may be set to IDENT_STRICT, which make sense
for ident-based callbacks, but is just confusing for things like
GIT_EDITOR.

At first glance, it seems like this is just a hack to let us directly
stick the generic git_committer_info() and git_author_info() functions
into our table. And we'd be better off to wrap them with local functions
which pass IDENT_STRICT, and have our callbacks take no option at all.

But that doesn't quite work. We pass IDENT_STRICT when the caller asks
for a specific variable, but otherwise do not (so that "git var -l" does
not bail if the committer ident cannot be formed).

So we really do need to pass in the flag to each invocation, even if the
individual callback doesn't care about it. Let's mark the unused ones so
that -Wunused-parameter does not complain. And while we're here, let's
rename them so that it's clear that the flag values we get will be from
the IDENT_* set. That may prevent confusion for future readers of the
code.

Another option would be to define our own local "strict" flag for the
callbacks, and then have wrappers that translate that to IDENT_STRICT
where it matters. But that would be more boilerplate for little gain
(most functions would still ignore the "strict" flag anyway).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/var.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/var.c b/builtin/var.c
index 2149998980..10ee62f84c 100644
--- a/builtin/var.c
+++ b/builtin/var.c
@@ -12,17 +12,17 @@
 
 static const char var_usage[] = "git var (-l | <variable>)";
 
-static const char *editor(int flag)
+static const char *editor(int ident_flag UNUSED)
 {
 	return git_editor();
 }
 
-static const char *sequence_editor(int flag)
+static const char *sequence_editor(int ident_flag UNUSED)
 {
 	return git_sequence_editor();
 }
 
-static const char *pager(int flag)
+static const char *pager(int ident_flag UNUSED)
 {
 	const char *pgm = git_pager(1);
 
@@ -31,7 +31,7 @@ static const char *pager(int flag)
 	return pgm;
 }
 
-static const char *default_branch(int flag)
+static const char *default_branch(int ident_flag UNUSED)
 {
 	return git_default_branch_name(1);
 }
-- 
2.41.0.490.g2678ffb796


^ permalink raw reply related

* Re: Clean up stale .gitignore and .gitattribute patterns
From: Sebastian Schuberth @ 2023-06-27  6:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <20230627065103.GA1226768@coredump.intra.peff.net>

On Tue, Jun 27, 2023 at 8:51 AM Jeff King <peff@peff.net> wrote:

> Re-reading your email, though, I wonder if you meant something a little
> simpler, like:

Indeed, I was only having the "git mv" case in mind and to advise() at
the time of that command being run, instead of advise()'ing at "git
commit" time.

-- 
Sebastian Schuberth

^ permalink raw reply

* Re: Clean up stale .gitignore and .gitattribute patterns
From: Jeff King @ 2023-06-27  6:51 UTC (permalink / raw)
  To: Sebastian Schuberth; +Cc: Junio C Hamano, Git Mailing List
In-Reply-To: <CAHGBnuMjCsMetCJfhfDXb7aYttgUOc0WY+wJ_Q-tmoV4WES-pQ@mail.gmail.com>

On Mon, Jun 26, 2023 at 06:42:17PM +0200, Sebastian Schuberth wrote:

> On Mon, Jun 26, 2023 at 5:55 PM Junio C Hamano <gitster@pobox.com> wrote:
> 
> > > PS: As a future idea, it might be good if "git mv" gives a hint about
> > > updating .gitattributes if files matching .gitattributes pattern are
> > > moved.
> >
> > Interesting.  "git mv hello.jpg hello.jpeg" would suggest updating
> > a "*.jpg <list of attribute definitions>" line in the .gitattributes
> > to begin with "*.jpeg"?
> 
> Yes, right. Or as a simpler variant to start with (as patterns might
> match files in different directories, and not all of the matching
> files might be moved), just say that a specific .gitattributes line
> needs updating (or needs to be duplicated / generalized in case files
> in both the old and new location match).

Yeah, I don't think we could ever do anything automated here; a human
needs to judge the intent and how the patterns should be adapted.

But perhaps something like:

  1. When git-commit makes a new commit that removes paths (whether they
     were totally removed, or renamed), find all gitattribute lines
     whose patterns match those paths.

  2. For each such pattern, see if it still matches anything in the
     resulting tree.

  3. If not, print advise() lines showing the file/line of the pattern
     which is no longer used.

Doing so naively (by checking matches for each file in the tree) would
be a little expensive, but maybe OK in practice. It could perhaps be
done more efficiently with specialized code, but it might be tricky to
right (and you still end up O(size of tree) in the worst case, because
something like "*.jpg" needs to be compared against every entry).

Of course on the way there you should end up with a decent tool for
"which patterns are not currently used?". And you could just
periodically run that manually if you want to clean up (or even from a
post-commit hook).


Re-reading your email, though, I wonder if you meant something a little
simpler, like:

  1. When a path is moved via git-mv, see if the attributes before/after
     are the same.

  2. If not, then mention which ones matched the old path via advise().

That is probably easier to write, though it does not help the "git rm"
case (where attributes may become obsolete).

-Peff

^ permalink raw reply

* Re: [PATCH] config: don't BUG when both kvi and source are set
From: Junio C Hamano @ 2023-06-26 23:05 UTC (permalink / raw)
  To: Glen Choo
  Cc: Glen Choo via GitGitGadget, git, Jesse Kasky, Derrick Stolee,
	Jonathan Tan
In-Reply-To: <kl6l7crpsuhs.fsf@chooglen-macbookpro.roam.corp.google.com>

Glen Choo <chooglen@google.com> writes:

> Ah, I meant that this bug occurred because most users of config use
> git_config()/repo_config() (a wrapper around config sets), so it's very
> easy to accidentally read repo config, e.g. in the middle of parsing
> config (config file -> config set). I'd imagine it might also be quite
> easy to read repo config while reading repo config (config set -> config
> set), which would make current_config_* return the wrong thing, but at
> least it doesn't BUG().

I think BUG() is better than silently computing a wrong result, but
it would probably be much rare than the problem at hand, and with
the getting rid of global dependencies, it won't be an issue anymore,
hopefull?  So it is good.

> The "reverse" case (config set -> config file) is very _unlikely_
> because very few places need to know about config files, so it's
> unlikely that we'd have an explicit call to parse a config file,
> especially in the middle of reading repo config.

As long as existing codepaths do not do that, it would be OK ;-)

Thanks.

^ permalink raw reply

* Re: [PATCH] config: don't BUG when both kvi and source are set
From: Glen Choo @ 2023-06-26 22:56 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Jesse Kasky, Derrick Stolee, Jonathan Tan
In-Reply-To: <xmqq352e59h9.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

>> Therefore, fix the bug by removing the BUG() check. We're reverting to
>> an older, less safe state, but that's generally okay since
>> key_value_info is always preferentially read, so we'd always read the
>> correct values when we iterate a config set in the middle of a config
>> parse (like we are here).
>
> I wonder if the source being pushed and config_kvi value at this
> point have some particular relationship (like "if kvi exists, the
> source must match kvi's source" or something) that we can cheaply
> use to avoid "reverting to an older less safe state"?

Not at all. In this case, the source should reflect .gitmodules, but the
config_kvi should reflect the promisor config (aka the full repo
config). config_source implements stack semantics, so we could co-opt it
by e.g. converting config_kvi into a fake config_source and pushing it
onto the stack (at which point, we could just get rid of config_kvi
altogether too), but that's really way too much work for something that
will _hopefully_ go away soon.

>> The reverse would be wrong, but extremely
>> unlikely to happen since very few callers parse config without going
>> through a config set.
>
> Sorry, but I do not quite get this comment.

Ah, I meant that this bug occurred because most users of config use
git_config()/repo_config() (a wrapper around config sets), so it's very
easy to accidentally read repo config, e.g. in the middle of parsing
config (config file -> config set). I'd imagine it might also be quite
easy to read repo config while reading repo config (config set -> config
set), which would make current_config_* return the wrong thing, but at
least it doesn't BUG().

The "reverse" case (config set -> config file) is very _unlikely_
because very few places need to know about config files, so it's
unlikely that we'd have an explicit call to parse a config file,
especially in the middle of reading repo config.

^ permalink raw reply

* [ANNOUNCE] Git Merge 2023 updates
From: Taylor Blau @ 2023-06-26 21:30 UTC (permalink / raw)
  To: git

I wanted to share a couple of updates on the status of Git Merge:

	- GitHub will not be hosting an in-person event for Git Merge this
		year.

  - Instead, we will focus our efforts on hosting a virtual
    Contributor's Summit, the details of which are TBD.

The decision to not host an in-person event was a difficult one to make,
but necessary given the current economic conditions and their effect on
travel budgets for conferences.

When I asked[1] about which location folks preferred for Git Merge, a
number of replies indicated that nobody on their team would be able to
attend unless speaking. I think it's a safe assumption that other
individuals who would be travelling on behalf of their employer are in a
similar situation.

That said, I plan on hosting an virtual Contributor's Summit in
late-September or so for folks who are working on Git to get together.

This year's will likely look a lot different than the Contributor's
Summit we held last year in Chicago, IL. That said, a couple of
requests:

  - If you are interested in joining, and aren't sure whether or not you
    are able to, please feel free to shoot me a message off-list and
    ask.

  - If you have suggestions about the format of the day,
    video-conferencing related suggestions, or anything else that you
    think could improve the Contributor's Summit, please also feel more
    than free to let me know.

I regret not being able to see everybody in person this year, but I'm
hopeful that we'll get a chance to make up for it next year :-). Until
then,

Thanks,
Taylor

[1]: https://lore.kernel.org/git/ZEyDcBcGmLznqKzD@nand.local/

^ permalink raw reply

* Bug: fetch with deepen shortens history
From: Benjamin Stein @ 2023-06-26 21:24 UTC (permalink / raw)
  To: git@vger.kernel.org

Hello git gurus,

Here's an atypical bug report for you. I'm sorry for not starting with the template, but the context/setup are longer than felt useful in that format.

I have what I believe to be a (relatively) simple, reproducible test case (repo setup/steps below) around shallow checkouts at merge commits and deepening where the behavior is quite surprising - I end up with a smaller history after a fetch operation than when I started!

I've tried this on multiple OSes (Linux, Mac/Darwin), shells (zsh, bash) and various git versions ranging from 2.40.1 to 2.34.1, but I suspect it's older than that.

Scenario:
I'm using GitHub actions to look through some commit history and generate a report of commits relative to another branch. The specifics aren't super important, just that I start off in a shallow repo (depth=1) because (1) that's what GHA drops me into by default, and (2) since this is a large mono-repo I don't want to fetch all history every time - so I want to minimize the amount of data fetched.
So I used `fetch --shallow-exclude=other-branch HEAD` to get the relevant commits, and ran into my bug: when I do an extra `fetch --deepen` I end up with only a single commit in history instead of the N I had right before the call. If that's not super clear, I think the reproduction steps below should help.

Even more confusingly, if I check out a merge commit into trunk (the default) it misbehaves, but if I instead start out on the branch I'm inspecting, the same sequence of commands works correctly! This detail threw me for a long time as I tried to understand why behavior after a merge wasn't consistent.

My hunch is that this has to do with .git/shallow and the way some of the SHAs are coalescing when histories intersect/are combined, but I'm not too familiar with the inner workings of shallow. I even tried running the same sequence of steps with various combinations of --update-shallow on the fetches, but that doesn't seem to address the underlying issue of history shortening.

I hope that's all clear. If I can give more detail, definitely let me know, and I'm happy to try and explore some solutions, but I'm not certain where to begin.
I'm also not sure if it's an issue on the client or server - since I was initially testing with GitHub - but in my reduced example everything is local.

Thanks for your time and help with this!

-Benji


------ requisite bugreport answers ------

Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.

What did you do before the bug happened? (Steps to reproduce your issue)
See setup below.
In my shallow checkout, I ran:
    git log --oneline | wc -l => N commits of history.
Then I ran
    git fetch --deepen=1 <branch>
Followed by
    git log --oneline | wc -l => now just one commit.

What did you expect to happen? (Expected behavior)
I expected to see N+1 commits of history because I deepened by 1.

What happened instead? (Actual behavior)
I instead had just one commit of history.

What's different between what you expected and what actually happened?
I expected not to have my local history shortened by ~N commits despite using --deepen instead of --depth.

Anything else you want to add:
See additional notes/steps in this email and attached script to reproduce


[System Info]
git version:
git version 2.40.1
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 6.2.6-76060206-generic #202303130630~1681329778~22.04~d824cd4 SMP PREEMPT_DYNAMIC Wed A x86_64
compiler info: gnuc: 11.3
libc info: glibc: 2.35
$SHELL (typically, interactive shell): /bin/zsh


[Enabled Hooks]
not run from a git repository - no hooks to show
(it's irrelevant, my example starts from nothing, so no hooks)


------- bug-setup.sh ----- (better to to run in chunks, rather than all at once, but provided for convenience)

set -x
# Setup working folder for easy cleanup
mkdir git-test && cd git-test

# Setup sample repo
mkdir source-repo && cd source-repo
git init
git branch -m trunk
for i in {01..05}; do echo "start${i}" >> start; git add start; git commit -m "start${i}"; done
git branch old-checkpoint
for i in {01..10}; do echo "new${i}" >> new; git add new; git commit -m "new${i}"; done
git checkout -b feature HEAD~2
for i in {01..03}; do echo "feature${i}" >> feature; git add feature; git commit -m "feature${i}"; done
git checkout trunk
git merge --no-edit feature
cd ..
sleep 1


# Checkout shallow clone at feature branch - this works as desired
git clone --no-local source-repo --depth=1 --branch feature shallow-clone-feature
cd shallow-clone-feature
git remote set-branches --add origin '*'
git fetch origin --shallow-exclude=old-checkpoint feature
git log --oneline origin/feature | wc -l # 11, expected
git fetch --deepen=1 origin feature
git log --oneline origin/feature | wc -l # 12, also as expected
cd ..
sleep 1


# Checkout shallow clone at merge commit - this illustrates the bug
git clone --no-local source-repo --depth=1 --branch trunk shallow-clone-merge
cd shallow-clone-merge
git remote set-branches --add origin '*'
git fetch origin --shallow-exclude=old-checkpoint feature
git log --oneline origin/feature | wc -l # 11, expected
git fetch --deepen=1 origin feature
git log --oneline origin/feature | wc -l # 1, unexpected

# Wait, what? Let's try that again
git fetch origin --shallow-exclude=old-checkpoint feature
git log --oneline origin/feature | wc -l # 11, still as expected
git fetch --deepen=1 origin feature
git log --oneline origin/feature | wc -l # 13, different this time. Unexpected.
cd ..
sleep 1


# What if we expand depth first?
git clone --no-local source-repo --depth=1 --branch trunk shallow-clone-with-depth
cd shallow-clone-with-depth
git remote set-branches --add origin '*'
git fetch --depth=2 origin feature
git fetch origin --shallow-exclude=old-checkpoint feature
git log --oneline origin/feature | wc -l # 11, expected
git fetch --deepen=1 origin feature
git log --oneline origin/feature | wc -l # 1, still unexpected
cd ..
sleep 1


# It turns out the depth query sometimes works if I also manually include HEAD, but still strangely.
# If I use depth=2, it's fine, but if I keep depth=3 it's not.
git clone --no-local source-repo --depth=1 --branch trunk shallow-clone-with-depth
cd shallow-clone-with-depth
git remote set-branches --add origin '*'
git fetch --depth=3 origin HEAD feature # it works if I use or include HEAD, but *not* if I include old-checkpoint
git fetch origin --shallow-exclude=old-checkpoint feature
git log --oneline origin/feature | wc -l # 11, expected
git fetch --deepen=1 origin feature
git log --oneline origin/feature | wc -l # 4, different from before, still wrong
cd ..
sleep 1

# If we start with deepen, instead
git clone --no-local ./source-repo --depth=1 --branch trunk shallow-clone-deepen
cd shallow-clone-deepen
git remote set-branches --add origin '*'
git fetch --deepen=1 origin HEAD feature # this also works if we use feature
git fetch origin --shallow-exclude=old-checkpoint feature
git log --oneline origin/feature | wc -l # 11, expected
git fetch --deepen=1 origin feature
git log --oneline origin/feature | wc -l # 12, that's finally correct
cd ..
sleep 1

# manually clear everything out by running
# cd .. && rm -rf git-test

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox