Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH v3] rebase: clarify --reschedule-failed-exec default
From: Taylor Blau @ 2024-01-05 17:11 UTC (permalink / raw)
  To: Illia Bobyr; +Cc: git
In-Reply-To: <20240105011424.1443732-2-illia.bobyr@gmail.com>

On Thu, Jan 04, 2024 at 05:14:26PM -0800, Illia Bobyr wrote:
> ---
>  Documentation/git-rebase.txt | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)

LGTM.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH] commit-graph: retain commit slab when closing NULL commit_graph
From: Taylor Blau @ 2024-01-05 17:07 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano, Scott Leggett
In-Reply-To: <20240105054142.GA2035092@coredump.intra.peff.net>

On Fri, Jan 05, 2024 at 12:41:42AM -0500, Jeff King wrote:
> This fixes a regression introduced in ac6d45d11f (commit-graph: move
> slab-clearing to close_commit_graph(), 2023-10-03), in which running:
>
>   git -c fetch.writeCommitGraph=true fetch --recurse-submodules
>
> multiple times in a freshly cloned repository causes a segfault. What
> happens in the second (and subsequent) runs is this:
>
>   1. We make a "struct commit" for any ref tips which we're storing
>      (even if we already have them, they still go into FETCH_HEAD).
>
>      Because the first run will have created a commit graph, we'll find
>      those commits in the graph.
>
>      The commit struct is therefore created with a NULL "maybe_tree"
>      entry, because we can load its oid from the graph later. But to do
>      that we need to remember that we got the commit from the graph,
>      which is recorded in a global commit_graph_data_slab object.
>
>   2. Because we're using --recurse-submodules, we'll try to fetch each
>      of the possible submodules. That implies creating a separate
>      "struct repository" in-process for each submodule, which will
>      require a later call to repo_clear().
>
>      The call to repo_clear() calls raw_object_store_clear(), which in
>      turn calls close_object_store(), which in turn calls
>      close_commit_graph(). And the latter frees the commit graph data
>      slab.
>
>   3. Later, when trying to write out a new commit graph, we'll ask for
>      their tree oid via get_commit_tree_oid(), which will see that the
>      object is parsed but with a NULL maybe_tree field. We'd then
>      usually pull it from the graph file, but because the slab was
>      cleared, we don't realize that we can do so! We end up returning
>      NULL and segfaulting.
>
>      (It seems questionable that we'd write a graph entry for such a
>      commit anyway, since we know we already have one. I didn't
>      double-check, but that may simply be another side effect of having
>      cleared the slab).

Yeah, I agree that we should not be re-writing a commit-graph entry that
already exists in an earlier (non-removed) layer of the commit-graph.
I haven't thought through all of the details here, but that sounds like
a potentially dangerous thing to be doing.

> So that fixes the regression in the least-risky way possible.

Thanks for the detailed explanation. I think that the fix presented here
is a reasonable approach, and is worthwhile to pick up.

> I do think there's some fragility here that we might want to follow up
> on. We have a global commit_graph_data_slab that contains graph
> positions, and our global commit structs depend on the that slab
> remaining valid. But close_commit_graph() is just about closing _one_
> object store's graph. So it's dangerous to call that function and clear
> the slab without also throwing away any "struct commit" we might have
> parsed that depends on it.

I definitely agree that there seems like a high risk of another
potentially-dangerous bug happening as a result of this area.

One thing that sticks out to me is that we have a scope mismatch
between the commit structs, the raw_object_store, and the commit slabs.
Slabs are tied to the lifetime of the raw_object_store, but the commit
objects are tied to the global lifetime of the process. I wonder if it
would make sense to have a slab per object-store, and require that you
pass both the commit *and* the object-store you're looking it up in as
arguments to any slab-related functions.

I dunno. I have not put a ton of thought into that ^^ approach either,
so let me know if it seems obviously bogus to you or if this is worth
looking into further.

Thanks,
Taylor

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: Junio C Hamano @ 2024-01-05 16:34 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, git
In-Reply-To: <20240105085928.GA3078702@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Wed, Jan 03, 2024 at 08:37:59AM -0800, Junio C Hamano wrote:
>
>> This is totally unrelated tangent, but the way "show-index" gets
>> invoked in the above loop makes readers wonder how the caller found
>> out which $idx file to read.
>> 
>> Of course, the above loop sits downstream of a pipe
>> 
>>     find .git/objects/pack -type f -name \*.idx
>> 
>> which means that any user of "git show-index" must be intimately
>> familiar with how the object database is structured.  I wonder if we
>> want an extra layer of abstraction, similar to how the reference
>> database can have different backend implementation.
>
> I assume you mean a test helper by "extra layer of abstraction".

Well, that might be a good first step, but I was envisioning more
into far future where the object store is reimplemented on top of
high-performance database, at which point we may want subcommands
like "git odb list-packs" that lists packfile names (which may not
name any on-filesystem files) that can be given to "git show-index",
for example.  Of course, the abstraction boundary of the object
store might be placed a lot higher and the interface into it may be
limited to "here is an oid, give me its contents" without distinction
between the objects in packfiles and objects in loose object files.

> So I think we should just punt on it for now. Adding one case here is
> not making a hypothetical abstraction layer significantly harder to add
> later.

Totally agree, and that is why I labeled the whole thing as an
unrelated tangent.

Thanks.

^ permalink raw reply

* Re: [PATCH] index-pack: spawn threads atomically
From: Taylor Blau @ 2024-01-05 16:33 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20240105085034.GA3078476@coredump.intra.peff.net>

On Fri, Jan 05, 2024 at 03:50:34AM -0500, Jeff King wrote:
> The t5309 script triggers a racy false positive with SANITIZE=leak on a
> multi-core system. Running with "--stress --run=6" usually fails within
> 10 seconds or so for me, complaining with something like:
>
>     + git index-pack --fix-thin --stdin
>     fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?)
>
>     =================================================================
>     ==3904583==ERROR: LeakSanitizer: detected memory leaks
>
>     Direct leak of 32 byte(s) in 1 object(s) allocated from:
>         #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
>         #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
>         #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
>         #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
>         #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
>         #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
>         #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
>         #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>
>     SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).
>     Aborted

We discussed this in another thread (beginning here [1]), and I would be
fine with this approach. I share your feeling that it is a little gross
to have to work around LSan's implementation by tweaking production
code, but I think that this doesn't have to be the most pristine patch
ever written, either ;-).

Just playing devil's advocate for a moment, I wonder if another approach
might be to disable the threading altogether for the purposes of this
test. The performance difference is negligible, and I don't think we're
exercising any interesting paths in this particular test that have to do
with pack.threads > 1 that aren't covered extensively elsewhere.

So, in other words, I think a reasonable approach would be to do
something like:

--- 8< ---
diff --git a/t/t5309-pack-delta-cycles.sh b/t/t5309-pack-delta-cycles.sh
index 4e910c5b9d..1d132b6324 100755
--- a/t/t5309-pack-delta-cycles.sh
+++ b/t/t5309-pack-delta-cycles.sh
@@ -73,7 +73,7 @@ test_expect_success 'failover to a duplicate object in the same pack' '
 		pack_obj $A
 	} >recoverable.pack &&
 	pack_trailer recoverable.pack &&
-	test_must_fail git index-pack --fix-thin --stdin <recoverable.pack
+	test_must_fail git index-pack --threads=1 --fix-thin --stdin <recoverable.pack
 '

 test_done
--- >8 ---

And call it a day. I built with SANITIZE=leak, and then ran:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t5309-pack-delta-cycles.sh --stress --run=6

For a while and didn't see any failures. That could be luck, of course,
but without the above patch I was seeing failures within a few seconds.
I'm reasonably confident that this would do the trick.

For what it's worth, I'm fine with either approach, mostly to avoid
tying up more of the list's time discussing the options. But I have a
vague preference towards `--threads=1` since it doesn't require us to
touch production code.

> Rescuing this from:
>
>   https://lore.kernel.org/git/20231221105124.GD570888@coredump.intra.peff.net/
>
> where it was buried deep in a thread. I still think it's kind of gross,
> but it may be the least-bad thing.

In either case, thanks for digging it back up :-).

Thanks,
Taylor

[1]: https://lore.kernel.org/git/xmqqbkasnxba.fsf@gitster.g/

^ permalink raw reply related

* Re: rebase invoking pre-commit
From: Junio C Hamano @ 2024-01-05 16:26 UTC (permalink / raw)
  To: Sean Allred; +Cc: Git List
In-Reply-To: <m0sf3vi86g.fsf@epic96565.epic.com>

Sean Allred <allred.sean@gmail.com> writes:

> Is there a current reason why pre-commit shouldn't be invoked during
> rebase, or is this just waiting for a reviewable patch?
>
> This was brought up before at [1] in 2015, but that thread so old at
> this point that it seemed prudent to double-check before investing time
> in a developing and testing a patch.
>
> [1]: https://lore.kernel.org/git/1m55i3m.1fum4zo1fpnhncM%25lists@haller-berlin.de/

If you are trying to make it less likely that your developers would
commit conflict markers by mistake, I think an effective way would
be to give "git rebase" an option (or a configuration variable) that
forbids it from making a new commit upon "git rebase --continue",
which AFAIK was added merely to help "lazy" folks to omit the
explicit "git commit" step in the following sequence:

    $ git rebase origin/master
    ... stops with conflicts
    $ edit
    ... now the conflicts are resolved (and hopefully you have
    ... tested the result)
    $ git commit
    $ git rebase --continue


^ permalink raw reply

* Re: [PATCH] push: region_leave trace for negotiate_using_fetch
From: Junio C Hamano @ 2024-01-05 16:18 UTC (permalink / raw)
  To: Sam Delmerico; +Cc: git, steadmon
In-Reply-To: <CAHVcGPSSJr7L_NyFSKEkEywBap6hUrucga98RLpa6xuZ0k4CzA@mail.gmail.com>

Sam Delmerico <delmerico@google.com> writes:

> * I don't exactly remember how I noticed it. I was doing some
> debugging around the push negotiation code and either 1) saw this
> region get entered twice in the trace output, or 2) I was just reading
> the code around here and saw two enters.
>
> * Perhaps there could be a check before the last git process ends that
> checks that all opened trace regions have been closed? I'm not sure
> how much work this would involve. It's probably also not a very
> proactive way to catch these bugs since it would only get triggered
> when a *user* hits a code path with a trace region that never exits.
>
> * There could also be a test that checks that every region_enter trace
> log has a corresponding region_leave. But I'm not sure how to ensure
> that every code path is checked.
>
> Overall, I'm not sure how much benefit there is from checking for
> this. I'm not sure that it would have a large impact if it were to
> happen again. For example, I think that it could be noticed relatively
> quickly by a person/system looking at metrics like I was (e.g. if the
> time spent in a region is infinite or zero).
>
> FWIW I didn't see any other examples of this when going through logs.

The above matches my intuition.  A test that covers this specific
case would likely be of low value as it is unlikely for us to
regress this specific one.  A CI job that runs all the tests under
tracing and inspects the log may catch some but its finding would be
limited to the code paths that are covered (but increasing the test
coverage would help here, not just for finding unbalanced region
markers, but for finding bugs in general).

Thanks.

^ permalink raw reply

* [Outreachy][PATCH v4] Port helper/test-ctype.c to unit-tests/t-ctype.c
From: Achu Luma @ 2024-01-05 16:14 UTC (permalink / raw)
  To: git
  Cc: gitster, chriscool, christian.couder, l.s.r, phillip.wood,
	steadmon, me, Achu Luma, Phillip Wood
In-Reply-To: <20240101104017.9452-2-ach.lumap@gmail.com>

In the recent codebase update (8bf6fbd00d (Merge branch
'js/doc-unit-tests', 2023-12-09)), a new unit testing framework was
merged, providing a standardized approach for testing C code. Prior to
this update, some unit tests relied on the test helper mechanism,
lacking a dedicated unit testing framework. It's more natural to perform
these unit tests using the new unit test framework.

This commit migrates the unit tests for C character classification
functions (isdigit(), isspace(), etc) from the legacy approach
using the test-tool command `test-tool ctype` in t/helper/test-ctype.c
to the new unit testing framework (t/unit-tests/test-lib.h).

The migration involves refactoring the tests to utilize the testing
macros provided by the framework (TEST() and check_*()).

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
---
 The changes between version 3 and version 4 are the following:

 - Some duplication has been reduced using a new TEST_CHAR_CLASS() macro.
 - A "0x"prefix has been added to avoid confusion between decimal and
   hexadecimal codes printed by test_msg().
 - The "Mentored-by:..." trailer has been restored.
 - "works as expected" has been reduced to just "works" as suggested by Taylor.
 - Some "Helped-by: ..." trailers have been added.
 - Some whitespace fixes have been made.

 Thanks to Junio, René, Phillip and Taylor for commenting on previous versions
 of this patch.
 Here is a diff between v3 and v4:

  @@ -14,15 +14,16 @@ static int is_in(const char *s, int ch)
  }

  /* Macro to test a character type */
 -#define TEST_CTYPE_FUNC(func, string)                          \
 -static void test_ctype_##func(void)                            \
 -{                                                              \
 -       for (int i = 0; i < 256; i++) {                         \
 -               if (!check_int(func(i), ==, is_in(string, i)))  \
 -                       test_msg("       i: %02x", i);          \
 -        }                                                      \
 +#define TEST_CTYPE_FUNC(func, string) \
 +static void test_ctype_##func(void) { \
 +       for (int i = 0; i < 256; i++) { \
 +               if (!check_int(func(i), ==, is_in(string, i))) \
 +                       test_msg("       i: 0x%02x", i); \
 +       } \
   }

 +#define TEST_CHAR_CLASS(class) TEST(test_ctype_##class(), #class " works")
 +
  #define DIGIT "0123456789"
  #define LOWER "abcdefghijklmnopqrstuvwxyz"
  #define UPPER "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
 @@ -58,20 +59,20 @@ TEST_CTYPE_FUNC(isprint, LOWER UPPER DIGIT PUNCT " ")

  int cmd_main(int argc, const char **argv) {
         /* Run all character type tests */
 -       TEST(test_ctype_isspace(), "isspace() works as we expect");
 -       TEST(test_ctype_isdigit(), "isdigit() works as we expect");
 -       TEST(test_ctype_isalpha(), "isalpha() works as we expect");
 -       TEST(test_ctype_isalnum(), "isalnum() works as we expect");
 -       TEST(test_ctype_is_glob_special(), "is_glob_special() works as we expect");
 -       TEST(test_ctype_is_regex_special(), "is_regex_special() works as we expect");
 -       TEST(test_ctype_is_pathspec_magic(), "is_pathspec_magic() works as we expect");
 -       TEST(test_ctype_isascii(), "isascii() works as we expect");
 -       TEST(test_ctype_islower(), "islower() works as we expect");
 -       TEST(test_ctype_isupper(), "isupper() works as we expect");
 -       TEST(test_ctype_iscntrl(), "iscntrl() works as we expect");
 -       TEST(test_ctype_ispunct(), "ispunct() works as we expect");
 -       TEST(test_ctype_isxdigit(), "isxdigit() works as we expect");
 -       TEST(test_ctype_isprint(), "isprint() works as we expect");
 +       TEST_CHAR_CLASS(isspace);
 +       TEST_CHAR_CLASS(isdigit);
 +       TEST_CHAR_CLASS(isalpha);
 +       TEST_CHAR_CLASS(isalnum);
 +       TEST_CHAR_CLASS(is_glob_special);
 +       TEST_CHAR_CLASS(is_regex_special);
 +       TEST_CHAR_CLASS(is_pathspec_magic);
 +       TEST_CHAR_CLASS(isascii);
 +       TEST_CHAR_CLASS(islower);
 +       TEST_CHAR_CLASS(isupper);
 +       TEST_CHAR_CLASS(iscntrl);
 +       TEST_CHAR_CLASS(ispunct);
 +       TEST_CHAR_CLASS(isxdigit);
 +       TEST_CHAR_CLASS(isprint);

 Makefile               |  2 +-
 t/helper/test-ctype.c  | 70 -------------------------------------
 t/helper/test-tool.c   |  1 -
 t/helper/test-tool.h   |  1 -
 t/t0070-fundamental.sh |  4 ---
 t/unit-tests/t-ctype.c | 78 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 79 insertions(+), 77 deletions(-)
 delete mode 100644 t/helper/test-ctype.c
 create mode 100644 t/unit-tests/t-ctype.c

diff --git a/Makefile b/Makefile
index 88ba7a3c51..a4df48ba65 100644
--- a/Makefile
+++ b/Makefile
@@ -792,7 +792,6 @@ TEST_BUILTINS_OBJS += test-chmtime.o
 TEST_BUILTINS_OBJS += test-config.o
 TEST_BUILTINS_OBJS += test-crontab.o
 TEST_BUILTINS_OBJS += test-csprng.o
-TEST_BUILTINS_OBJS += test-ctype.o
 TEST_BUILTINS_OBJS += test-date.o
 TEST_BUILTINS_OBJS += test-delta.o
 TEST_BUILTINS_OBJS += test-dir-iterator.o
@@ -1341,6 +1340,7 @@ THIRD_PARTY_SOURCES += sha1dc/%

 UNIT_TEST_PROGRAMS += t-basic
 UNIT_TEST_PROGRAMS += t-strbuf
+UNIT_TEST_PROGRAMS += t-ctype
 UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
diff --git a/t/helper/test-ctype.c b/t/helper/test-ctype.c
deleted file mode 100644
index e5659df40b..0000000000
--- a/t/helper/test-ctype.c
+++ /dev/null
@@ -1,70 +0,0 @@
-#include "test-tool.h"
-
-static int rc;
-
-static void report_error(const char *class, int ch)
-{
-	printf("%s classifies char %d (0x%02x) wrongly\n", class, ch, ch);
-	rc = 1;
-}
-
-static int is_in(const char *s, int ch)
-{
-	/*
-	 * We can't find NUL using strchr. Accept it as the first
-	 * character in the spec -- there are no empty classes.
-	 */
-	if (ch == '\0')
-		return ch == *s;
-	if (*s == '\0')
-		s++;
-	return !!strchr(s, ch);
-}
-
-#define TEST_CLASS(t,s) {			\
-	int i;					\
-	for (i = 0; i < 256; i++) {		\
-		if (is_in(s, i) != t(i))	\
-			report_error(#t, i);	\
-	}					\
-	if (t(EOF))				\
-		report_error(#t, EOF);		\
-}
-
-#define DIGIT "0123456789"
-#define LOWER "abcdefghijklmnopqrstuvwxyz"
-#define UPPER "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
-#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
-#define ASCII \
-	"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
-	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
-	"\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f" \
-	"\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f" \
-	"\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" \
-	"\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f" \
-	"\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f" \
-	"\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f"
-#define CNTRL \
-	"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
-	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
-	"\x7f"
-
-int cmd__ctype(int argc UNUSED, const char **argv UNUSED)
-{
-	TEST_CLASS(isdigit, DIGIT);
-	TEST_CLASS(isspace, " \n\r\t");
-	TEST_CLASS(isalpha, LOWER UPPER);
-	TEST_CLASS(isalnum, LOWER UPPER DIGIT);
-	TEST_CLASS(is_glob_special, "*?[\\");
-	TEST_CLASS(is_regex_special, "$()*+.?[\\^{|");
-	TEST_CLASS(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~");
-	TEST_CLASS(isascii, ASCII);
-	TEST_CLASS(islower, LOWER);
-	TEST_CLASS(isupper, UPPER);
-	TEST_CLASS(iscntrl, CNTRL);
-	TEST_CLASS(ispunct, PUNCT);
-	TEST_CLASS(isxdigit, DIGIT "abcdefABCDEF");
-	TEST_CLASS(isprint, LOWER UPPER DIGIT PUNCT " ");
-
-	return rc;
-}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 37ba996539..33b9501c21 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -19,7 +19,6 @@ static struct test_cmd cmds[] = {
 	{ "config", cmd__config },
 	{ "crontab", cmd__crontab },
 	{ "csprng", cmd__csprng },
-	{ "ctype", cmd__ctype },
 	{ "date", cmd__date },
 	{ "delta", cmd__delta },
 	{ "dir-iterator", cmd__dir_iterator },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 8a1a7c63da..b72f07ded9 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -12,7 +12,6 @@ int cmd__chmtime(int argc, const char **argv);
 int cmd__config(int argc, const char **argv);
 int cmd__crontab(int argc, const char **argv);
 int cmd__csprng(int argc, const char **argv);
-int cmd__ctype(int argc, const char **argv);
 int cmd__date(int argc, const char **argv);
 int cmd__delta(int argc, const char **argv);
 int cmd__dir_iterator(int argc, const char **argv);
diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 487bc8d905..a4756fbab9 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -9,10 +9,6 @@ Verify wrappers and compatibility functions.
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh

-test_expect_success 'character classes (isspace, isalpha etc.)' '
-	test-tool ctype
-'
-
 test_expect_success 'mktemp to nonexistent directory prints filename' '
 	test_must_fail test-tool mktemp doesnotexist/testXXXXXX 2>err &&
 	grep "doesnotexist/test" err
diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
new file mode 100644
index 0000000000..3a338df541
--- /dev/null
+++ b/t/unit-tests/t-ctype.c
@@ -0,0 +1,78 @@
+#include "test-lib.h"
+
+static int is_in(const char *s, int ch)
+{
+	/*
+	 * We can't find NUL using strchr. Accept it as the first
+	 * character in the spec -- there are no empty classes.
+	 */
+	if (ch == '\0')
+		return ch == *s;
+	if (*s == '\0')
+		s++;
+	return !!strchr(s, ch);
+}
+
+/* Macro to test a character type */
+#define TEST_CTYPE_FUNC(func, string) \
+static void test_ctype_##func(void) { \
+	for (int i = 0; i < 256; i++) { \
+		if (!check_int(func(i), ==, is_in(string, i))) \
+			test_msg("       i: 0x%02x", i); \
+	} \
+}
+
+#define TEST_CHAR_CLASS(class) TEST(test_ctype_##class(), #class " works")
+
+#define DIGIT "0123456789"
+#define LOWER "abcdefghijklmnopqrstuvwxyz"
+#define UPPER "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
+#define ASCII \
+	"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
+	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
+	"\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f" \
+	"\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f" \
+	"\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" \
+	"\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f" \
+	"\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f" \
+	"\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f"
+#define CNTRL \
+	"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
+	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
+	"\x7f"
+
+TEST_CTYPE_FUNC(isdigit, DIGIT)
+TEST_CTYPE_FUNC(isspace, " \n\r\t")
+TEST_CTYPE_FUNC(isalpha, LOWER UPPER)
+TEST_CTYPE_FUNC(isalnum, LOWER UPPER DIGIT)
+TEST_CTYPE_FUNC(is_glob_special, "*?[\\")
+TEST_CTYPE_FUNC(is_regex_special, "$()*+.?[\\^{|")
+TEST_CTYPE_FUNC(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~")
+TEST_CTYPE_FUNC(isascii, ASCII)
+TEST_CTYPE_FUNC(islower, LOWER)
+TEST_CTYPE_FUNC(isupper, UPPER)
+TEST_CTYPE_FUNC(iscntrl, CNTRL)
+TEST_CTYPE_FUNC(ispunct, PUNCT)
+TEST_CTYPE_FUNC(isxdigit, DIGIT "abcdefABCDEF")
+TEST_CTYPE_FUNC(isprint, LOWER UPPER DIGIT PUNCT " ")
+
+int cmd_main(int argc, const char **argv) {
+	/* Run all character type tests */
+	TEST_CHAR_CLASS(isspace);
+	TEST_CHAR_CLASS(isdigit);
+	TEST_CHAR_CLASS(isalpha);
+	TEST_CHAR_CLASS(isalnum);
+	TEST_CHAR_CLASS(is_glob_special);
+	TEST_CHAR_CLASS(is_regex_special);
+	TEST_CHAR_CLASS(is_pathspec_magic);
+	TEST_CHAR_CLASS(isascii);
+	TEST_CHAR_CLASS(islower);
+	TEST_CHAR_CLASS(isupper);
+	TEST_CHAR_CLASS(iscntrl);
+	TEST_CHAR_CLASS(ispunct);
+	TEST_CHAR_CLASS(isxdigit);
+	TEST_CHAR_CLASS(isprint);
+
+	return test_done();
+}
--
2.42.0.windows.2


^ permalink raw reply related

* Re: [PATCH v2 2/2] fetch: add cli option --default-only
From: Junio C Hamano @ 2024-01-05 16:13 UTC (permalink / raw)
  To: Tamino Bauknecht; +Cc: git
In-Reply-To: <20240104222259.15659-2-dev@tb6.eu>

Tamino Bauknecht <dev@tb6.eu> writes:

> This option can be used to restore the default behavior of "git fetch"
> if the "fetch.all" config option is enabled.
> The flag cannot be used in combination with "--all" or explicit
> remote(s).

There is "--all" option that can be used to alter the behaviour of
the command.  This is OPT_BOOL(), and if you have

    [alias] f = fetch --all

you can say "git f --no-all" to countermand it from the command
line, as it is equivalent to "git fetch --all --no-all" and the last
one wins.

If you add "fetch.all" that makes the command behave as if it was
given "--all" from the command line even when the user didn't,
passing "--no-all" would be a lot more natural way to countermand
it, no?

"git fetch" (no parameters) are omitting two kinds of stuff, i.e.,
where we fetch from (repository) and what we are fetching from there
(refspec).  You created a need to override the configured fetch
source (aka fetch.all), and in order to countermand it, one way is
to tell the command to "fetch from the default repository", but for
that, an option that does not say "repository" anywhere is closing
door for future evolution of the command.  What if the next person
(which could be you) invented a way to say what refspec to be used
without specifying it from the command line, and needs to say "no,
no, no, do not use the configured value, but just use the default"?
In other words, "--default" will be met by a reaction "default of
which one???".

Compared to that, I would think "--[no-]all" would be a lot simpler
to understand.

The best part is that you do not have to add a new option.  Just
make sure the one specified from the command line, either --all
or --no-all, will cause the command to ignore the configured
fetch.all and you do not need to add anything new to the UI.

^ permalink raw reply

* Re: [GSOC][RFC] Heed core.bare from template config file when no command line override given, as a microproject.
From: Junio C Hamano @ 2024-01-05 15:59 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Christian Couder, Ghanshyam Thakkar, git, johannes.schindelin
In-Reply-To: <CABPp-BH+cPdfsctquE60tw_nD6_LCaWf0JwGusuZ0tvQQuWy4w@mail.gmail.com>

Elijah Newren <newren@gmail.com> writes:

> If you look back at the mailing list discussion on the series that
> introduced this TODO comment you are trying to address, you'll note
> that both Glen and I dug into the code and attempted to explain it to
> each other, and we both got it wrong on our first try.

I think you meant 0f7443bd (init-db: document existing bug with
core.bare in template config, 2023-05-16), where it says:

    The comments in create_default_files() talks about reading config from
    the config file in the specified `--templates` directory, which leads to
    the question of whether core.bare could be set in such a config file and
    thus whether the code is doing the right thing.

But I suspect the all of the above comes from a misunderstanding.
The comment the above commit log message talks about is:

 /*
  * First copy the templates -- we might have the default
  * config file there, in which case we would want to read
  * from it after installing.
  *
  * Before reading that config, we also need to clear out any cached
  * values (since we've just potentially changed what's available on
  * disk).
  */

This primarily comes from my 4f629539 (init-db: check template and
repository format., 2005-11-25), whose focus was to control the way
HEAD symref is created, but care was taken to avoid propagating
values from the configuration variables in the template that do not
make sense for the repository being initialized.  The most important
thing being the repository format version, but the intent to avoid
nonsense combination between the characteristic the new repository
has and the configuration values copied from the template was not
limited to the format version.

Specifically, the commit that introduced the comment never wanted to
honor core.bare in the template.  I do not think I has core.bare in
mind when I wrote the comment, but I would have described it as the
same category as the repository format version, i.e. something you
would not want to copy, if I were pressed to clarify back then.

Besides, create_default_files() is way too late, even if we wanted
to create a bare repository when the template config file says
core.bare = true, as the caller would already have created before
passing $DIR (when "git --bare init $DIR" was run) or $DIR/.git
(when "git init $DIR" was run) to the function.

If somebody wants to always create a bare repository by having
core.bare=true in their template and if we wanted to honor it (which
I am dubious of the value of, by the way), I would think the right
place to do so would be way before create_default_files() is called.
When running "git init [$DIR]", long before calling init_db(), we
decide if we are about to create a bare repository and either create
$DIR or $DIR/.git.  What is in the template, if we really wanted to
do so, should be read before that happens, no?

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)
From: Jeff King @ 2024-01-05  8:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, git
In-Reply-To: <xmqq8r56bcew.fsf@gitster.g>

On Wed, Jan 03, 2024 at 08:37:59AM -0800, Junio C Hamano wrote:

> This is totally unrelated tangent, but the way "show-index" gets
> invoked in the above loop makes readers wonder how the caller found
> out which $idx file to read.
> 
> Of course, the above loop sits downstream of a pipe
> 
>     find .git/objects/pack -type f -name \*.idx
> 
> which means that any user of "git show-index" must be intimately
> familiar with how the object database is structured.  I wonder if we
> want an extra layer of abstraction, similar to how the reference
> database can have different backend implementation.

I assume you mean a test helper by "extra layer of abstraction".

That could make sense, though I think this is just the tip of the
iceberg. There are a bazillion tests that muck with .git/objects/
directly (especially ones finding and munging loose objects). So it
wouldn't do much good until we know what cases the abstract code would
have to handle. And I don't think we have any concrete alternative yet
to the current object-dir layout.

So I think we should just punt on it for now. Adding one case here is
not making a hypothetical abstraction layer significantly harder to add
later.

-Peff

^ permalink raw reply

* [PATCH] index-pack: spawn threads atomically
From: Jeff King @ 2024-01-05  8:50 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau

The t5309 script triggers a racy false positive with SANITIZE=leak on a
multi-core system. Running with "--stress --run=6" usually fails within
10 seconds or so for me, complaining with something like:

    + git index-pack --fix-thin --stdin
    fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?)

    =================================================================
    ==3904583==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 32 byte(s) in 1 object(s) allocated from:
        #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
        #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
        #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
        #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
        #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
        #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
        #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
        #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

    SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).
    Aborted

What happens is this:

  0. We construct a bogus pack with a duplicate object in it and trigger
     index-pack.

  1. We spawn a bunch of worker threads to resolve deltas (on my system
     it is 16 threads).

  2. One of the threads sees the duplicate object and bails by calling
     exit(), taking down all of the threads. This is expected and is the
     point of the test.

  3. At the time exit() is called, we may still be spawning threads from
     the main process via pthread_create(). LSan hooks thread creation
     to update its book-keeping; it has to know where each thread's
     stack is (so it can find entry points for reachable memory). So it
     calls pthread_getattr_np() to get information about the new thread.
     That may allocate memory that must be freed with a matching call to
     pthread_attr_destroy(). Probably LSan does that immediately, but
     if you're unlucky enough, the exit() will happen while it's between
     those two calls, and the allocated pthread_attr_t appears as a
     leak.

This isn't a real leak. It's not even in our code, but rather in the
LSan instrumentation code. So we could just ignore it. But the false
positive can cause people to waste time tracking it down.

It's possibly something that LSan could protect against (e.g., cover the
getattr/destroy pair with a mutex, and then in the final post-exit()
check for leaks try to take the same mutex). But I don't know enough
about LSan to say if that's a reasonable approach or not (or if my
analysis is even completely correct).

In the meantime, it's pretty easy to avoid the race by making creation
of the worker threads "atomic". That is, we'll spawn all of them before
letting any of them start to work. That's easy to do because we already
have a work_lock() mutex for handing out that work. If the main process
takes it, then all of the threads will immediately block until we've
finished spawning and released it.

This shouldn't make any practical difference for non-LSan runs. The
thread spawning is quick, and could happen before any worker thread gets
scheduled anyway.

Probably other spots that use threads are subject to the same issues.
But since we have to manually insert locking (and since this really is
kind of a hack), let's not bother with them unless somebody experiences
a similar racy false-positive in practice.

Signed-off-by: Jeff King <peff@peff.net>
---
Rescuing this from:

  https://lore.kernel.org/git/20231221105124.GD570888@coredump.intra.peff.net/

where it was buried deep in a thread. I still think it's kind of gross,
but it may be the least-bad thing.

 builtin/index-pack.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index dda94a9f46..0e94819216 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1257,13 +1257,15 @@ static void resolve_deltas(void)
 	base_cache_limit = delta_base_cache_limit * nr_threads;
 	if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) {
 		init_thread();
+		work_lock();
 		for (i = 0; i < nr_threads; i++) {
 			int ret = pthread_create(&thread_data[i].thread, NULL,
 						 threaded_second_pass, thread_data + i);
 			if (ret)
 				die(_("unable to create thread: %s"),
 				    strerror(ret));
 		}
+		work_unlock();
 		for (i = 0; i < nr_threads; i++)
 			pthread_join(thread_data[i].thread, NULL);
 		cleanup_thread();
-- 
2.43.0.553.g8113c77dd0

^ permalink raw reply related

* [PATCH] commit-graph: retain commit slab when closing NULL commit_graph
From: Jeff King @ 2024-01-05  5:41 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Scott Leggett, Taylor Blau

This fixes a regression introduced in ac6d45d11f (commit-graph: move
slab-clearing to close_commit_graph(), 2023-10-03), in which running:

  git -c fetch.writeCommitGraph=true fetch --recurse-submodules

multiple times in a freshly cloned repository causes a segfault. What
happens in the second (and subsequent) runs is this:

  1. We make a "struct commit" for any ref tips which we're storing
     (even if we already have them, they still go into FETCH_HEAD).

     Because the first run will have created a commit graph, we'll find
     those commits in the graph.

     The commit struct is therefore created with a NULL "maybe_tree"
     entry, because we can load its oid from the graph later. But to do
     that we need to remember that we got the commit from the graph,
     which is recorded in a global commit_graph_data_slab object.

  2. Because we're using --recurse-submodules, we'll try to fetch each
     of the possible submodules. That implies creating a separate
     "struct repository" in-process for each submodule, which will
     require a later call to repo_clear().

     The call to repo_clear() calls raw_object_store_clear(), which in
     turn calls close_object_store(), which in turn calls
     close_commit_graph(). And the latter frees the commit graph data
     slab.

  3. Later, when trying to write out a new commit graph, we'll ask for
     their tree oid via get_commit_tree_oid(), which will see that the
     object is parsed but with a NULL maybe_tree field. We'd then
     usually pull it from the graph file, but because the slab was
     cleared, we don't realize that we can do so! We end up returning
     NULL and segfaulting.

     (It seems questionable that we'd write a graph entry for such a
     commit anyway, since we know we already have one. I didn't
     double-check, but that may simply be another side effect of having
     cleared the slab).

The bug is in step (2) above. We should not be clearing the slab when
cleaning up the submodule repository structs. Prior to ac6d45d11f, we
did not do so because it was done inside a helper function that returned
early when it saw NULL. So the behavior change from that commit is that
we'll now _always_ clear the slab via repo_clear(), even if the
repository being closed did not have a commit graph (and thus would have
a NULL commit_graph struct).

The most immediate fix is to add in a NULL check in close_commit_graph(),
making it a true noop when passed in an object_store with a NULL
commit_graph (it's OK to just return early, since the rest of its code
is already a noop when passed NULL). That restores the pre-ac6d45d11f
behavior. And that's what this patch does, along with a test that
exercises it (we already have a test that uses submodules along with
fetch.writeCommitGraph, but the bug only triggers when there is a
subsequent fetch and when that fetch uses --recurse-submodules).

So that fixes the regression in the least-risky way possible.

I do think there's some fragility here that we might want to follow up
on. We have a global commit_graph_data_slab that contains graph
positions, and our global commit structs depend on the that slab
remaining valid. But close_commit_graph() is just about closing _one_
object store's graph. So it's dangerous to call that function and clear
the slab without also throwing away any "struct commit" we might have
parsed that depends on it.

Which at first glance seems like a bug we could already trigger. In the
situation described here, there is no commit graph in the submodule
repository, so our commit graph is NULL (in fact, in our test script
there is no submodule repo at all, so we immediately return from
repo_init() and call repo_clear() only to free up memory). But what
would happen if there was one? Wouldn't we see a non-NULL commit_graph
entry, and then clear the global slab anyway?

The answer is "no", but for very bizarre reasons. Remember that
repo_clear() calls raw_object_store_clear(), which then calls
close_object_store() and thus close_commit_graph(). But before it does
so, raw_object_store_clear() does something else: it frees the commit
graph and sets it to NULL! So by this code path we'll _never_ see a
non-NULL commit_graph struct, and thus never clear the slab.

So it happens to work out. But it still seems questionable to me that we
would clear a global slab (which might still be in use) when closing the
commit graph. This clearing comes from 957ba814bf (commit-graph: when
closing the graph, also release the slab, 2021-09-08), and was fixing a
case where we really did need it to be closed (and in that case we
presumably call close_object_store() more directly).

So I suspect there may still be a bug waiting to happen there, as any
object loaded before the call to close_object_store() may be stranded
with a bogus maybe_tree entry (and thus looking at it after the call
might cause an error). But I'm not sure how to trigger it, nor what the
fix should look like (you probably would need to "unparse" any objects
pulled from the graph). And so this patch punts on that for now in favor
of fixing the recent regression in the most direct way, which should not
have any other fallouts.

Signed-off-by: Jeff King <peff@peff.net>
---
I prepared this on top of jk/commit-graph-leak-fixes, which is the
branch in v2.43.0 that introduced the regression.

 commit-graph.c   | 3 +++
 t/t5510-fetch.sh | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/commit-graph.c b/commit-graph.c
index e4d09da090..f26503295a 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -727,6 +727,9 @@ struct bloom_filter_settings *get_bloom_filter_settings(struct repository *r)

 void close_commit_graph(struct raw_object_store *o)
 {
+	if (!o->commit_graph)
+		return;
+
 	clear_commit_graph_data_slab(&commit_graph_data_slab);
 	free_commit_graph(o->commit_graph);
 	o->commit_graph = NULL;
diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
index 19c36b57f4..bcf524d549 100755
--- a/t/t5510-fetch.sh
+++ b/t/t5510-fetch.sh
@@ -802,7 +802,8 @@ test_expect_success 'fetch.writeCommitGraph with submodules' '
 		cd super-clone &&
 		rm -rf .git/objects/info &&
 		git -c fetch.writeCommitGraph=true fetch origin &&
-		test_path_is_file .git/objects/info/commit-graphs/commit-graph-chain
+		test_path_is_file .git/objects/info/commit-graphs/commit-graph-chain &&
+		git -c fetch.writeCommitGraph=true fetch --recurse-submodules origin
 	)
 '

-- 
2.43.0.514.g7147b80757

^ permalink raw reply related

* Re: rebase invoking pre-commit
From: Elijah Newren @ 2024-01-05  4:59 UTC (permalink / raw)
  To: Sean Allred; +Cc: Git List, Junio C Hamano
In-Reply-To: <m0sf3itvpy.fsf@epic96565.epic.com>

Hi,

On Sun, Dec 31, 2023 at 4:14 AM Sean Allred <allred.sean@gmail.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> > On Thu, Dec 21, 2023 at 12:59 PM Sean Allred <allred.sean@gmail.com> wrote:
> >> Is there a current reason why pre-commit shouldn't be invoked during
> >> rebase, or is this just waiting for a reviewable patch?
> >>
> >> This was brought up before at [1] in 2015, but that thread so old at
> >> this point that it seemed prudent to double-check before investing time
> >> in a developing and testing a patch.
> >>
> >> [1]: https://lore.kernel.org/git/1m55i3m.1fum4zo1fpnhncM%25lists@haller-berlin.de/
> >
> > I'm very opinionated here.  I'm just one person, so definitely take
> > this with a grain of salt, but in my view...
> >
> > Personally, I think implementing any per-commit hook in rebase by
> > default is a mistake. It enforces a must-be-in-a-worktree-and-the-
> > worktree-must-be-updated-with-every-replayed-commit mindset, which I
> > find highly problematic[2], even if that's "what we always used to
> > do".
> >
> > [2] https://lore.kernel.org/git/20231124111044.3426007-1-christian.couder@gmail.com/
>
> I'm not hip with what most pre-commit hooks do, but I'll point out that
> a hook like pre-commit assuming there is a worktree is the fault of the
> hook implementation, not of the infrastructure that invokes the hook. I
> imagine most folks on this list are aware that a worktree is not needed
> to create a commit and update a branch to point at it.

Well, since git-commit requires a worktree (cmd_commit in cmd_struct
in git.c has NEED_WORK_TREE in its options), and "pre-commit" is
likely to be inferred to be a "git-commit" thing (in fact, my reading
of the current text of the pre-commit hook in the githooks(1) manpage
seems to suggest that this hook is tied exclusively to git-commit),
assuming a working tree for "pre-commit" doesn't seem like a very far
leap at all.  In fact, I can't see why that assumption is broken given
our current documentation.

Perhaps we'd like to change the documentation and try to avoid such an
assumption...but even in cases where we've made explicit claims in the
past about various assumptions not being reliable, users have gone
ahead and made those assumptions anyway (Hyrum's Law), and then we
have sometimes been unable to make changes that broke those
assumptions.

So, I'm still concerned, especially if this is applied to every commit.

Granted, hooks are already pretty messed up.  We invoke the
post-commit hook -- IF we're using the merge backend (and don't invoke
it if we're using the apply backend), meaning it's not clear to users
whether the post-commit hook will be invoked or not in a rebase.
(Especially since while the backend can be specified directly, it's
often just selected based on other options that are only implemented
by one of the two backends).  I've put a lot of work into making the
rebase backends more consistent and would like them to be even more
so, but the history here is slightly troubling.  The interactive
backend was "fixed" once, accidentally, and then the accident was
realized and was treated as a bug and reverted
(https://lore.kernel.org/git/67a711754efce038914e8ec15c5dec4a5983566d.1571135132.git.gitgitgadget@gmail.com/),
leaving us again inconsistent.  At least we've documented the
inconsistency and the desire to change it
(https://lore.kernel.org/git/pull.749.v3.git.git.1586044818132.gitgitgadget@gmail.com/).

> FWIW, I would also find such a mindset to be highly problematic :-) I'll
> take a moment here to thank you, Christian, and everyone else in that
> effort for your interest in and work on git-replay; I've been trying to
> watch its activity on-list closely in the hopes that we can adopt it
> into our system once it's ready.

I'm glad others are interested in git-replay.  Sadly, the work on
pushing it forward stopped about a year and a half ago.  All work
since then was limited to pulling out the bits that were ready to be
upstreamed and cleaning those up.

> > Because of that, I would prefer to see this at most be a command line
> > flag. However, we've already got a command line flag that while not
> > identical, is essentially equivalent: "--exec $MY_SCRIPT" (it's not
> > the same because it's a post-commit check, but you get notification of
> > any problematic commits, and an immediate stop of the rebase for you
> > to fix up the problematic commit; fixing up the commit shouldn't be
> > problematic since you are, after all, already rebasing).
>
> Indeed, and an
>
>     --exec 'git hook run pre-commit || git reset --soft HEAD~'
>
> would probably get you farther. I can certainly see an argument for
> this, but from the perspective of designing a system for other
> developers to use, such a rebase would have to be triggered
> automatically (perhaps on pre-push).

Well, there is a pre-push hook...  ;-)

But yeah, it does defer the discovery of the issue for the developer,
which is kind of counter-productive.

> > I see Phillip already responded and suggested not running the
> > pre-commit hook with every commit, but only upon the first commit
> > after a "git rebase --continue".  That seems far more reasonable to me
> > than running on every commit...though even that worries me in regards
> > to what assumptions that entails about what is present in the working
> > tree.
>
> It's worth noting the context here is to prevent developers from
> committing conflict markers, so this would actually be exactly
> sufficient.

Ah, that makes sense what you want to do.  Totally fair.

> Invoking pre-commit at this time would also be consistent with the
> behaviors of prepare-commit-msg, commit-msg, and post-commit -- at least
> when I reword a commit during a rebase.
>
> However, post-commit is executed after each picked commit during a
> rebase

Only if using the merge backend, and even in that case I personally
don't think it should be executed; see the "Hooks" subsection of the
"Behavioral Differences" section of the git-rebase manpage.

> , so pre-commit there would also be consistent.

Or maybe consistently inconsistent (i.e. being consistent with the
inconsistency that exists with the post-commit hook between backends),
and diverging even further from the aspirational goal in the "Hooks"
documentation in the git-rebase manpage.

> > (For example, what about folks with large repositories, so large that
> > a branch switch or full checkout is extremely costly, and which could
> > benefit from resolving conflicts in a separate sparse-checkout
> > worktree, potentially much more sparse than their main checkout?
>
> As it happens, a single checkout of our source runs upwards of 2GB, so
> I'm exactly in the population you're describing :-)  The main reason
> we're moving to Git from SVN is that an SVN checkout can take upwards of
> an hour for us today -- even with some real shenanigans to make them go
> faster. On the Git side, we've also looked into (though I don't recall
> if we had much success with) narrowing the sparsity patterns to just the
> conflicts for conflict resolution workflows -- particularly when moving
> feature code between separate trunks. So I guess I'm also glad we
> weren't too far off in left field on that one! (As I recall, one of the
> main challenges we faced there was ensuring there was enough stuff
> 'still around' so that both binary and project references could resolve
> and folks could use that information to help resolve conflicts.
> Hopefully git-replay can be smart enough to allow some customization on
> that front. We found some success with feeding the list of conflicted
> files into some arbitrary logic that spat out the sparsity pattern to
> use.)

An hour?  For a mere 2 GB?  I mean, I know 2 GB isn't exactly tiny and
it's a pain to deal with...but an hour?  Do you have some directories
with an extraordinary number of files directly within them (i.e. not
multiple subdirectories deep, but a directory immediately containing a
huge number of files)?  Or some kind of network filesystem?  Or some
really slow hooks?

Anyway, I'm glad others are concerned with slow checkouts and branch
switches too.  And yeah, the idea was just that we make an initial
sparse checkout limited to the conflicted files, but users can use the
sparse-checkout command to widen as needed.  The bigger piece was
making sure the git code didn't defeat this by unnecessarily expanding
the index (currently with a sparse index, any conflicted files causes
the sparse-index to be expanded to a completely full index).

> > And what if people like that really fast rebase resolution (namely,
> > done in a separate very sparse checkout which also has the advantage
> > of not polluting your current working tree) so much that they use it
> > on smaller repositories as well? Can I not even experiment with this
> > idea because of the historical per-commit-at-least-as-full-as-main
> > -worktree-checkout assumptions we've baked into rebase?)
>
> I'd be interested in reading more about this baked-in assumption. Are
> these mostly laid out in replay-design-notes.txt[3]?

merge-recursive, our default merge backend for about 15 years (and
reused in rebase, cherry-pick, etc. too) only worked in a worktree and
muddied it as it went.  The other merge algorithms also assumed a
worktree was present.  When I wrote merge-ort and removed the
requirement for having a worktree, I naturally wanted to fix rebase
(and cherry-pick) to stop updating the working tree with every single
commit being rebased; it was such a needless waste of resources.  But
that assumption was all over the code and hard to remove.  And, I ran
into several items where it was unclear if "fixing" the code would be
deemed a backward compatibility break.  Hooks were one of the issues.

It's been quite a while since I've looked at it, but yes, I suspect
some of the issues are documented (explicitly or implicitly) in the
replay-design-notes.txt file, and you may see some as well in the
"Behavioral differences" section of the git-rebase manpage.

But there may well be additional parts that aren't documented, since I
never went through the full effort to de-worktree-ify rebase, and only
got git-replay doing some basic things.  So, at least part of the
issue is what other unknown assumptions are in the code which people
may have knowingly or unknowingly depended upon.

> > While at it, I should also mention that I'm not a fan of the broken
> > pre-rebase hook; its design ties us to doing a single branch at a
> > time.  Maybe that hook is not quite as bad, though, since we already
> > broke that hook and no one seemed to care (in particular, when
> > --update-refs was implemented).  But if no one seems to care about
> > broken hooks, I think the proper answer is to either get rid of the
> > hook or fix it.
>
> If I were to guess, this likely stems either from an inexact definition
> of the hook in documentation (ultimately resulting in incomplete tests)
> or folks incorrectly assuming what each hook should do based purely on
> its name.

No, neither is true.  The definition is very exact; the problem is
that the definition is excessively rigid.  It says, "The second
parameter is *the* branch being rebased" (emphasis added).  There is
no facility in the hook for rebasing more than one branch or
reference, and if we attempt to pass additional parameters, we're
potentially breaking all the existing pre-rebase hooks (which won't be
prepared to handle the extra parameters and thus may fail to reject
the rebase for those other branches that are being rebased, and thus
defeat the point of the hook).  Therefore, allowing rebase to operate
on multiple branches is a backward compatibility break...though one
that we overlooked/ignored when we added --update-refs.

And I want something more general than rebase's --update-refs; I want
the ability to replay multiple branches that aren't entirely contained
within each other, and perhaps only have a little common history (or
maybe even none) since the base point(s) we're replaying from.

> Which leads to an interesting point: pre-commit specifically states that
> it is invoked by git-commit -- not that it's invoked whenever a commit
> is created. So perhaps the correct thing to do here (if a hook is in
> fact needed) would be to define a new hook -- but I worry about doing
> that in the current state where there doesn't *seem* to be very rigid
> coordination of when client hooks are invoked in terms of plumbing
> rather than porcelain.

Yeah, as you're discovering, hooks are a mess.  Made all the more
messy by the fact that higher level commands traditionally started as
scripts that invoked other lower-level commands (git-rebase.sh
literally called git-cherry-pick and git-commit, or git-format-patch
and git-am, or git-merge-recursive and git-commit, or ...), and now
that we've discovered inconsistencies between backends, and
performance problems, and whatnot, it is not clear what we should or
even can do.  Also, even when we rewrote the scripts into C code,
we've often done so by reimplementing shell in C -- we just fork
various subprocesses repeatedly (yes, really), and then maybe come
along later and clean it up a little.

> > Anyway, as I mentioned, I'm quite opinionated here.  To the point that
> > I deemed git-rebase in its current form to possibly be unfixable
> > (after first putting a lot of work into improving it over the past
> > years) and eventually introduced a new "git-replay" command which I
> > hope to make into a competent replacement for it.  Given that I do
> > have another command to do my experiments, others on the list may
> > think it's fine to send rebase further down this route, but I was
> > hoping to avoid further drift so that there might be some way of
> > re-implementing rebase on top of my "git-replay" ideas/design.
>
> I appreciate your perspective; you've certainly thought a lot about this
> space -- and I definitely share your goal of consolidating
> implementations for obvious reasons.
>
> So I suppose that leaves me with four possible paths forward:
>
> 1. Pursue invoking pre-commit before each commit in `git rebase` (likely
>    generic in the sequencer) to be consistent with post-commit.
>
>    It sounds like this isn't a popular option, but I'm curious to folks'
>    thoughts on the noted behavior of post-commit here.

I've already said quite a bit about this above, but as a quick
reminder: (1) I'm not sure it makes sense to make something consistent
with something that is inconsistent, and (2) we have a documented
desire to stop calling post-commit from rebase even in the situations
where it is currently called.

> 2. Pursue invoking pre-commit on `git rebase --continue` (likely on any
>    --continue in the sequencer). This has the benefit of using existing
>    configuration on developer machines to purportedly 'do the right
>    thing' when its likely humans are touching code during conflict
>    resolution. It's worth noting this isn't the only reason you might
>    --continue, though, since the naive interpretation of this approach
>    completely ignores sequencer commands like 'break', though it could
>    probably just do what commit-msg does.

In the case of `break`, though, the commit was already transplanted,
so a `--continue` doesn't need to create one before moving on.  Thus,
we could say that only when rebase --continue has an unfinished commit
that needs to be created (i.e. we have a conflicted commit to resolve)
that the `pre-commit` hook gets called.

The idea of having an 'unfinished' commit state is also useful because
interactive rebases make it easy to forget if you're in the middle of
a break/edit or trying to resolve a conflict elsewhere.  That seems to
happen to me far too many times, and I really want `git commit
--amend` to throw an error (unless overridden with e.g. --force) when
you have an unfinished commit that you are working on by resolving
conflicts.  In fact, this state is probably already recorded
somewhere, we just need to make better use of it...but I'm going off
on a tangent now.

> 3. Define and implement a new hook that is called whenever a new commit
>    is about to be (or has been?) written. Such a hook could be
>    specifically designed to discourage assuming there's a working copy,
>    though we're kidding nobody by thinking it won't be used downstream
>    with that assumption. At least we could be explicit about
>    expectations, though.
>
>    This is *probably* a lot more design work than this little paragraph
>    lets on, but I've not personally watched the introduction of a new
>    hook so I don't have context for what to expect.

Yeah, and at least as worded, this suggestion would run afoul of the
logic for the distinction between pre-commit and pre-merge-commit.
And it seems to have all the same problems as #1, unless we restrict
it to just when running "rebase --continue", in which case it's not
clear what it buys us over #2.

> 4. Trigger a rebase --exec in our pre-push. This is certainly the least
>    work in git.git (i.e., no work at all), but it comes with the
>    distinct disadvantage of playing whiplash with the developer's focus.
>    During conflict resolution, they're thinking about conflicts. When
>    you're ready to push, its likely that you're no longer thinking about
>    conflicts.

Yeah, simple to implement but much less helpful.

> Does the behavior of post-commit here change any minds?

No...but although I'm slightly worried about solution #2 (as opposed
to very worried with #1), your usecase for pre-commit seems quite
reasonable.  So, I can get behind #2, though it'd be nice if we could
update the description of the `pre-commit` hook (so that it's clear
that it's not just "git-commit"), and while we're at it perhaps try to
be more explicit about what can and can't be assumed by the hook.

^ permalink raw reply

* Re: Does extending `--empty` to git-cherry-pick make sense?
From: Brian Lyles @ 2024-01-05  3:20 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Taylor Blau, git
In-Reply-To: <CABPp-BGJfvBhO_zEX8nLoa8WNsjmwvtZ2qOjmYm9iPoZg4SwPw@mail.gmail.com>

On Thu, Jan 4, 2024 at 8:28 PM Elijah Newren <newren@gmail.com> wrote:
>
> I was indeed focused on simply getting the multiple rebase backends to
> have consistent behavior (we had like 4-5 backends at the time,
> right?) and just didn't consider cherry-pick at the time.  Now that
> those are more consistent (though there's still work to be done on
> that end too), cherry-pick and rebase really ought to share a lot more
> between each other, from command line flags, to temporary control
> files written, to more shared code paths.  Adding an --empty flag to
> cherry-pick as a step along the way makes sense.

I appreciate the insight from both of you.

It sounds like I'll be diving into some C code for the first time in
over a decade, then! I'll plan to take a crack at this soon.

-Brian Lyles

^ permalink raw reply

* Re: [PATCH v2 2/2] fetch: add cli option --default-only
From: Eric Sunshine @ 2024-01-05  2:43 UTC (permalink / raw)
  To: Tamino Bauknecht; +Cc: git
In-Reply-To: <20240104222259.15659-2-dev@tb6.eu>

On Thu, Jan 4, 2024 at 5:23 PM Tamino Bauknecht <dev@tb6.eu> wrote:
> This option can be used to restore the default behavior of "git fetch"
> if the "fetch.all" config option is enabled.
> The flag cannot be used in combination with "--all" or explicit
> remote(s).
>
> Signed-off-by: Tamino Bauknecht <dev@tb6.eu>
> ---
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> @@ -2344,15 +2347,23 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
> +       if (all && default_only) {
> +               die(_("fetch --all does not work with fetch --default-only"));

To simplify the life of people who translate Git messages into other
languages, these days we have standard wording for this type of
message, and we extract the literal option from the message itself.
So, this should be:

    die(_("options '%s' and '%s' cannot be used together"),
        "--all", "--default-only");

> diff --git a/t/t5514-fetch-multiple.sh b/t/t5514-fetch-multiple.sh
> @@ -304,4 +304,45 @@ test_expect_success 'git config fetch.all false (fetch only default remote)' '
> +for fetch_all in true false
> +do
> +       test_expect_success "git fetch --default-only (fetch only default remote with fetch.all = $fetch_all)" '
> +               test_dir="test_default_only_$fetch_all" &&
> +               setup_test_clone "$test_dir" &&
> +               (
> +                       cd "$test_dir" &&
> +                       git config fetch.all $fetch_all &&
> +                       git fetch --default-only &&
> +                       cat >expect <<-\EOF &&
> +                         origin/HEAD -> origin/main
> +                         origin/main
> +                         origin/side
> +                       EOF
> +                       git branch -r >actual &&
> +                       test_cmp expect actual
> +               )
> +       '
> +done

Do you also want to test the case when "fetch.all" isn't set?

> +test_expect_success 'git fetch --all does not work with --default-only' '
> +       (
> +               cd test &&
> +               test_must_fail git fetch --all --default-only
> +       )
> +'

Minor point: This sort of test can be written more succinctly without
the subshell:

    test_expect_success 'git fetch --all does not work with --default-only' '
        test_must_fail git -C test fetch --all --default-only
    '

^ permalink raw reply

* Re: Does extending `--empty` to git-cherry-pick make sense?
From: Elijah Newren @ 2024-01-05  2:28 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Brian Lyles, git
In-Reply-To: <ZZcIG+mNXhZ0rHw3@nand.local>

On Thu, Jan 4, 2024 at 11:33 AM Taylor Blau <me@ttaylorr.com> wrote:
>
> [+cc Elijah]
>
> On Thu, Jan 04, 2024 at 12:57:18AM -0600, Brian Lyles wrote:
> > Is there any real barrier to exposing that option to git-cherry-pick as
> > well? Was this an oversight, or intentionally left out? The
> > corresponding commit message doesn't seem to indicate any specific
> > reason for limiting it to git-rebase.
>
> I am not nearly as familiar with this code as Elijah is, but this
> certainly appears possible by setting the `drop_redundant_commits` and
> `keep_redundant_commits` flags in the replay_opts struct.
>
> I don't see any fundamental reason why cherry-pick shouldn't have the
> same functionality.

I was indeed focused on simply getting the multiple rebase backends to
have consistent behavior (we had like 4-5 backends at the time,
right?) and just didn't consider cherry-pick at the time.  Now that
those are more consistent (though there's still work to be done on
that end too), cherry-pick and rebase really ought to share a lot more
between each other, from command line flags, to temporary control
files written, to more shared code paths.  Adding an --empty flag to
cherry-pick as a step along the way makes sense.

Note that git-am also gained a similar flag in the meantime, but
changed the names slightly: --empty=(stop,drop,keep).  I think 'stop'
is a better name than 'ask', and we really should make rebase suggest
'stop' instead (but keep 'ask' as a synonym for 'stop', for backwards
compatibility).  Also, I kind of want to replace 'keep' with 'roll' in
the --empty option for both git-am and git-rebase, while keeping
'keep' as a synonym for 'roll'.  But I'm not sure if others on the
list would go along with it...

^ permalink raw reply

* Re: [GSOC][RFC] Heed core.bare from template config file when no command line override given, as a microproject.
From: Elijah Newren @ 2024-01-05  2:11 UTC (permalink / raw)
  To: Christian Couder; +Cc: Ghanshyam Thakkar, git, gitster, johannes.schindelin
In-Reply-To: <CAP8UFD1wMJMY6G4SaPTPwq6b9HbeXG1kB97-RRrL-KGN1wE0rg@mail.gmail.com>

On Thu, Jan 4, 2024 at 2:24 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Tue, Jan 2, 2024 at 11:17 PM Ghanshyam Thakkar
> <shyamthakkar001@gmail.com> wrote:
> >
> > Hello,
> >
> > I'm currently an undergrad beginning my journey of contributing to the
> > Git project. I am seeking feedback on doing "Heed core.bare from
> > template config file when no command line override given" described
> > here
> > https://lore.kernel.org/git/5b39c530f2a0edf3b1492fa13a1132d622a0678e.1684218850.git.gitgitgadget@gmail.com/
> > by Elijah Newren, as a microproject. I would like to know from the
> > community, if the complexity and scope of the project is appropriate
> > for a microproject.
>
> Thanks for your interest in the next GSoC!
>
> My opinion is that it's too complex for a micro-project. Now maybe if
> Elijah or others are willing to help you on it, perhaps it will work
> out. I think it's safer to look at simpler micro-projects though.

An important part of solving this problem, if you were to do so, would
be adding several testcases (including some showing how it currently
fails).  Simply adding some or all of those testcases would be a good
micro-project.  (If you take this on, it'd probably be worthwhile to
include a reference to any such added tests in the TODO comment here
so that future folks noticing the TODO are aware of them).  Then, if
adding those testcases goes well and you feel ambitious, you can try
to tackle the underlying problem too.

> > e.g. in builtin/init-db.c :
> >
> > static int template_bare_config(const char *var, const char *value,
> >                      const struct config_context *ctx, void *cb)
> > {
> >        if(!strcmp(var,"core.bare")) {
>
> We like to have a space character between "if" and "(" as well as after a ","
>
> >              is_bare_repository_cfg = git_config_bool(var, value);
> >        }
> >        return 0;
> > }
> >
> > int cmd_init_db(int argc, const char **argv, const char *prefix)
> > {
> > ...
> > ...
> >        if(is_bare_repository_cfg==-1) {
>
> We like to have a space character both before and after "==" as well
> as between "if" and "(".
>
> >              if(!template_dir)
> >                    git_config_get_pathname("init.templateDir",
> >                                            &template_dir);
> >
> >              if(template_dir) {
> >                    const char* template_config_path
> >                                 = xstrfmt("%s/config",
> >                    struct stat st;
> >
> >                    if(!stat(template_config_path, &st) &&
> >                      !S_ISDIR(st.st_mode)) {
> >                          git_config_from_file(template_bare_cfg,
> >                                         template_config_path, NULL);
> >                    }
> >              }
> > ...
> > ...
> >        return init_db(git_dir, real_git_dir, template_dir, hash_algo,
> >                       initial_branch, init_shared_repository, flags);
> > }
> >
> > I also wanted to know if the global config files should have an effect
> > in deciding if the repo is bare or not.
> >
> > Curious to know your thoughts on, if this is the right approach or
> > does it require doing refactoring to bring all the logic in setup.c.
> > Based on your feedback, I can quickly send a patch.
>
> I don't know this area of the code well, so I don't think I can help
> you much on this.

If you look back at the mailing list discussion on the series that
introduced this TODO comment you are trying to address, you'll note
that both Glen and I dug into the code and attempted to explain it to
each other, and we both got it wrong on our first try.   If I knew the
correct solution without digging, I probably would have just
implemented it and sent it in as another patch series.  My TODO was
not meant to be a definitive guide about where to make the fixes
(because I don't know yet), but just a helpful guide that the first
spot you'd expect cannot be the correct location (I already tried a
few things there) and that you need to dig further.  Anyway, I'm sure
the correct place to fix can be figured out with a bit of work, but in
this case, figuring out where the changes need to be made is probably
the majority of the effort.

You may well need to just pick an area, start modifying, go through it
in a debugger and with your testcases, etc., and learn whether
modifications in that area even can solve the problem.  You may have
to discard your first or even second attempts, but take what you
learned to guide you on future attempts.

And if you do get it all working, this is a case where it'd likely be
important to comment in the commit message not just why you are making
changes, but why you believe the area you modified is the correct one
(i.e. to mention why the more obvious places to modify don't actually
solve the problem).  And then be prepared for folks on the mailing
list to use the information you provided in the commit message to dive
in and point out some other way to fix the problem that is even better
but requires you to redo it again.  (And I'm not just saying that
because you're new; I would not be surprised if the same happened to
me.  Others are more familiar with various parts of the general setup
and config code than me, and sometimes additional expertise coupled
with a solution of some sort can make it easier to identify an even
better solution.)

^ permalink raw reply

* [PATCH v3] rebase: clarify --reschedule-failed-exec default
From: Illia Bobyr @ 2024-01-05  1:14 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Illia Bobyr
In-Reply-To: <ZZcE/Kw24YKlqSOT@nand.local>

Documentation should mention the default behavior.

It is better to explain the persistent nature of the
--reschedule-failed-exec flag from the user standpoint, rather than from
the implementation standpoint.

Signed-off-by: Illia Bobyr <illia.bobyr@gmail.com>
---
 Documentation/git-rebase.txt | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git Documentation/git-rebase.txt Documentation/git-rebase.txt
index 1dd65..b6282 100644
--- Documentation/git-rebase.txt
+++ Documentation/git-rebase.txt
@@ -626,13 +626,16 @@ See also INCOMPATIBLE OPTIONS below.
 	Automatically reschedule `exec` commands that failed. This only makes
 	sense in interactive mode (or when an `--exec` option was provided).
 +
-Even though this option applies once a rebase is started, it's set for
-the whole rebase at the start based on either the
-`rebase.rescheduleFailedExec` configuration (see linkgit:git-config[1]
-or "CONFIGURATION" below) or whether this option is
-provided. Otherwise an explicit `--no-reschedule-failed-exec` at the
-start would be overridden by the presence of
-`rebase.rescheduleFailedExec=true` configuration.
+This option applies once a rebase is started. It is preserved for the whole
+rebase based on, in order, the command line option provided to the initial `git
+rebase`, the `rebase.rescheduleFailedExec` configuration (see
+linkgit:git-config[1] or "CONFIGURATION" below), or it defaults to false.
++
+Recording this option for the whole rebase is a convenience feature. Otherwise
+an explicit `--no-reschedule-failed-exec` at the start would be overridden by
+the presence of a `rebase.rescheduleFailedExec=true` configuration when `git
+rebase --continue` is invoked. Currently, you cannot pass
+`--[no-]reschedule-failed-exec` to `git rebase --continue`.
 
 --update-refs::
 --no-update-refs::
-- 
2.40.1


^ permalink raw reply

* [PATCH v3] rebase: clarify --reschedule-failed-exec default
From: Illia Bobyr @ 2024-01-05  1:14 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau
In-Reply-To: <ZZcE/Kw24YKlqSOT@nand.local>

Sorry, I did not actually include the change in v2.
Still learning how to use git send-email.

On Thu, Jan 04, 2024 at 11:20:28AM -0800, Taylor Blau wrote:
> [...]
>
> > +Recording this option for the whole rebase is a convenience feature. Otherwise
> > +an explicit `--no-reschedule-failed-exec` at the start would be overridden by
> > +the presence of a `rebase.rescheduleFailedExec=true` configuration when `git
> > +rebase --continue` is invoked. Currently, you can not, pass
> > +`--[no-]reschedule-failed-exec` to `git rebase --continue`.
> 
> The last sentence was a bit confusing to me. I assume you meant
> 
>     Currently, you cannot pass `--[no-]reschedule-failed-exec` [...]
> 
> without the comma between "pass" and "`--[no]reschedule-failed-exect`",
> and replacing "can not" with "cannot".

Applied.
Thank you for reviewing!

^ permalink raw reply

* Re: [PATCH] push: region_leave trace for negotiate_using_fetch
From: Sam Delmerico @ 2024-01-05  1:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, steadmon
In-Reply-To: <xmqqbka27zu9.fsf@gitster.g>

* I don't exactly remember how I noticed it. I was doing some
debugging around the push negotiation code and either 1) saw this
region get entered twice in the trace output, or 2) I was just reading
the code around here and saw two enters.

* Perhaps there could be a check before the last git process ends that
checks that all opened trace regions have been closed? I'm not sure
how much work this would involve. It's probably also not a very
proactive way to catch these bugs since it would only get triggered
when a *user* hits a code path with a trace region that never exits.

* There could also be a test that checks that every region_enter trace
log has a corresponding region_leave. But I'm not sure how to ensure
that every code path is checked.

Overall, I'm not sure how much benefit there is from checking for
this. I'm not sure that it would have a large impact if it were to
happen again. For example, I think that it could be noticed relatively
quickly by a person/system looking at metrics like I was (e.g. if the
time spent in a region is infinite or zero).

FWIW I didn't see any other examples of this when going through logs.

Sam
<br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed,
Jan 3, 2024 at 3:37 PM Junio C Hamano &lt;gitster@pobox.com&gt;
wrote:<br></div><blockquote class="gmail_quote" style="margin: 0px 0px
0px 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left:
1ex;">Sam Delmerico &lt;<a href="mailto:delmerico@google.com"
target="_blank">delmerico@google.com</a>&gt; writes:<br>
<br>
&gt; There were two region_enter events for negotiate_using_fetch instead of<br>
&gt; one enter and one leave. This commit replaces the second region_enter<br>
&gt; event with a region_leave.<br>
&gt;<br>
&gt; Signed-off-by: Sam Delmerico &lt;<a
href="mailto:delmerico@google.com"
target="_blank">delmerico@google.com</a>&gt;<br>
&gt; ---<br>
&gt;&nbsp; fetch-pack.c | 2 +-<br>
&gt;&nbsp; 1 file changed, 1 insertion(+), 1 deletion(-)<br>
<br>
Looks right, after skimming a29263cf (fetch-pack: add tracing for<br>
negotiation rounds, 2022-08-02).&nbsp; Two questions and a half.<br>
<br>
&nbsp;* How did you find it?&nbsp; Code inspection?&nbsp; While
writing a script to<br>
&nbsp; &nbsp;parse the output from around this area, your script noticed the<br>
&nbsp; &nbsp;ever-increasing nesting level?&nbsp; Something else?<br>
<br>
&nbsp;* Would it be feasible to write some tests or tools that find<br>
&nbsp; &nbsp;similar problems (semi-)automatically?<br>
<br>
&nbsp;* Is the breakage (before this patch) something easily demonstrated<br>
&nbsp; &nbsp;in a new test in t/ somewhere?&nbsp; And if so, would it
be worth<br>
&nbsp; &nbsp;doing?<br>
<br>
Thanks.&nbsp; Will queue.<br>
<br>
<br>
&gt;<br>
&gt; diff --git a/fetch-pack.c b/fetch-pack.c<br>
&gt; index 31a72d43de..dba6d79944 100644<br>
&gt; --- a/fetch-pack.c<br>
&gt; +++ b/fetch-pack.c<br>
&gt; @@ -2218,7 +2218,7 @@ void negotiate_using_fetch(const struct
oid_array *negotiation_tips,<br>
&gt;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; the_repository, "%d",<br>
&gt;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; negotiation_round);<br>
&gt;&nbsp; &nbsp; &nbsp; &nbsp;}<br>
&gt; -&nbsp; &nbsp; &nbsp;trace2_region_enter("fetch-pa<wbr>ck",
"negotiate_using_fetch", the_repository);<br>
&gt; +&nbsp; &nbsp; &nbsp;trace2_region_leave("fetch-pa<wbr>ck",
"negotiate_using_fetch", the_repository);<br>
&gt;&nbsp; &nbsp; &nbsp;
&nbsp;trace2_data_intmax("<wbr>negotiate_using_fetch",
the_repository,<br>
&gt;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; "total_rounds", negotiation_round);<br>
&gt;&nbsp; &nbsp; &nbsp; &nbsp;clear_common_flag(acked_commi<wbr>ts);<br>
&gt;<br>
&gt; base-commit: a26002b62827b89a19b1084bd75d93<wbr>71d565d03c<br>
</blockquote></div>

^ permalink raw reply

* Re: [PATCH v2 1/2] fetch: add new config option fetch.all
From: Eric Sunshine @ 2024-01-05  1:02 UTC (permalink / raw)
  To: Tamino Bauknecht; +Cc: git
In-Reply-To: <20240104222259.15659-1-dev@tb6.eu>

On Thu, Jan 4, 2024 at 5:23 PM Tamino Bauknecht <dev@tb6.eu> wrote:
> This commit introduces the new boolean configuration option fetch.all
> which allows to fetch all available remotes by default. The config
> option can be overridden by explicitly specifying a remote.
> The behavior for --all is unchanged and calling git-fetch with --all and
> a remote will still result in an error.
>
> The option was also added to the config documentation and new tests
> cover the expected behavior.
>
> Signed-off-by: Tamino Bauknecht <dev@tb6.eu>
> ---
> I fixed the formatting in the test cases and replaced the "cp" with an
> explicit "expect".

Thanks, looks better. More below...

> diff --git a/t/t5514-fetch-multiple.sh b/t/t5514-fetch-multiple.sh
> @@ -24,6 +24,15 @@ setup_repository () {
> +setup_test_clone () {
> +       test_dir="$1"
> +       git clone one "$test_dir"
> +       for r in one two three
> +       do
> +               git -C "$test_dir" remote add "$r" "../$r" || return 1
> +       done
> +}

I wasn't paying attention to this function in the previous round, but
it ought to be made more robust. If the `clone` operation fails, we
want to know about it right away rather than finding out about it "by
accident" when the subsequent `remote add` operation fails. In other
words, you should maintain an unbroken &&-chain, not just in test
bodies, but also in functions which are called from within test
bodies. So, this should really be:

    setup_test_clone () {
        test_dir="$1" &&
        git clone one "$test_dir" &&
        for r in one two three
        do
            git -C "$test_dir" remote add "$r" "../$r" || return 1
        done
    }

> @@ -209,4 +218,90 @@ test_expect_success 'git fetch --multiple --jobs=0 picks a default' '
> +for fetch_all in true false
> +do
> +       test_expect_success "git fetch --all (works with fetch.all = $fetch_all)" '
> +               test_dir="test_fetch_all_$fetch_all" &&
> +               setup_test_clone "$test_dir" &&
> +               (
> +                       cd "$test_dir" &&
> +                       git config fetch.all $fetch_all &&
> +                       git fetch --all &&
> +                       cat >expect <<-\EOF &&
> +                         ...
> +                       EOF
> +                       git branch -r >actual &&
> +                       test_cmp expect actual
> +               )
> +       '
> +done
> +
> +test_expect_success 'git fetch (fetch all remotes with fetch.all = true)' '
> +       setup_test_clone test9 &&
> +       (
> +               cd test9 &&
> +               git config fetch.all true &&
> +               git fetch --all &&
> +               git branch -r >actual &&
> +               cat >expect <<-\EOF &&
> +                 ...
> +               EOF
> +               test_cmp expect actual
> +       )
> +'

I'm probably overlooking something, but isn't this testing the exact
same thing as was tested in the "true" case of the loop just above?

Or maybe there is a bug in this test and you meant `git fetch` rather
than `git fetch --all`?

^ permalink raw reply

* [PATCH v2] rebase: clarify --reschedule-failed-exec default
From: Illia Bobyr @ 2024-01-05  0:42 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau, Illia Bobyr
In-Reply-To: <20240105004246.1317445-1-illia.bobyr@gmail.com>

Documentation should mention the default behavior.

It is better to explain the persistent nature of the
--reschedule-failed-exec flag from the user standpoint, rather than from
the implementation standpoint.

Signed-off-by: Illia Bobyr <illia.bobyr@gmail.com>
---
 Documentation/git-rebase.txt | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git Documentation/git-rebase.txt Documentation/git-rebase.txt
index 1dd65..45d3c 100644
--- Documentation/git-rebase.txt
+++ Documentation/git-rebase.txt
@@ -626,13 +626,16 @@ See also INCOMPATIBLE OPTIONS below.
 	Automatically reschedule `exec` commands that failed. This only makes
 	sense in interactive mode (or when an `--exec` option was provided).
 +
-Even though this option applies once a rebase is started, it's set for
-the whole rebase at the start based on either the
-`rebase.rescheduleFailedExec` configuration (see linkgit:git-config[1]
-or "CONFIGURATION" below) or whether this option is
-provided. Otherwise an explicit `--no-reschedule-failed-exec` at the
-start would be overridden by the presence of
-`rebase.rescheduleFailedExec=true` configuration.
+This option applies once a rebase is started. It is preserved for the whole
+rebase based on, in order, the command line option provided to the initial `git
+rebase`, the `rebase.rescheduleFailedExec` configuration (see
+linkgit:git-config[1] or "CONFIGURATION" below), or it defaults to false.
++
+Recording this option for the whole rebase is a convenience feature. Otherwise
+an explicit `--no-reschedule-failed-exec` at the start would be overridden by
+the presence of a `rebase.rescheduleFailedExec=true` configuration when `git
+rebase --continue` is invoked. Currently, you can not, pass
+`--[no-]reschedule-failed-exec` to `git rebase --continue`.
 
 --update-refs::
 --no-update-refs::
-- 
2.40.1


^ permalink raw reply

* [PATCH v2] rebase: clarify --reschedule-failed-exec default
From: Illia Bobyr @ 2024-01-05  0:42 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau
In-Reply-To: <ZZcE/Kw24YKlqSOT@nand.local>

Applied.
Thank you for reviewing!



^ permalink raw reply

* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Junio C Hamano @ 2024-01-04 23:59 UTC (permalink / raw)
  To: Karthik Nayak; +Cc: Patrick Steinhardt, Taylor Blau, git, christian.couder
In-Reply-To: <CAOLa=ZStP6F1njTeoQZwN58k+_0r9LT7z-wg2819FWZPq90wQg@mail.gmail.com>

Karthik Nayak <karthik.188@gmail.com> writes:

> Thanks all for the discussion, I'll try to summarize the path forward
> as per my understanding.

It has already been clear for the past 5 years or so since 3a3b9d8c
(refs: new ref types to make per-worktree refs visible to all
worktrees, 2018-10-21) that we need to treat "worktrees/foo/HEAD",
"worktrees/bar/refs/bisect/bad", etc. as something end-users can
access via the normal ref mechansim (by a resolve_ref() call that is
eventually made from get_sha1() when these are passed to say "git
log"); I just did not remember that one but that does not mean we
can suddenly change the rules.

So you'd probably need to tweak the end of your bullet point #3, but
other than that it is a great summary.

Thanks.

> 1. We want to introduce a way to output all refs. This includes refs
> under "refs/", pseudo refs, HEAD, and any other ref like objects under
> $GIT_DIR. The reasoning being that users are allowed currently to create
> refs without any directory restrictions. So with the upcoming reftable
> backend, it becomes important to provide a utility to print all the refs
> held in the reftable. Ideally we want to restrict such ref's from being
> created but for the time being, since thats allowed, we should also
> provide the utility to print such refs.
>
> 2. In the files backend, this would involve iterating through the
> $GIT_DIR and finding all ref-like objects and printing them.
>
> 3. To expose this to the user, we could do something like
>
>    $ git for-each-ref ""
>
> Which is a natural extension of the current syntax, where the empty
> string would imply that we do not filter to the "refs/" directory.
> It is still not clear whether we should support "worktrees".
>
> Any corrections here?

^ permalink raw reply

* Re: Feasibility of folding `unit-tests` into `make test`, was Re: [PATCH] ci: avoid running the test suite _twice_
From: Josh Steadmon @ 2024-01-04 23:54 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff King, Johannes Schindelin via GitGitGadget, Phillip Wood,
	git
In-Reply-To: <850ea42c-f103-68d5-896b-9120e2628686@gmx.de>

On 2023.11.16 09:42, Johannes Schindelin wrote:
> Hi Josh,
> 
> On Wed, 15 Nov 2023, Josh Steadmon wrote:

[snip]

> > If I was forced to pick a way to get everything under one process, I'd
> > lean towards autogenerating individual shell script wrappers for each
> > unit test. But I'm open to discussion, especially if people have other
> > approaches I haven't thought of.
> 
> One alternative would be to avoid running the unit tests via `prove` in
> the first place.
> 
> For example, we could use the helper from be5d88e11280 (test-tool
> run-command: learn to run (parts of) the testsuite, 2019-10-04) [*1*]. It
> would probably need a few improvements, but certainly no wizardry nor
> witchcraft would be required. It would also help on Windows, where running
> a simple test helper written in C is vastly faster than running a complex
> Perl script (which `prove` is).
> 
> Ciao,
> Johannes
> 
> Footnote *1*: I had always wanted to improve that test helper to the point
> where it could replace our use of `prove`, at least on Windows. It seems,
> however, that as of 4c2c38e800f3 (ci: modification of main.yml to use
> cmake for vs-build job, 2020-06-26) we do not use the helper at all
> anymore. Hopefully it can still be useful. 🤞

Sorry for the silence on this topic; the holidays plus some family
illnesses kept me away from the list for a while. I have a working
implementation of this. I plan on cleaning it up a bit and sending it as
an RFC series either tomorrow or next week. Thank you for the
suggestion!

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox