Git development

Git development
 help / color / mirror / Atom feed

* [PATCH 08/10] t5401: speed up creation of many branches
From: Patrick Steinhardt @ 2023-11-29  7:25 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1701242407.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1610 bytes --]

One of the tests in t5401 creates a bunch of branches by calling
git-branch(1) for every one of them. This is quite inefficient and takes
a comparatively long time even on Unix systems where spawning processes
is comparatively fast. Refactor it to instead use git-update-ref(1),
which leads to an almost 10-fold speedup:

```
Benchmark 1: ./t5401-update-hooks.sh (rev = HEAD)
  Time (mean ± σ):     983.2 ms ±  97.6 ms    [User: 328.8 ms, System: 679.2 ms]
  Range (min … max):   882.9 ms … 1078.0 ms    3 runs

Benchmark 2: ./t5401-update-hooks.sh (rev = HEAD~)
  Time (mean ± σ):      9.312 s ±  0.398 s    [User: 2.766 s, System: 6.617 s]
  Range (min … max):    8.885 s …  9.674 s    3 runs

Summary
  ./t5401-update-hooks.sh (rev = HEAD) ran
    9.47 ± 1.02 times faster than ./t5401-update-hooks.sh (rev = HEAD~)
```

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t5401-update-hooks.sh | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/t/t5401-update-hooks.sh b/t/t5401-update-hooks.sh
index 001b7a17ad..8b8bc47dc0 100755
--- a/t/t5401-update-hooks.sh
+++ b/t/t5401-update-hooks.sh
@@ -133,10 +133,8 @@ test_expect_success 'pre-receive hook that forgets to read its input' '
 	EOF
 	rm -f victim.git/hooks/update victim.git/hooks/post-update &&
 
-	for v in $(test_seq 100 999)
-	do
-		git branch branch_$v main || return
-	done &&
+	printf "create refs/heads/branch_%d main\n" $(test_seq 100 999) >input &&
+	git update-ref --stdin <input &&
 	git push ./victim.git "+refs/heads/*:refs/heads/*"
 '
 
-- 
2.43.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 09/10] t5551: stop writing packed-refs directly
From: Patrick Steinhardt @ 2023-11-29  7:25 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1701242407.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1295 bytes --]

We have multiple tests in t5551 that write thousands of tags. To do so
efficiently we generate the tags by writing the `packed-refs` file
directly, which of course assumes that the reference database is backed
by the files backend.

Refactor the code to instead use a single `git update-ref --stdin`
command to write the tags. While the on-disk end result is not the same
as we now have a bunch of loose refs instead of a single packed-refs
file, the distinction shouldn't really matter for any of the tests that
use this helper.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t5551-http-fetch-smart.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 8a41adf1e1..e069737b80 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -359,7 +359,9 @@ create_tags () {

 	# now assign tags to all the dangling commits we created above
 	tag=$(perl -e "print \"bla\" x 30") &&
-	sed -e "s|^:\([^ ]*\) \(.*\)$|\2 refs/tags/$tag-\1|" <marks >>packed-refs
+	sed -e "s|^:\([^ ]*\) \(.*\)$|create refs/tags/$tag-\1 \2|" <marks >input &&
+	git update-ref --stdin <input &&
+	rm input
 }

 test_expect_success 'create 2,000 tags in the repo' '
-- 
2.43.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 10/10] t6301: write invalid object ID via `test-tool ref-store`
From: Patrick Steinhardt @ 2023-11-29  7:25 UTC (permalink / raw)
  To: git
In-Reply-To: <cover.1701242407.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2605 bytes --]

One of the tests in t6301 verifies that the reference backend correctly
warns about the case where a reference points to a non-existent object.
This is done by writing the object ID into the loose reference directly,
which is quite intimate with how the files backend works.

Refactor the code to instead use `test-tool ref-store` to write the
reference, which is backend-agnostic.

There are two more tests in this file which write loose files directly,
as well. But both of them are indeed quite specific to the loose files
backend and cannot be easily ported to other backends. We thus mark them
as requiring the REFFILES prerequisite.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t6301-for-each-ref-errors.sh | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/t/t6301-for-each-ref-errors.sh b/t/t6301-for-each-ref-errors.sh
index 2667dd13fe..83b8a19d94 100755
--- a/t/t6301-for-each-ref-errors.sh
+++ b/t/t6301-for-each-ref-errors.sh
@@ -15,7 +15,7 @@ test_expect_success setup '
 	git for-each-ref --format="%(objectname) %(refname)" >brief-list
 '
 
-test_expect_success 'Broken refs are reported correctly' '
+test_expect_success REFFILES 'Broken refs are reported correctly' '
 	r=refs/heads/bogus &&
 	: >.git/$r &&
 	test_when_finished "rm -f .git/$r" &&
@@ -25,7 +25,7 @@ test_expect_success 'Broken refs are reported correctly' '
 	test_cmp broken-err err
 '
 
-test_expect_success 'NULL_SHA1 refs are reported correctly' '
+test_expect_success REFFILES 'NULL_SHA1 refs are reported correctly' '
 	r=refs/heads/zeros &&
 	echo $ZEROS >.git/$r &&
 	test_when_finished "rm -f .git/$r" &&
@@ -39,15 +39,14 @@ test_expect_success 'NULL_SHA1 refs are reported correctly' '
 '
 
 test_expect_success 'Missing objects are reported correctly' '
-	r=refs/heads/missing &&
-	echo $MISSING >.git/$r &&
-	test_when_finished "rm -f .git/$r" &&
-	echo "fatal: missing object $MISSING for $r" >missing-err &&
+	test_when_finished "git update-ref -d refs/heads/missing" &&
+	test-tool ref-store main update-ref msg refs/heads/missing "$MISSING" "$ZERO_OID" REF_SKIP_OID_VERIFICATION &&
+	echo "fatal: missing object $MISSING for refs/heads/missing" >missing-err &&
 	test_must_fail git for-each-ref 2>err &&
 	test_cmp missing-err err &&
 	(
 		cat brief-list &&
-		echo "$MISSING $r"
+		echo "$MISSING refs/heads/missing"
 	) | sort -k 2 >missing-brief-expected &&
 	git for-each-ref --format="%(objectname) %(refname)" >brief-out 2>brief-err &&
 	test_cmp missing-brief-expected brief-out &&
-- 
2.43.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 0/4] refs: improve handling of special refs
From: Patrick Steinhardt @ 2023-11-29  8:14 UTC (permalink / raw)
  To: git; +Cc: hanwenn

[-- Attachment #1: Type: text/plain, Size: 2305 bytes --]

Hi,

there are a bunch of "special" refs in Git that sometimes behave like a
normal reference and sometimes they don't. These references are written
to directly via the filesystem without going through the reference
backend, but the expectation is that those references can then be read
by things like git-rev-parse(1).

We do not currently have a single source of truth for what those special
refs are, and we also don't have clear rules for how they should be
written to. This works in the context of the files backend, because it
is able to read back such manually-written loose references just fine.
But once the reftable backend lands this will stop working.

This patch series tries to improve this state by doing two things:

    1. We explicitly mark these references as special by introducing a
       new `is_special_ref()` function. This serves as documentation,
       but will also cause us to explicitly read all of these special
       refs via loose files regardless of the actual backend.

    2. We document a new rule around writing refs. Namely, normal
       references are _always_ written via the reference backend,
       whereas special references are _always_ written directly via the
       filesystem. This rule is not enforced anywhere, but at least it's
       now made more explicit.

The last patch fixes one of the instances where we treat a reference
inconsistently by converting it to a normal reference. We can eventually
migrate more of the special refs to become normal refs as we deem fit,
but I consider this to be out of scope for this patch series.

These patches improve compatibility with the new reftable backend.

Patrick

Patrick Steinhardt (4):
  wt-status: read HEAD and ORIG_HEAD via the refdb
  refs: propagate errno when reading special refs fails
  refs: complete list of special refs
  bisect: consistently write BISECT_EXPECTED_REV via the refdb

 bisect.c                    | 25 +++------------
 builtin/bisect.c            |  8 ++---
 refs.c                      | 64 +++++++++++++++++++++++++++++++++++--
 t/t1403-show-ref.sh         |  9 ++++++
 t/t6030-bisect-porcelain.sh |  2 +-
 wt-status.c                 | 17 +++++-----
 6 files changed, 86 insertions(+), 39 deletions(-)

-- 
2.43.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH 1/4] wt-status: read HEAD and ORIG_HEAD via the refdb
From: Patrick Steinhardt @ 2023-11-29  8:14 UTC (permalink / raw)
  To: git; +Cc: hanwenn
In-Reply-To: <cover.1701243201.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

We read both the HEAD and ORIG_HEAD references directly from the
filesystem in order to figure out whether we're currently splitting a
commit. If both of the following are true:

  - HEAD points to the same object as "rebase-merge/amend".

  - ORIG_HEAD points to the same object as "rebase-merge/orig-head".

Then we are currently splitting commits.

The current code only works by chance because we only have a single
reference backend implementation. Refactor it to instead read both refs
via the refdb layer so that we'll also be compatible with alternate
reference backends.

Note that we pass `RESOLVE_REF_NO_RECURSE` to `read_ref_full()`. This is
because we didn't resolve symrefs before either, and in practice none of
the refs in "rebase-merge/" would be symbolic. We thus don't want to
resolve symrefs with the new code either to retain the old behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 wt-status.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 9f45bf6949..fe9e590b80 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1295,26 +1295,27 @@ static char *read_line_from_git_path(const char *filename)
 static int split_commit_in_progress(struct wt_status *s)
 {
 	int split_in_progress = 0;
-	char *head, *orig_head, *rebase_amend, *rebase_orig_head;
+	struct object_id head_oid, orig_head_oid;
+	char *rebase_amend, *rebase_orig_head;
 
 	if ((!s->amend && !s->nowarn && !s->workdir_dirty) ||
 	    !s->branch || strcmp(s->branch, "HEAD"))
 		return 0;
 
-	head = read_line_from_git_path("HEAD");
-	orig_head = read_line_from_git_path("ORIG_HEAD");
+	if (read_ref_full("HEAD", RESOLVE_REF_NO_RECURSE, &head_oid, NULL) ||
+	    read_ref_full("ORIG_HEAD", RESOLVE_REF_NO_RECURSE, &orig_head_oid, NULL))
+		return 0;
+
 	rebase_amend = read_line_from_git_path("rebase-merge/amend");
 	rebase_orig_head = read_line_from_git_path("rebase-merge/orig-head");
 
-	if (!head || !orig_head || !rebase_amend || !rebase_orig_head)
+	if (!rebase_amend || !rebase_orig_head)
 		; /* fall through, no split in progress */
 	else if (!strcmp(rebase_amend, rebase_orig_head))
-		split_in_progress = !!strcmp(head, rebase_amend);
-	else if (strcmp(orig_head, rebase_orig_head))
+		split_in_progress = !!strcmp(oid_to_hex(&head_oid), rebase_amend);
+	else if (strcmp(oid_to_hex(&orig_head_oid), rebase_orig_head))
 		split_in_progress = 1;
 
-	free(head);
-	free(orig_head);
 	free(rebase_amend);
 	free(rebase_orig_head);
 
-- 
2.43.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 2/4] refs: propagate errno when reading special refs fails
From: Patrick Steinhardt @ 2023-11-29  8:14 UTC (permalink / raw)
  To: git; +Cc: hanwenn
In-Reply-To: <cover.1701243201.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 1755 bytes --]

Some refs in Git are more special than others due to reasons explained
in the next commit. These refs are read via `refs_read_special_head()`,
but this function doesn't behave the same as when we try to read a
normal ref. Most importantly, we do not propagate `failure_errno` in the
case where the reference does not exist, which is behaviour that we rely
on in many parts of Git.

Fix this bug by propagating errno when `strbuf_read_file()` fails.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c              | 6 +++++-
 t/t1403-show-ref.sh | 9 +++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index fcae5dddc6..7d4a057f36 100644
--- a/refs.c
+++ b/refs.c
@@ -1806,8 +1806,12 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	int result = -1;
 	strbuf_addf(&full_path, "%s/%s", ref_store->gitdir, refname);
 
-	if (strbuf_read_file(&content, full_path.buf, 0) < 0)
+	errno = 0;
+
+	if (strbuf_read_file(&content, full_path.buf, 0) < 0) {
+		*failure_errno = errno;
 		goto done;
+	}
 
 	result = parse_loose_ref_contents(content.buf, oid, referent, type,
 					  failure_errno);
diff --git a/t/t1403-show-ref.sh b/t/t1403-show-ref.sh
index b50ae6fcf1..37af154d27 100755
--- a/t/t1403-show-ref.sh
+++ b/t/t1403-show-ref.sh
@@ -266,4 +266,13 @@ test_expect_success '--exists with directory fails with generic error' '
 	test_cmp expect err
 '
 
+test_expect_success '--exists with non-existent special ref' '
+	test_expect_code 2 git show-ref --exists FETCH_HEAD
+'
+
+test_expect_success '--exists with existing special ref' '
+	git rev-parse HEAD >.git/FETCH_HEAD &&
+	git show-ref --exists FETCH_HEAD
+'
+
 test_done
-- 
2.43.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 3/4] refs: complete list of special refs
From: Patrick Steinhardt @ 2023-11-29  8:14 UTC (permalink / raw)
  To: git; +Cc: hanwenn
In-Reply-To: <cover.1701243201.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 4174 bytes --]

We have some references that are more special than others. The reason
for them being special is that they either do not follow the usual
format of references, or that they are written to the filesystem
directly by the respective owning subsystem and thus circumvent the
reference backend.

This works perfectly fine right now because the reffiles backend will
know how to read those refs just fine. But with the prospect of gaining
a new reference backend implementation we need to be a lot more careful
here:

  - We need to make sure that we are consistent about how those refs are
    written. They must either always be written via the filesystem, or
    they must always be written via the reference backend. Any mixture
    will lead to inconsistent state.

  - We need to make sure that such special refs are always handled
    specially when reading them.

We're already mostly good with regard to the first item, except for
`BISECT_EXPECTED_REV` which will be addressed in a subsequent commit.
But the current list of special refs is missing a lot of refs that
really should be treated specially. Right now, we only treat
`FETCH_HEAD` and `MERGE_HEAD` specially here.

Introduce a new function `is_special_ref()` that contains all current
instances of special refs to fix the reading path.

Based-on-patch-by: Han-Wen Nienhuys <hanwenn@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 refs.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 56 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 7d4a057f36..2d39d3fe80 100644
--- a/refs.c
+++ b/refs.c
@@ -1822,15 +1822,69 @@ static int refs_read_special_head(struct ref_store *ref_store,
 	return result;
 }
 
+static int is_special_ref(const char *refname)
+{
+	/*
+	 * Special references get written and read directly via the filesystem
+	 * by the subsystems that create them. Thus, they must not go through
+	 * the reference backend but must instead be read directly. It is
+	 * arguable whether this behaviour is sensible, or whether it's simply
+	 * a leaky abstraction enabled by us only having a single reference
+	 * backend implementation. But at least for a subset of references it
+	 * indeed does make sense to treat them specially:
+	 *
+	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
+	 *   carries additional metadata like where it came from.
+	 *
+	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
+	 *   heads.
+	 *
+	 * - "rebase-apply/" and "rebase-merge/" contain all of the state for
+	 *   rebases, where keeping it closely together feels sensible.
+	 *
+	 * There are some exceptions that you might expect to see on this list
+	 * but which are handled exclusively via the reference backend:
+	 *
+	 * - CHERRY_PICK_HEAD
+	 * - HEAD
+	 * - ORIG_HEAD
+	 *
+	 * Writing or deleting references must consistently go either through
+	 * the filesystem (special refs) or through the reference backend
+	 * (normal ones).
+	 */
+	const char * const special_refs[] = {
+		"AUTO_MERGE",
+		"BISECT_EXPECTED_REV",
+		"FETCH_HEAD",
+		"MERGE_AUTOSTASH",
+		"MERGE_HEAD",
+	};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
+		if (!strcmp(refname, special_refs[i]))
+			return 1;
+
+	/*
+	 * git-rebase(1) stores its state in `rebase-apply/` or
+	 * `rebase-merge/`, including various reference-like bits.
+	 */
+	if (starts_with(refname, "rebase-apply/") ||
+	    starts_with(refname, "rebase-merge/"))
+		return 1;
+
+	return 0;
+}
+
 int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
 		      struct object_id *oid, struct strbuf *referent,
 		      unsigned int *type, int *failure_errno)
 {
 	assert(failure_errno);
-	if (!strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD")) {
+	if (is_special_ref(refname))
 		return refs_read_special_head(ref_store, refname, oid, referent,
 					      type, failure_errno);
-	}
 
 	return ref_store->be->read_raw_ref(ref_store, refname, oid, referent,
 					   type, failure_errno);
-- 
2.43.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* [PATCH 4/4] bisect: consistently write BISECT_EXPECTED_REV via the refdb
From: Patrick Steinhardt @ 2023-11-29  8:14 UTC (permalink / raw)
  To: git; +Cc: hanwenn
In-Reply-To: <cover.1701243201.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 5645 bytes --]

We're inconsistently writing BISECT_EXPECTED_REV both via the filesystem
and via the refdb, which violates the newly established rules for how
special refs must be treated. This works alright in practice with the
reffiles reference backend, but will cause bugs once we gain additional
backends.

Fix this issue and consistently write BISECT_EXPECTED_REV via the refdb
so that it is no longer a special ref.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 bisect.c                    | 25 ++++---------------------
 builtin/bisect.c            |  8 ++------
 refs.c                      |  2 +-
 t/t6030-bisect-porcelain.sh |  2 +-
 4 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/bisect.c b/bisect.c
index 1be8e0a271..985b96ed13 100644
--- a/bisect.c
+++ b/bisect.c
@@ -471,7 +471,6 @@ static int read_bisect_refs(void)
 }
 
 static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES")
-static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")
 static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "BISECT_ANCESTORS_OK")
 static GIT_PATH_FUNC(git_path_bisect_run, "BISECT_RUN")
 static GIT_PATH_FUNC(git_path_bisect_start, "BISECT_START")
@@ -707,26 +706,10 @@ static enum bisect_error error_if_skipped_commits(struct commit_list *tried,
 
 static int is_expected_rev(const struct object_id *oid)
 {
-	const char *filename = git_path_bisect_expected_rev();
-	struct stat st;
-	struct strbuf str = STRBUF_INIT;
-	FILE *fp;
-	int res = 0;
-
-	if (stat(filename, &st) || !S_ISREG(st.st_mode))
+	struct object_id expected_oid;
+	if (read_ref("BISECT_EXPECTED_REV", &expected_oid))
 		return 0;
-
-	fp = fopen_or_warn(filename, "r");
-	if (!fp)
-		return 0;
-
-	if (strbuf_getline_lf(&str, fp) != EOF)
-		res = !strcmp(str.buf, oid_to_hex(oid));
-
-	strbuf_release(&str);
-	fclose(fp);
-
-	return res;
+	return oideq(oid, &expected_oid);
 }
 
 enum bisect_error bisect_checkout(const struct object_id *bisect_rev,
@@ -1185,10 +1168,10 @@ int bisect_clean_state(void)
 	struct string_list refs_for_removal = STRING_LIST_INIT_NODUP;
 	for_each_ref_in("refs/bisect", mark_for_removal, (void *) &refs_for_removal);
 	string_list_append(&refs_for_removal, xstrdup("BISECT_HEAD"));
+	string_list_append(&refs_for_removal, xstrdup("BISECT_EXPECTED_REV"));
 	result = delete_refs("bisect: remove", &refs_for_removal, REF_NO_DEREF);
 	refs_for_removal.strdup_strings = 1;
 	string_list_clear(&refs_for_removal, 0);
-	unlink_or_warn(git_path_bisect_expected_rev());
 	unlink_or_warn(git_path_bisect_ancestors_ok());
 	unlink_or_warn(git_path_bisect_log());
 	unlink_or_warn(git_path_bisect_names());
diff --git a/builtin/bisect.c b/builtin/bisect.c
index 35938b05fd..4e2c43daf5 100644
--- a/builtin/bisect.c
+++ b/builtin/bisect.c
@@ -17,7 +17,6 @@
 #include "revision.h"
 
 static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
-static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")
 static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "BISECT_ANCESTORS_OK")
 static GIT_PATH_FUNC(git_path_bisect_start, "BISECT_START")
 static GIT_PATH_FUNC(git_path_bisect_log, "BISECT_LOG")
@@ -921,7 +920,6 @@ static enum bisect_error bisect_state(struct bisect_terms *terms, int argc,
 	const char *state;
 	int i, verify_expected = 1;
 	struct object_id oid, expected;
-	struct strbuf buf = STRBUF_INIT;
 	struct oid_array revs = OID_ARRAY_INIT;
 
 	if (!argc)
@@ -976,10 +974,8 @@ static enum bisect_error bisect_state(struct bisect_terms *terms, int argc,
 		oid_array_append(&revs, &commit->object.oid);
 	}
 
-	if (strbuf_read_file(&buf, git_path_bisect_expected_rev(), 0) < the_hash_algo->hexsz ||
-	    get_oid_hex(buf.buf, &expected) < 0)
+	if (read_ref("BISECT_EXPECTED_REV", &expected))
 		verify_expected = 0; /* Ignore invalid file contents */
-	strbuf_release(&buf);
 
 	for (i = 0; i < revs.nr; i++) {
 		if (bisect_write(state, oid_to_hex(&revs.oid[i]), terms, 0)) {
@@ -988,7 +984,7 @@ static enum bisect_error bisect_state(struct bisect_terms *terms, int argc,
 		}
 		if (verify_expected && !oideq(&revs.oid[i], &expected)) {
 			unlink_or_warn(git_path_bisect_ancestors_ok());
-			unlink_or_warn(git_path_bisect_expected_rev());
+			delete_ref(NULL, "BISECT_EXPECTED_REV", NULL, REF_NO_DEREF);
 			verify_expected = 0;
 		}
 	}
diff --git a/refs.c b/refs.c
index 2d39d3fe80..0290bb0c67 100644
--- a/refs.c
+++ b/refs.c
@@ -1845,6 +1845,7 @@ static int is_special_ref(const char *refname)
 	 * There are some exceptions that you might expect to see on this list
 	 * but which are handled exclusively via the reference backend:
 	 *
+	 * - BISECT_EXPECTED_REV
 	 * - CHERRY_PICK_HEAD
 	 * - HEAD
 	 * - ORIG_HEAD
@@ -1855,7 +1856,6 @@ static int is_special_ref(const char *refname)
 	 */
 	const char * const special_refs[] = {
 		"AUTO_MERGE",
-		"BISECT_EXPECTED_REV",
 		"FETCH_HEAD",
 		"MERGE_AUTOSTASH",
 		"MERGE_HEAD",
diff --git a/t/t6030-bisect-porcelain.sh b/t/t6030-bisect-porcelain.sh
index 2a5b7d8379..792c1504bc 100755
--- a/t/t6030-bisect-porcelain.sh
+++ b/t/t6030-bisect-porcelain.sh
@@ -1176,7 +1176,7 @@ test_expect_success 'git bisect reset cleans bisection state properly' '
 	git bisect bad $HASH4 &&
 	git bisect reset &&
 	test -z "$(git for-each-ref "refs/bisect/*")" &&
-	test_path_is_missing ".git/BISECT_EXPECTED_REV" &&
+	test_ref_missing BISECT_EXPECTED_REV &&
 	test_path_is_missing ".git/BISECT_ANCESTORS_OK" &&
 	test_path_is_missing ".git/BISECT_LOG" &&
 	test_path_is_missing ".git/BISECT_RUN" &&
-- 
2.43.0


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related

* Re: [PATCH] setup: recognize bare repositories with packed-refs
From: Patrick Steinhardt @ 2023-11-29 10:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Adam Majer, git
In-Reply-To: <20231128190446.GA10477@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 5736 bytes --]

On Tue, Nov 28, 2023 at 02:04:46PM -0500, Jeff King wrote:
> On Tue, Nov 28, 2023 at 03:28:45PM +0100, Adam Majer wrote:
> 
> > In a garbage collected bare git repository, the refs/ subdirectory is
> > empty.  In use-cases when such a repository is directly added into
> > another repository, it no longer is detected as valid. Git doesn't
> > preserve empty paths so refs/ subdirectory is not present. Simply
> > creating an empty refs/ subdirectory fixes this problem.
> 
> I understand your use case, but I still have a vague feeling that this
> is bending some assumptions in a way that may create problems or
> confusion later. In particular:
> 
> > Looking more carefully, there are two backends to handle various refs in
> > git -- the files backend that uses refs/ subdirectory and the
> > packed-refs backend that uses packed-refs file. If references are not
> > found in refs/ subdirectory (or directory doesn't exist), the
> > packed-refs directory will be consulted. Garbage collected repository
> > will have all its references in packed-refs file.
> 
> This second paragraph doesn't seem totally accurate to me. There are not
> really two backends that Git can use. For production use, there is just
> one, the "files" backend, which happens to also use packed-refs under
> the hood (and a convenient way for the code to structure this was a
> subordinate backend). But it has never been possible to have a repo that
> just uses packed-refs.
> 
> There is also the experimental reftable, of course. And there we have
> not yet loosened is_git_directory(), and it has to create an unused
> "refs/" directory (there has been some discussion about allowing it to
> be an empty file, though no patches have been merged).

As I'm currently working on the reftable backend this thought has also
crossed my mind. The reftable backend doesn't only create "refs/", but
it also creates "HEAD" with contents "ref: refs/heads/.invalid" so that
Git commands recognize the Git directory properly. Longer-term I would
really love to see us doing a better job of detecting Git repositories
so that we don't have to carry this legacy baggage around.

I can see different ways for how to do this:

    - Either we iterate through all known reference backends, asking
      each of them whether they recognize the directory as something
      they understand.

    - Or we start parsing the gitconfig of the repository so that we can
      learn about which reference backend to expect, and then ask that
      specific backend whether it thinks that the directory indeed looks
      like something it can handle.

I'd personally prefer the latter, but I'm not sure whether we really
want to try and parse any file that happens to be called "config".

> So with regards to the loosening in your patch, my questions would be:
> 
>   - if we are going to change the rules for repository detection, is
>     this where we want to end up? We haven't changed them (yet) for
>     reftables. If we are going to do so, should we have a scheme that
>     will work for that transition, too? The "refs is an empty file"
>     scheme would fix your use case, too (though see below).
> 
>   - is the rest of Git ready to handle a missing "refs/" directory? It
>     looks like making a ref will auto-create it (since we may have to
>     make refs/foo/bar/... anyway).
> 
>   - what about other implementations? Your embedded repos will
>     presumably not work with libgit2, jgit, etc, until they also get
>     similar patches.
> 
>   - what about empty repositories? In that case there will be no "refs/"
>     file and no "packed-refs" file (such a repository is less likely, of
>     course, but it may contain objects but no refs, or the point may be
>     to have an empty repo as a test vector). Likewise, it is possible
>     for a repository to have an empty "objects" directory (even with a
>     non-empty refs directory, if there are only symrefs), and your patch
>     doesn't address that.

Just throwing this out there, but we could use this as an excuse to
introduce "extensions.refFormat". If it's explicitly configured to be
"reffiles" then we accept repositories even if they don't have the
"refs/" directory or a "packed-refs" file. This would still require work
in alternative implementations of Git, but this work will need to happen
anyway when the reftable backend lands.

I'd personally love for this extension to be introduced before I'm
sending the reftable backend upstream so that we can have discussions
around it beforehand.

Patrick

> > To allow the use-case when packed-refs is the only source of refs and
> > refs/ subdirectory is simply not present, augment 'is_git_directory()'
> > setup function to look for packed-refs file as an alternative to refs/
> > subdirectory.
> 
> Getting back to your use case, I'd suggest one of:
> 
>   - do the usual "touch refs/.gitignore" trick to explicitly track the
>     empty directory. It looks like the ref code will ignore this (we
>     don't allow ref names to start with "." in a path component)
> 
>   - whatever is consuming the embedded repos could "mkdir -p refs
>     objects" as needed. This is a minor pain, but I think in the long
>     term we are moving to a world where you have to explicitly do
>     "GIT_DIR=$PWD/embedded.git" to access an embedded bare repo. So
>     they're already special and require some setup; adding an extra step
>     may not be so bad.
> 
> Now it may be that neither of those solutions is acceptable for various
> reasons. But it is probably worth detailing those reasons in your commit
> message.
> 
> -Peff
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Draft of Git Rev News edition 105
From: Christian Couder @ 2023-11-29 18:34 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jakub Narebski, Markus Jansen, Kaartic Sivaraam,
	Taylor Blau, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason, Bruno Brito,
	Alexander Shopov, René Scharfe, Philip Oakley, Jeff King,
	Jason Hatton, brian m. carlson, Eric Sunshine,
	Carlo Marcelo Arenas Belón, Torsten Bögershausen

Hi everyone!

A draft of a new Git Rev News edition is available here:

  https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-105.md

Everyone is welcome to contribute in any section either by editing the
above page on GitHub and sending a pull request, or by commenting on
this GitHub issue:

  https://github.com/git/git.github.io/issues/670

You can also reply to this email.

In general all kinds of contributions, for example proofreading,
suggestions for articles or links, help on the issues in GitHub,
volunteering for being interviewed and so on, are very much
appreciated.

I tried to Cc everyone who appears in this edition, but maybe I missed
some people, sorry about that.

Jakub, Markus, Kaartic and I plan to publish this edition on
Friday December 1st.

Thanks,
Christian.

^ permalink raw reply

* Re: [PATCH] setup: recognize bare repositories with packed-refs
From: Taylor Blau @ 2023-11-29 21:30 UTC (permalink / raw)
  To: Jeff King; +Cc: Adam Majer, git
In-Reply-To: <20231128190446.GA10477@coredump.intra.peff.net>

On Tue, Nov 28, 2023 at 02:04:46PM -0500, Jeff King wrote:
>   - whatever is consuming the embedded repos could "mkdir -p refs
>     objects" as needed. This is a minor pain, but I think in the long
>     term we are moving to a world where you have to explicitly do
>     "GIT_DIR=$PWD/embedded.git" to access an embedded bare repo. So
>     they're already special and require some setup; adding an extra step
>     may not be so bad.

I hope not. I suppose that using embedded bare repositories in a test
requires additional setup at least to "cd" into the directory (if they
are not using `$GIT_DIR` or `--git-dir` already). But I fear that
imposing even a small change like this is too tall an order for how many
millions of these exist in the wild across all sorts of projects.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 1/4] wt-status: read HEAD and ORIG_HEAD via the refdb
From: Taylor Blau @ 2023-11-29 21:45 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, hanwenn
In-Reply-To: <35b74eb972eed7e08190e826fabcf6b7a241f285.1701243201.git.ps@pks.im>

On Wed, Nov 29, 2023 at 09:14:12AM +0100, Patrick Steinhardt wrote:
> We read both the HEAD and ORIG_HEAD references directly from the
> filesystem in order to figure out whether we're currently splitting a
> commit. If both of the following are true:
>
>   - HEAD points to the same object as "rebase-merge/amend".
>
>   - ORIG_HEAD points to the same object as "rebase-merge/orig-head".
>
> Then we are currently splitting commits.
>
> The current code only works by chance because we only have a single
> reference backend implementation. Refactor it to instead read both refs
> via the refdb layer so that we'll also be compatible with alternate
> reference backends.
>
> Note that we pass `RESOLVE_REF_NO_RECURSE` to `read_ref_full()`. This is
> because we didn't resolve symrefs before either, and in practice none of
> the refs in "rebase-merge/" would be symbolic. We thus don't want to
> resolve symrefs with the new code either to retain the old behaviour.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  wt-status.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/wt-status.c b/wt-status.c
> index 9f45bf6949..fe9e590b80 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -1295,26 +1295,27 @@ static char *read_line_from_git_path(const char *filename)
>  static int split_commit_in_progress(struct wt_status *s)
>  {
>  	int split_in_progress = 0;
> -	char *head, *orig_head, *rebase_amend, *rebase_orig_head;
> +	struct object_id head_oid, orig_head_oid;
> +	char *rebase_amend, *rebase_orig_head;
>
>  	if ((!s->amend && !s->nowarn && !s->workdir_dirty) ||
>  	    !s->branch || strcmp(s->branch, "HEAD"))
>  		return 0;
>
> -	head = read_line_from_git_path("HEAD");
> -	orig_head = read_line_from_git_path("ORIG_HEAD");
> +	if (read_ref_full("HEAD", RESOLVE_REF_NO_RECURSE, &head_oid, NULL) ||
> +	    read_ref_full("ORIG_HEAD", RESOLVE_REF_NO_RECURSE, &orig_head_oid, NULL))

Switching to read_ref_full() here is going to have some slightly
different behavior than just reading out the contents of
"$GIT_DIR/HEAD", but I think that it should be OK.

Before we would not have complained, if, for example, the contents of
"$GIT_DIR/HEAD" were malformed, but now we will. I think that's OK,
especially given that if that file is bogus, we'll have other problems
before we get here ;-).

Are there any other gotchas that we should be thinking about?

> +		return 0;
> +
>  	rebase_amend = read_line_from_git_path("rebase-merge/amend");
>  	rebase_orig_head = read_line_from_git_path("rebase-merge/orig-head");
>
> -	if (!head || !orig_head || !rebase_amend || !rebase_orig_head)
> +	if (!rebase_amend || !rebase_orig_head)
>  		; /* fall through, no split in progress */
>  	else if (!strcmp(rebase_amend, rebase_orig_head))
> -		split_in_progress = !!strcmp(head, rebase_amend);
> -	else if (strcmp(orig_head, rebase_orig_head))
> +		split_in_progress = !!strcmp(oid_to_hex(&head_oid), rebase_amend);
> +	else if (strcmp(oid_to_hex(&orig_head_oid), rebase_orig_head))

I did a double take at these strcmp(oid_to_hex(...)) calls, but I think
that they are the best that we can do given that we're still reading the
contents of "rebase-merge/amend" and "rebase-merge/orig-head" directly.

I suppose we could go the other way and turn their contents into
object_ids and then use oidcmp(), but it doesn't seem worth it IMHO.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 2/4] refs: propagate errno when reading special refs fails
From: Taylor Blau @ 2023-11-29 21:51 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, hanwenn
In-Reply-To: <691552a17ec587b0c03e758437c33d58767803aa.1701243201.git.ps@pks.im>

On Wed, Nov 29, 2023 at 09:14:16AM +0100, Patrick Steinhardt wrote:
> diff --git a/refs.c b/refs.c
> index fcae5dddc6..7d4a057f36 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1806,8 +1806,12 @@ static int refs_read_special_head(struct ref_store *ref_store,
>  	int result = -1;
>  	strbuf_addf(&full_path, "%s/%s", ref_store->gitdir, refname);
>
> -	if (strbuf_read_file(&content, full_path.buf, 0) < 0)
> +	errno = 0;

Do we need to set errno to 0 here? Looking at the implementation of
strbuf_read_file(), it looks like we return early in two cases. Either
open() fails, in which errno is set for us, or strbuf_read() fails, in
which case we set errno to whatever it was right after the failed read
(preventing the subsequent close() call from tainting the value of errno).

So I think in either case, we have the right value in errno, and don't
need to worry about setting it to "0" ahead of time.

> +test_expect_success '--exists with existing special ref' '
> +	git rev-parse HEAD >.git/FETCH_HEAD &&
> +	git show-ref --exists FETCH_HEAD
> +'

I don't think that it matters here, but do we need to worry about
cleaning up .git/FETCH_HEAD for future tests?

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 3/4] refs: complete list of special refs
From: Taylor Blau @ 2023-11-29 21:59 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, hanwenn
In-Reply-To: <0e38103114a206bedbbbd7ea97cb77fa05fd3c29.1701243201.git.ps@pks.im>

On Wed, Nov 29, 2023 at 09:14:20AM +0100, Patrick Steinhardt wrote:
> We have some references that are more special than others. The reason
> for them being special is that they either do not follow the usual
> format of references, or that they are written to the filesystem
> directly by the respective owning subsystem and thus circumvent the
> reference backend.
>
> This works perfectly fine right now because the reffiles backend will
> know how to read those refs just fine. But with the prospect of gaining
> a new reference backend implementation we need to be a lot more careful
> here:
>
>   - We need to make sure that we are consistent about how those refs are
>     written. They must either always be written via the filesystem, or
>     they must always be written via the reference backend. Any mixture
>     will lead to inconsistent state.
>
>   - We need to make sure that such special refs are always handled
>     specially when reading them.
>
> We're already mostly good with regard to the first item, except for
> `BISECT_EXPECTED_REV` which will be addressed in a subsequent commit.
> But the current list of special refs is missing a lot of refs that
> really should be treated specially. Right now, we only treat
> `FETCH_HEAD` and `MERGE_HEAD` specially here.
>
> Introduce a new function `is_special_ref()` that contains all current
> instances of special refs to fix the reading path.
>
> Based-on-patch-by: Han-Wen Nienhuys <hanwenn@gmail.com>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  refs.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 7d4a057f36..2d39d3fe80 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1822,15 +1822,69 @@ static int refs_read_special_head(struct ref_store *ref_store,
>  	return result;
>  }
>
> +static int is_special_ref(const char *refname)
> +{
> +	/*
> +	 * Special references get written and read directly via the filesystem
> +	 * by the subsystems that create them. Thus, they must not go through
> +	 * the reference backend but must instead be read directly. It is
> +	 * arguable whether this behaviour is sensible, or whether it's simply
> +	 * a leaky abstraction enabled by us only having a single reference
> +	 * backend implementation. But at least for a subset of references it
> +	 * indeed does make sense to treat them specially:
> +	 *
> +	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
> +	 *   carries additional metadata like where it came from.
> +	 *
> +	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
> +	 *   heads.
> +	 *
> +	 * - "rebase-apply/" and "rebase-merge/" contain all of the state for
> +	 *   rebases, where keeping it closely together feels sensible.
> +	 *
> +	 * There are some exceptions that you might expect to see on this list
> +	 * but which are handled exclusively via the reference backend:
> +	 *
> +	 * - CHERRY_PICK_HEAD
> +	 * - HEAD
> +	 * - ORIG_HEAD
> +	 *
> +	 * Writing or deleting references must consistently go either through
> +	 * the filesystem (special refs) or through the reference backend
> +	 * (normal ones).
> +	 */
> +	const char * const special_refs[] = {
> +		"AUTO_MERGE",
> +		"BISECT_EXPECTED_REV",
> +		"FETCH_HEAD",
> +		"MERGE_AUTOSTASH",
> +		"MERGE_HEAD",
> +	};

Is there a reason that we don't want to declare this statically? If we
did, I think we could drop one const, since the strings would instead
reside in the .rodata section.

> +	int i;

Not that it matters for this case, but it may be worth declaring i to be
an unsigned type, since it's used as an index into an array. size_t
seems like an appropriate choice there.

> +	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
> +		if (!strcmp(refname, special_refs[i]))
> +			return 1;
> +
> +	/*
> +	 * git-rebase(1) stores its state in `rebase-apply/` or
> +	 * `rebase-merge/`, including various reference-like bits.
> +	 */
> +	if (starts_with(refname, "rebase-apply/") ||
> +	    starts_with(refname, "rebase-merge/"))

Do we care about case sensitivity here? Definitely not on case-sensitive
filesystems, but I'm not sure about case-insensitive ones. For instance,
on macOS, I can do:

    $ git rev-parse hEAd

and get the same value as "git rev-parse HEAD" (on my Linux workstation,
this fails as expected).

I doubt that there are many users in the wild asking to resolve
reBASe-APPLY/xyz, but I think that after this patch that would no longer
work as-is, so we may want to replace this with istarts_with() instead.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 4/4] bisect: consistently write BISECT_EXPECTED_REV via the refdb
From: Taylor Blau @ 2023-11-29 22:13 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, hanwenn
In-Reply-To: <c7ab26fb7e9e24b12b83dc26fbc17ff4d96e206c.1701243201.git.ps@pks.im>

On Wed, Nov 29, 2023 at 09:14:24AM +0100, Patrick Steinhardt wrote:
> ---
>  bisect.c                    | 25 ++++---------------------
>  builtin/bisect.c            |  8 ++------
>  refs.c                      |  2 +-
>  t/t6030-bisect-porcelain.sh |  2 +-
>  4 files changed, 8 insertions(+), 29 deletions(-)

Very nice :-).

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 0/4] refs: improve handling of special refs
From: Taylor Blau @ 2023-11-29 22:14 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, hanwenn
In-Reply-To: <cover.1701243201.git.ps@pks.im>

On Wed, Nov 29, 2023 at 09:14:07AM +0100, Patrick Steinhardt wrote:
> Patrick Steinhardt (4):
>   wt-status: read HEAD and ORIG_HEAD via the refdb
>   refs: propagate errno when reading special refs fails
>   refs: complete list of special refs
>   bisect: consistently write BISECT_EXPECTED_REV via the refdb
>
>  bisect.c                    | 25 +++------------
>  builtin/bisect.c            |  8 ++---
>  refs.c                      | 64 +++++++++++++++++++++++++++++++++++--
>  t/t1403-show-ref.sh         |  9 ++++++
>  t/t6030-bisect-porcelain.sh |  2 +-
>  wt-status.c                 | 17 +++++-----
>  6 files changed, 86 insertions(+), 39 deletions(-)

These all look pretty good to me. I had a few minor questions on the
first three patches, but I don't think they necessarily require a
reroll.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 01/10] t0410: mark tests to require the reffiles backend
From: Taylor Blau @ 2023-11-29 22:19 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <53c6348035360912a9d720448dceb17895703da2.1701242407.git.ps@pks.im>

On Wed, Nov 29, 2023 at 08:24:40AM +0100, Patrick Steinhardt wrote:
> Two of our tests in t0410 verify whether partial clones end up with the
> correct repository format version and extensions. These checks require
> the reffiles backend because every other backend would by necessity bump
> the repository format version to be at least 1.
>
> Mark the tests accordingly.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  t/t0410-partial-clone.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 5b7bee888d..6b6424b3df 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -49,7 +49,7 @@ test_expect_success 'convert shallow clone to partial clone' '
>  	test_cmp_config -C client 1 core.repositoryformatversion
>  '
>
> -test_expect_success SHA1 'convert to partial clone with noop extension' '
> +test_expect_success SHA1,REFFILES 'convert to partial clone with noop extension' '

I thought for a second that the SHA1 prerequisite would cover this
already, but that's not right since you can be in SHA1 mode even if your
repositoryformatversion is 1. So this change makes sense to me and is
well-reasoned.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 08/10] t5401: speed up creation of many branches
From: Taylor Blau @ 2023-11-29 22:25 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <f674119c7801e355cd08a651450abd67947d7456.1701242407.git.ps@pks.im>

On Wed, Nov 29, 2023 at 08:25:09AM +0100, Patrick Steinhardt wrote:
> One of the tests in t5401 creates a bunch of branches by calling
> git-branch(1) for every one of them. This is quite inefficient and takes
> a comparatively long time even on Unix systems where spawning processes
> is comparatively fast. Refactor it to instead use git-update-ref(1),
> which leads to an almost 10-fold speedup:
>
> ```
> Benchmark 1: ./t5401-update-hooks.sh (rev = HEAD)
>   Time (mean ± σ):     983.2 ms ±  97.6 ms    [User: 328.8 ms, System: 679.2 ms]
>   Range (min … max):   882.9 ms … 1078.0 ms    3 runs
>
> Benchmark 2: ./t5401-update-hooks.sh (rev = HEAD~)
>   Time (mean ± σ):      9.312 s ±  0.398 s    [User: 2.766 s, System: 6.617 s]
>   Range (min … max):    8.885 s …  9.674 s    3 runs
>
> Summary
>   ./t5401-update-hooks.sh (rev = HEAD) ran
>     9.47 ± 1.02 times faster than ./t5401-update-hooks.sh (rev = HEAD~)

Very nice ;-).

> diff --git a/t/t5401-update-hooks.sh b/t/t5401-update-hooks.sh
> index 001b7a17ad..8b8bc47dc0 100755
> --- a/t/t5401-update-hooks.sh
> +++ b/t/t5401-update-hooks.sh
> @@ -133,10 +133,8 @@ test_expect_success 'pre-receive hook that forgets to read its input' '
>  	EOF
>  	rm -f victim.git/hooks/update victim.git/hooks/post-update &&
>
> -	for v in $(test_seq 100 999)
> -	do
> -		git branch branch_$v main || return
> -	done &&
> +	printf "create refs/heads/branch_%d main\n" $(test_seq 100 999) >input &&
> +	git update-ref --stdin <input &&

Not that it really matters here, but you could pipe the output of your
printf directly into git update-ref. I don't think we rely on the value
of "input" after this point, and git is on the right-hand side of the
pipe, so this is safe to do.

But it doesn't matter much either way, just something I noticed while
reading.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 09/10] t5551: stop writing packed-refs directly
From: Taylor Blau @ 2023-11-29 22:26 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <2ab24ea5633be6b4855567d126a184d54c624f62.1701242407.git.ps@pks.im>

On Wed, Nov 29, 2023 at 08:25:14AM +0100, Patrick Steinhardt wrote:
> We have multiple tests in t5551 that write thousands of tags. To do so
> efficiently we generate the tags by writing the `packed-refs` file
> directly, which of course assumes that the reference database is backed
> by the files backend.
>
> Refactor the code to instead use a single `git update-ref --stdin`
> command to write the tags. While the on-disk end result is not the same
> as we now have a bunch of loose refs instead of a single packed-refs
> file, the distinction shouldn't really matter for any of the tests that
> use this helper.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  t/t5551-http-fetch-smart.sh | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index 8a41adf1e1..e069737b80 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -359,7 +359,9 @@ create_tags () {
>
>  	# now assign tags to all the dangling commits we created above
>  	tag=$(perl -e "print \"bla\" x 30") &&
> -	sed -e "s|^:\([^ ]*\) \(.*\)$|\2 refs/tags/$tag-\1|" <marks >>packed-refs
> +	sed -e "s|^:\([^ ]*\) \(.*\)$|create refs/tags/$tag-\1 \2|" <marks >input &&
> +	git update-ref --stdin <input &&
> +	rm input

Same note here as in the previous patch, although here I think we'd
prefer to pipe from sed to git update-ref directly, rather than writing
an intermediate file which we have to clean up afterwords.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH 00/10] t: more compatibility fixes with reftables
From: Taylor Blau @ 2023-11-29 23:30 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <cover.1701242407.git.ps@pks.im>

On Wed, Nov 29, 2023 at 08:24:36AM +0100, Patrick Steinhardt wrote:
> Hi,
>
> this is the second patch series that refactors tests to become
> compatible with the upcoming reftables backend. It's in the same spirit
> as the first round of patches [1], where most of the refactorings are to
> use plumbing tools to access refs or the reflog instead of modifying
> on-disk data structures directly.

All looks good to me. Thanks for a pleasant read :-).

Thanks,
Taylor

^ permalink raw reply

* [PATCH] Mention default oldbranch in git-branch doc
From: Clarence "Sparr" Risher via GitGitGadget @ 2023-11-30  0:24 UTC (permalink / raw)
  To: git; +Cc: Clarence "Sparr" Risher, Clarence "Sparr" Risher

From: "Clarence \"Sparr\" Risher" <sparr0@gmail.com>

`git branch -m` has an optional first argument, the branch to rename.
If omitted, it defaults to the current branch. This behavior is not
currently described in the documentation. While it can be determined
relatively easily through experimentation, and less so through viewing
the source code (builtin/branch.c:897), it is not obvious what such a
command will do when encountered in a less interactive context.

This patch adds one sentence to the git-branch documentation, with
wording based on another optional argument described earlier in the
same doc.

The same behavior applies to -M, -c, and -C, which are all covered by
this same section of the documentation.

Signed-off-by: Clarence Risher <clarence@kira-learning.com>
Signed-off-by: Clarence "Sparr" Risher <sparr0@gmail.com>
---
    Mention default oldbranch in git-branch doc

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1613%2Fsparr%2Fdoc_branch_rename_one_param-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1613/sparr/doc_branch_rename_one_param-v1
Pull-Request: https://github.com/git/git/pull/1613

 Documentation/git-branch.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 4395aa93543..affed1b17a4 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -73,6 +73,7 @@ overridden by using the `--track` and `--no-track` options, and
 changed later using `git branch --set-upstream-to`.
 
 With a `-m` or `-M` option, <oldbranch> will be renamed to <newbranch>.
+If the <oldbranch> argument is missing it defaults to the current branch.
 If <oldbranch> had a corresponding reflog, it is renamed to match
 <newbranch>, and a reflog entry is created to remember the branch
 renaming. If <newbranch> exists, -M must be used to force the rename

base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH 00/10] t: more compatibility fixes with reftables
From: Patrick Steinhardt @ 2023-11-30  7:06 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git
In-Reply-To: <ZWfJp80vzVhbdH89@nand.local>

[-- Attachment #1: Type: text/plain, Size: 809 bytes --]

On Wed, Nov 29, 2023 at 06:30:47PM -0500, Taylor Blau wrote:
> On Wed, Nov 29, 2023 at 08:24:36AM +0100, Patrick Steinhardt wrote:
> > Hi,
> >
> > this is the second patch series that refactors tests to become
> > compatible with the upcoming reftables backend. It's in the same spirit
> > as the first round of patches [1], where most of the refactorings are to
> > use plumbing tools to access refs or the reflog instead of modifying
> > on-disk data structures directly.
> 
> All looks good to me. Thanks for a pleasant read :-).
> 
> Thanks,
> Taylor

If you don't mind, I'll refrain from sending a v2 only to start piping
output into git-update-ref(1) directly. I'll incorporate that feedback
though if there is a need for v2 due to other reasons.

Thanks for your review!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 1/4] wt-status: read HEAD and ORIG_HEAD via the refdb
From: Patrick Steinhardt @ 2023-11-30  7:42 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, hanwenn
In-Reply-To: <ZWew3CP4QJ4XDnHj@nand.local>

[-- Attachment #1: Type: text/plain, Size: 4365 bytes --]

On Wed, Nov 29, 2023 at 04:45:00PM -0500, Taylor Blau wrote:
> On Wed, Nov 29, 2023 at 09:14:12AM +0100, Patrick Steinhardt wrote:
> > We read both the HEAD and ORIG_HEAD references directly from the
> > filesystem in order to figure out whether we're currently splitting a
> > commit. If both of the following are true:
> >
> >   - HEAD points to the same object as "rebase-merge/amend".
> >
> >   - ORIG_HEAD points to the same object as "rebase-merge/orig-head".
> >
> > Then we are currently splitting commits.
> >
> > The current code only works by chance because we only have a single
> > reference backend implementation. Refactor it to instead read both refs
> > via the refdb layer so that we'll also be compatible with alternate
> > reference backends.
> >
> > Note that we pass `RESOLVE_REF_NO_RECURSE` to `read_ref_full()`. This is
> > because we didn't resolve symrefs before either, and in practice none of
> > the refs in "rebase-merge/" would be symbolic. We thus don't want to
> > resolve symrefs with the new code either to retain the old behaviour.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  wt-status.c | 17 +++++++++--------
> >  1 file changed, 9 insertions(+), 8 deletions(-)
> >
> > diff --git a/wt-status.c b/wt-status.c
> > index 9f45bf6949..fe9e590b80 100644
> > --- a/wt-status.c
> > +++ b/wt-status.c
> > @@ -1295,26 +1295,27 @@ static char *read_line_from_git_path(const char *filename)
> >  static int split_commit_in_progress(struct wt_status *s)
> >  {
> >  	int split_in_progress = 0;
> > -	char *head, *orig_head, *rebase_amend, *rebase_orig_head;
> > +	struct object_id head_oid, orig_head_oid;
> > +	char *rebase_amend, *rebase_orig_head;
> >
> >  	if ((!s->amend && !s->nowarn && !s->workdir_dirty) ||
> >  	    !s->branch || strcmp(s->branch, "HEAD"))
> >  		return 0;
> >
> > -	head = read_line_from_git_path("HEAD");
> > -	orig_head = read_line_from_git_path("ORIG_HEAD");
> > +	if (read_ref_full("HEAD", RESOLVE_REF_NO_RECURSE, &head_oid, NULL) ||
> > +	    read_ref_full("ORIG_HEAD", RESOLVE_REF_NO_RECURSE, &orig_head_oid, NULL))
> 
> Switching to read_ref_full() here is going to have some slightly
> different behavior than just reading out the contents of
> "$GIT_DIR/HEAD", but I think that it should be OK.
> 
> Before we would not have complained, if, for example, the contents of
> "$GIT_DIR/HEAD" were malformed, but now we will. I think that's OK,
> especially given that if that file is bogus, we'll have other problems
> before we get here ;-).
> 
> Are there any other gotchas that we should be thinking about?

Not that I can think of. As you say, a repository with malformed HEAD
will run into other problems anyway. And `read_ref_full()` would return
errors if these refs were malformed, which would cause us to exit early
from anyway. So unless "rebase-merge/amend" and "rebase-merge/orig-head"
contained the same kind of garbage we'd retain the same behaviour as
before, and that shouldn't really be happening.

One interesting bit is that we don't set `RESOLVE_REF_READING`, so
`read_ref_full()` may return successfully even if the ref doesn't exist.
But in practice this is fine given that the resulting oid would be
cleared in that case.

Patrick

> > +		return 0;
> > +
> >  	rebase_amend = read_line_from_git_path("rebase-merge/amend");
> >  	rebase_orig_head = read_line_from_git_path("rebase-merge/orig-head");
> >
> > -	if (!head || !orig_head || !rebase_amend || !rebase_orig_head)
> > +	if (!rebase_amend || !rebase_orig_head)
> >  		; /* fall through, no split in progress */
> >  	else if (!strcmp(rebase_amend, rebase_orig_head))
> > -		split_in_progress = !!strcmp(head, rebase_amend);
> > -	else if (strcmp(orig_head, rebase_orig_head))
> > +		split_in_progress = !!strcmp(oid_to_hex(&head_oid), rebase_amend);
> > +	else if (strcmp(oid_to_hex(&orig_head_oid), rebase_orig_head))
> 
> I did a double take at these strcmp(oid_to_hex(...)) calls, but I think
> that they are the best that we can do given that we're still reading the
> contents of "rebase-merge/amend" and "rebase-merge/orig-head" directly.
> 
> I suppose we could go the other way and turn their contents into
> object_ids and then use oidcmp(), but it doesn't seem worth it IMHO.
> 
> Thanks,
> Taylor

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 2/4] refs: propagate errno when reading special refs fails
From: Patrick Steinhardt @ 2023-11-30  7:43 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, hanwenn
In-Reply-To: <ZWeyUU/NqmGUvyOL@nand.local>

[-- Attachment #1: Type: text/plain, Size: 1518 bytes --]

On Wed, Nov 29, 2023 at 04:51:13PM -0500, Taylor Blau wrote:
> On Wed, Nov 29, 2023 at 09:14:16AM +0100, Patrick Steinhardt wrote:
> > diff --git a/refs.c b/refs.c
> > index fcae5dddc6..7d4a057f36 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -1806,8 +1806,12 @@ static int refs_read_special_head(struct ref_store *ref_store,
> >  	int result = -1;
> >  	strbuf_addf(&full_path, "%s/%s", ref_store->gitdir, refname);
> >
> > -	if (strbuf_read_file(&content, full_path.buf, 0) < 0)
> > +	errno = 0;
> 
> Do we need to set errno to 0 here? Looking at the implementation of
> strbuf_read_file(), it looks like we return early in two cases. Either
> open() fails, in which errno is set for us, or strbuf_read() fails, in
> which case we set errno to whatever it was right after the failed read
> (preventing the subsequent close() call from tainting the value of errno).
> 
> So I think in either case, we have the right value in errno, and don't
> need to worry about setting it to "0" ahead of time.

True. I'll drop this when rerolling.

> > +test_expect_success '--exists with existing special ref' '
> > +	git rev-parse HEAD >.git/FETCH_HEAD &&
> > +	git show-ref --exists FETCH_HEAD
> > +'
> 
> I don't think that it matters here, but do we need to worry about
> cleaning up .git/FETCH_HEAD for future tests?

There's so many tests where I wish that we did a better job of cleanup,
so I agree that it is sensible to clean up after ourselves. Will drop
when rerolling.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 3/4] refs: complete list of special refs
From: Patrick Steinhardt @ 2023-11-30  7:44 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, hanwenn
In-Reply-To: <ZWe0RzOoHI9QZMox@nand.local>

[-- Attachment #1: Type: text/plain, Size: 5442 bytes --]

On Wed, Nov 29, 2023 at 04:59:35PM -0500, Taylor Blau wrote:
> On Wed, Nov 29, 2023 at 09:14:20AM +0100, Patrick Steinhardt wrote:
> > We have some references that are more special than others. The reason
> > for them being special is that they either do not follow the usual
> > format of references, or that they are written to the filesystem
> > directly by the respective owning subsystem and thus circumvent the
> > reference backend.
> >
> > This works perfectly fine right now because the reffiles backend will
> > know how to read those refs just fine. But with the prospect of gaining
> > a new reference backend implementation we need to be a lot more careful
> > here:
> >
> >   - We need to make sure that we are consistent about how those refs are
> >     written. They must either always be written via the filesystem, or
> >     they must always be written via the reference backend. Any mixture
> >     will lead to inconsistent state.
> >
> >   - We need to make sure that such special refs are always handled
> >     specially when reading them.
> >
> > We're already mostly good with regard to the first item, except for
> > `BISECT_EXPECTED_REV` which will be addressed in a subsequent commit.
> > But the current list of special refs is missing a lot of refs that
> > really should be treated specially. Right now, we only treat
> > `FETCH_HEAD` and `MERGE_HEAD` specially here.
> >
> > Introduce a new function `is_special_ref()` that contains all current
> > instances of special refs to fix the reading path.
> >
> > Based-on-patch-by: Han-Wen Nienhuys <hanwenn@gmail.com>
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  refs.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 56 insertions(+), 2 deletions(-)
> >
> > diff --git a/refs.c b/refs.c
> > index 7d4a057f36..2d39d3fe80 100644
> > --- a/refs.c
> > +++ b/refs.c
> > @@ -1822,15 +1822,69 @@ static int refs_read_special_head(struct ref_store *ref_store,
> >  	return result;
> >  }
> >
> > +static int is_special_ref(const char *refname)
> > +{
> > +	/*
> > +	 * Special references get written and read directly via the filesystem
> > +	 * by the subsystems that create them. Thus, they must not go through
> > +	 * the reference backend but must instead be read directly. It is
> > +	 * arguable whether this behaviour is sensible, or whether it's simply
> > +	 * a leaky abstraction enabled by us only having a single reference
> > +	 * backend implementation. But at least for a subset of references it
> > +	 * indeed does make sense to treat them specially:
> > +	 *
> > +	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
> > +	 *   carries additional metadata like where it came from.
> > +	 *
> > +	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
> > +	 *   heads.
> > +	 *
> > +	 * - "rebase-apply/" and "rebase-merge/" contain all of the state for
> > +	 *   rebases, where keeping it closely together feels sensible.
> > +	 *
> > +	 * There are some exceptions that you might expect to see on this list
> > +	 * but which are handled exclusively via the reference backend:
> > +	 *
> > +	 * - CHERRY_PICK_HEAD
> > +	 * - HEAD
> > +	 * - ORIG_HEAD
> > +	 *
> > +	 * Writing or deleting references must consistently go either through
> > +	 * the filesystem (special refs) or through the reference backend
> > +	 * (normal ones).
> > +	 */
> > +	const char * const special_refs[] = {
> > +		"AUTO_MERGE",
> > +		"BISECT_EXPECTED_REV",
> > +		"FETCH_HEAD",
> > +		"MERGE_AUTOSTASH",
> > +		"MERGE_HEAD",
> > +	};
> 
> Is there a reason that we don't want to declare this statically? If we
> did, I think we could drop one const, since the strings would instead
> reside in the .rodata section.

Not really, no.

> > +	int i;
> 
> Not that it matters for this case, but it may be worth declaring i to be
> an unsigned type, since it's used as an index into an array. size_t
> seems like an appropriate choice there.

Hm. We do use `int` almost everywhere when iterating through an array
via `ARRAY_SIZE`, but ultimately I don't mind whether it's `int`,
`unsigned` or `size_t`.

> > +	for (i = 0; i < ARRAY_SIZE(special_refs); i++)
> > +		if (!strcmp(refname, special_refs[i]))
> > +			return 1;
> > +
> > +	/*
> > +	 * git-rebase(1) stores its state in `rebase-apply/` or
> > +	 * `rebase-merge/`, including various reference-like bits.
> > +	 */
> > +	if (starts_with(refname, "rebase-apply/") ||
> > +	    starts_with(refname, "rebase-merge/"))
> 
> Do we care about case sensitivity here? Definitely not on case-sensitive
> filesystems, but I'm not sure about case-insensitive ones. For instance,
> on macOS, I can do:
> 
>     $ git rev-parse hEAd
> 
> and get the same value as "git rev-parse HEAD" (on my Linux workstation,
> this fails as expected).
> 
> I doubt that there are many users in the wild asking to resolve
> reBASe-APPLY/xyz, but I think that after this patch that would no longer
> work as-is, so we may want to replace this with istarts_with() instead.

In practice I'd argue that nobody is ever going to ask for something in
`rebase-apply/` outside of Git internals or scripts, and I'd expect
these to always use proper casing. So I rather lean towards a "no, we
don't care about case sensitivity".

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox