git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/12] The merge-base logic vs missing commit objects
@ 2024-02-13  8:41 Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 01/12] paint_down_to_common: plug a memory leak Johannes Schindelin via GitGitGadget
                   ` (13 more replies)
  0 siblings, 14 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin

This patch series is in the same spirit as what I proposed in
https://lore.kernel.org/git/pull.1651.git.1707212981.gitgitgadget@gmail.com/,
where I tackled missing tree objects. This here patch series tackles missing
commit objects instead: Git's merge-base logic handles these missing commit
objects as if there had not been any commit object at all, i.e. if two
commits' merge base commit is missing, they will be treated as if they had
no common commit history at all, which is a bug. Those commit objects should
not be missing, of course, i.e. this is only a problem in corrupt
repositories.

This patch series is a bit more tricky than the "missing tree objects" one,
though: The function signature of quite a few functions need to be changed
to allow callers to discern between "missing commit object" vs "no
merge-base found".

And of course it gets even more tricky because in shallow and partial clones
we expect commit objects to be missing, and that must not be treated like an
error but the existing logic is actually desirable in those scenarios.

I am deeply sorry both about the length of this patch series as well as
having to send it in the -rc phase.

Johannes Schindelin (12):
  paint_down_to_common: plug a memory leak
  Let `repo_in_merge_bases()` report missing commits
  Prepare `repo_in_merge_bases_many()` to optionally expect missing
    commits
  Prepare `paint_down_to_common()` for handling shallow commits
  commit-reach: start reporting errors in `paint_down_to_common()`
  merge_bases_many(): pass on errors from `paint_down_to_common()`
  get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases(): pass on errors from `merge_bases_many()`
  get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases_many_dirty(): pass on errors from
    `merge_bases_many()`
  paint_down_to_common(): special-case shallow/partial clones

 bisect.c                         |   7 +-
 builtin/branch.c                 |  12 +-
 builtin/fast-import.c            |   6 +-
 builtin/fetch.c                  |   2 +
 builtin/log.c                    |  30 +++--
 builtin/merge-base.c             |  18 ++-
 builtin/merge-tree.c             |   5 +-
 builtin/merge.c                  |  26 ++--
 builtin/pull.c                   |   9 +-
 builtin/rebase.c                 |   8 +-
 builtin/receive-pack.c           |   6 +-
 builtin/rev-parse.c              |   5 +-
 commit-reach.c                   | 202 +++++++++++++++++++------------
 commit-reach.h                   |  26 ++--
 commit.c                         |   7 +-
 diff-lib.c                       |   5 +-
 http-push.c                      |   5 +-
 log-tree.c                       |   5 +-
 merge-ort.c                      |  81 +++++++++++--
 merge-recursive.c                |  52 ++++++--
 notes-merge.c                    |   3 +-
 object-name.c                    |   5 +-
 remote.c                         |   2 +-
 revision.c                       |  10 +-
 sequencer.c                      |   8 +-
 shallow.c                        |  21 ++--
 submodule.c                      |   7 +-
 t/helper/test-reach.c            |  11 +-
 t/t4301-merge-tree-write-tree.sh |  12 ++
 29 files changed, 413 insertions(+), 183 deletions(-)


base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1657%2Fdscho%2Fmerge-base-and-missing-objects-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1657/dscho/merge-base-and-missing-objects-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1657
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH 01/12] paint_down_to_common: plug a memory leak
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 02/12] Let `repo_in_merge_bases()` report missing commits Johannes Schindelin via GitGitGadget
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When a commit is missing, we return early (currently pretending that no
merge basis could be found in that case). At that stage, it is possible
that a merge base could have been found already, and added to the
`result`, which is now leaked.

Let's release it, to address that memory leak.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/commit-reach.c b/commit-reach.c
index a868a575ea1..b2b102926c9 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -105,8 +105,10 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 			parents = parents->next;
 			if ((p->object.flags & flags) == flags)
 				continue;
-			if (repo_parse_commit(r, p))
+			if (repo_parse_commit(r, p)) {
+				free_commit_list(result);
 				return NULL;
+			}
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
 		}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 02/12] Let `repo_in_merge_bases()` report missing commits
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 01/12] paint_down_to_common: plug a memory leak Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-15  9:33   ` Patrick Steinhardt
  2024-02-13  8:41 ` [PATCH 03/12] Prepare `repo_in_merge_bases_many()` to optionally expect " Johannes Schindelin via GitGitGadget
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Some functions in Git's source code follow the convention that returning
a negative value indicates a fatal error, e.g. repository corruption.

Let's use this convention in `repo_in_merge_bases()` to report when one
of the specified commits is missing (i.e. when `repo_parse_commit()`
reports an error).

Also adjust the callers of `repo_in_merge_bases()` to handle such
negative return values.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/branch.c       | 12 +++++--
 builtin/fast-import.c  |  6 +++-
 builtin/fetch.c        |  2 ++
 builtin/log.c          |  7 ++--
 builtin/merge-base.c   |  6 +++-
 builtin/pull.c         |  4 +++
 builtin/receive-pack.c |  6 +++-
 commit-reach.c         | 12 ++++---
 http-push.c            |  5 ++-
 merge-ort.c            | 75 ++++++++++++++++++++++++++++++++++++------
 merge-recursive.c      | 48 ++++++++++++++++++++++-----
 shallow.c              | 18 ++++++----
 12 files changed, 163 insertions(+), 38 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index e7ee9bd0f15..7f9e79237f3 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -161,6 +161,8 @@ static int branch_merged(int kind, const char *name,
 
 	merged = reference_rev ? repo_in_merge_bases(the_repository, rev,
 						     reference_rev) : 0;
+	if (merged < 0)
+		exit(128);
 
 	/*
 	 * After the safety valve is fully redefined to "check with
@@ -169,9 +171,13 @@ static int branch_merged(int kind, const char *name,
 	 * any of the following code, but during the transition period,
 	 * a gentle reminder is in order.
 	 */
-	if ((head_rev != reference_rev) &&
-	    (head_rev ? repo_in_merge_bases(the_repository, rev, head_rev) : 0) != merged) {
-		if (merged)
+	if (head_rev != reference_rev) {
+		int expect = head_rev ? repo_in_merge_bases(the_repository, rev, head_rev) : 0;
+		if (expect < 0)
+			exit(128);
+		if (expect == merged)
+			; /* okay */
+		else if (merged)
 			warning(_("deleting branch '%s' that has been merged to\n"
 				"         '%s', but not yet merged to HEAD"),
 				name, reference_name);
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 444f41cf8ca..14c2efa88fc 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1625,6 +1625,7 @@ static int update_branch(struct branch *b)
 		oidclr(&old_oid);
 	if (!force_update && !is_null_oid(&old_oid)) {
 		struct commit *old_cmit, *new_cmit;
+		int ret;
 
 		old_cmit = lookup_commit_reference_gently(the_repository,
 							  &old_oid, 0);
@@ -1633,7 +1634,10 @@ static int update_branch(struct branch *b)
 		if (!old_cmit || !new_cmit)
 			return error("Branch %s is missing commits.", b->name);
 
-		if (!repo_in_merge_bases(the_repository, old_cmit, new_cmit)) {
+		ret = repo_in_merge_bases(the_repository, old_cmit, new_cmit);
+		if (ret < 0)
+			exit(128);
+		if (!ret) {
 			warning("Not updating %s"
 				" (new tip %s does not contain %s)",
 				b->name, oid_to_hex(&b->oid),
diff --git a/builtin/fetch.c b/builtin/fetch.c
index fd134ba74d9..0584a1f8b64 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -978,6 +978,8 @@ static int update_local_ref(struct ref *ref,
 		uint64_t t_before = getnanotime();
 		fast_forward = repo_in_merge_bases(the_repository, current,
 						   updated);
+		if (fast_forward < 0)
+			exit(128);
 		forced_updates_ms += (getnanotime() - t_before) / 1000000;
 	} else {
 		fast_forward = 1;
diff --git a/builtin/log.c b/builtin/log.c
index ba775d7b5cf..1705da71aca 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1623,7 +1623,7 @@ static struct commit *get_base_commit(const char *base_commit,
 {
 	struct commit *base = NULL;
 	struct commit **rev;
-	int i = 0, rev_nr = 0, auto_select, die_on_failure;
+	int i = 0, rev_nr = 0, auto_select, die_on_failure, ret;
 
 	switch (auto_base) {
 	case AUTO_BASE_NEVER:
@@ -1723,7 +1723,10 @@ static struct commit *get_base_commit(const char *base_commit,
 		rev_nr = DIV_ROUND_UP(rev_nr, 2);
 	}
 
-	if (!repo_in_merge_bases(the_repository, base, rev[0])) {
+	ret = repo_in_merge_bases(the_repository, base, rev[0]);
+	if (ret < 0)
+		exit(128);
+	if (!ret) {
 		if (die_on_failure) {
 			die(_("base commit should be the ancestor of revision list"));
 		} else {
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index e68b7fe45d7..0308fd73289 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -103,12 +103,16 @@ static int handle_octopus(int count, const char **args, int show_all)
 static int handle_is_ancestor(int argc, const char **argv)
 {
 	struct commit *one, *two;
+	int ret;
 
 	if (argc != 2)
 		die("--is-ancestor takes exactly two commits");
 	one = get_commit_reference(argv[0]);
 	two = get_commit_reference(argv[1]);
-	if (repo_in_merge_bases(the_repository, one, two))
+	ret = repo_in_merge_bases(the_repository, one, two);
+	if (ret < 0)
+		exit(128);
+	if (ret)
 		return 0;
 	else
 		return 1;
diff --git a/builtin/pull.c b/builtin/pull.c
index be2b2c9ebc9..e6f2942c0c5 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -931,6 +931,8 @@ static int get_can_ff(struct object_id *orig_head,
 	merge_head = lookup_commit_reference(the_repository, orig_merge_head);
 	ret = repo_is_descendant_of(the_repository, merge_head, list);
 	free_commit_list(list);
+	if (ret < 0)
+		exit(128);
 	return ret;
 }
 
@@ -955,6 +957,8 @@ static int already_up_to_date(struct object_id *orig_head,
 		commit_list_insert(theirs, &list);
 		ok = repo_is_descendant_of(the_repository, ours, list);
 		free_commit_list(list);
+		if (ok < 0)
+			exit(128);
 		if (!ok)
 			return 0;
 	}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8c4f0cb90a9..956fea6293e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1546,6 +1546,7 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 	    starts_with(name, "refs/heads/")) {
 		struct object *old_object, *new_object;
 		struct commit *old_commit, *new_commit;
+		int ret2;
 
 		old_object = parse_object(the_repository, old_oid);
 		new_object = parse_object(the_repository, new_oid);
@@ -1559,7 +1560,10 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 		}
 		old_commit = (struct commit *)old_object;
 		new_commit = (struct commit *)new_object;
-		if (!repo_in_merge_bases(the_repository, old_commit, new_commit)) {
+		ret2 = repo_in_merge_bases(the_repository, old_commit, new_commit);
+		if (ret2 < 0)
+			exit(128);
+		if (!ret2) {
 			rp_error("denying non-fast-forward %s"
 				 " (you should pull first)", name);
 			ret = "non-fast-forward";
diff --git a/commit-reach.c b/commit-reach.c
index b2b102926c9..dab32eb470d 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -463,11 +463,13 @@ int repo_is_descendant_of(struct repository *r,
 	} else {
 		while (with_commit) {
 			struct commit *other;
+			int ret;
 
 			other = with_commit->item;
 			with_commit = with_commit->next;
-			if (repo_in_merge_bases_many(r, other, 1, &commit))
-				return 1;
+			ret = repo_in_merge_bases_many(r, other, 1, &commit);
+			if (ret)
+				return ret;
 		}
 		return 0;
 	}
@@ -484,10 +486,10 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
 	if (repo_parse_commit(r, commit))
-		return ret;
+		return -1;
 	for (i = 0; i < nr_reference; i++) {
 		if (repo_parse_commit(r, reference[i]))
-			return ret;
+			return -1;
 
 		generation = commit_graph_generation(reference[i]);
 		if (generation > max_generation)
@@ -596,6 +598,8 @@ int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid)
 	commit_list_insert(old_commit, &old_commit_list);
 	ret = repo_is_descendant_of(the_repository,
 				    new_commit, old_commit_list);
+	if (ret < 0)
+		exit(128);
 	free_commit_list(old_commit_list);
 	return ret;
 }
diff --git a/http-push.c b/http-push.c
index a704f490fdb..85fa2f457d4 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1576,8 +1576,11 @@ static int verify_merge_base(struct object_id *head_oid, struct ref *remote)
 	struct commit *head = lookup_commit_or_die(head_oid, "HEAD");
 	struct commit *branch = lookup_commit_or_die(&remote->old_oid,
 						     remote->name);
+	int i = repo_in_merge_bases(the_repository, branch, head);
 
-	return repo_in_merge_bases(the_repository, branch, head);
+	if (i < 0)
+		exit(128);
+	return i;
 }
 
 static int delete_remote_branch(const char *pattern, int force)
diff --git a/merge-ort.c b/merge-ort.c
index 6491070d965..64e76afe89f 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -544,6 +544,7 @@ enum conflict_and_info_types {
 	CONFLICT_SUBMODULE_HISTORY_NOT_AVAILABLE,
 	CONFLICT_SUBMODULE_MAY_HAVE_REWINDS,
 	CONFLICT_SUBMODULE_NULL_MERGE_BASE,
+	CONFLICT_SUBMODULE_CORRUPT,
 
 	/* Keep this entry _last_ in the list */
 	NB_CONFLICT_TYPES,
@@ -596,7 +597,9 @@ static const char *type_short_descriptions[] = {
 	[CONFLICT_SUBMODULE_MAY_HAVE_REWINDS] =
 		"CONFLICT (submodule may have rewinds)",
 	[CONFLICT_SUBMODULE_NULL_MERGE_BASE] =
-		"CONFLICT (submodule lacks merge base)"
+		"CONFLICT (submodule lacks merge base)",
+	[CONFLICT_SUBMODULE_CORRUPT] =
+		"CONFLICT (submodule corrupt)"
 };
 
 struct logical_conflict_info {
@@ -1710,7 +1713,11 @@ static int find_first_merges(struct repository *repo,
 		die("revision walk setup failed");
 	while ((commit = get_revision(&revs)) != NULL) {
 		struct object *o = &(commit->object);
-		if (repo_in_merge_bases(repo, b, commit))
+		int ret = repo_in_merge_bases(repo, b, commit);
+
+		if (ret < 0)
+			return ret;
+		if (ret > 0)
 			add_object_array(o, NULL, &merges);
 	}
 	reset_revision_walk();
@@ -1725,9 +1732,14 @@ static int find_first_merges(struct repository *repo,
 		contains_another = 0;
 		for (j = 0; j < merges.nr; j++) {
 			struct commit *m2 = (struct commit *) merges.objects[j].item;
-			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
-				contains_another = 1;
-				break;
+			if (i != j) {
+				int ret = repo_in_merge_bases(repo, m2, m1);
+				if (ret < 0)
+					return ret;
+				if (ret > 0) {
+					contains_another = 1;
+					break;
+				}
 			}
 		}
 
@@ -1749,7 +1761,7 @@ static int merge_submodule(struct merge_options *opt,
 {
 	struct repository subrepo;
 	struct strbuf sb = STRBUF_INIT;
-	int ret = 0;
+	int ret = 0, ret2;
 	struct commit *commit_o, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
@@ -1796,8 +1808,26 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* check whether both changes are forward */
-	if (!repo_in_merge_bases(&subrepo, commit_o, commit_a) ||
-	    !repo_in_merge_bases(&subrepo, commit_o, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_a);
+	if (ret2 < 1) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (!ret2)
+		ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_b);
+	if (ret2 < 1) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (!ret2) {
 		path_msg(opt, CONFLICT_SUBMODULE_MAY_HAVE_REWINDS, 0,
 			 path, NULL, NULL, NULL,
 			 _("Failed to merge submodule %s "
@@ -1807,7 +1837,16 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* Case #1: a is contained in b or vice versa */
-	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0) {
 		oidcpy(result, b);
 		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
 			 path, NULL, NULL, NULL,
@@ -1816,7 +1855,16 @@ static int merge_submodule(struct merge_options *opt,
 		ret = 1;
 		goto cleanup;
 	}
-	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0) {
 		oidcpy(result, a);
 		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
 			 path, NULL, NULL, NULL,
@@ -1841,6 +1889,13 @@ static int merge_submodule(struct merge_options *opt,
 	parent_count = find_first_merges(&subrepo, path, commit_a, commit_b,
 					 &merges);
 	switch (parent_count) {
+	case -1:
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		break;
 	case 0:
 		path_msg(opt, CONFLICT_SUBMODULE_FAILED_TO_MERGE, 0,
 			 path, NULL, NULL, NULL,
diff --git a/merge-recursive.c b/merge-recursive.c
index e3beb0801b1..e3fe7803cbe 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1144,7 +1144,10 @@ static int find_first_merges(struct repository *repo,
 		die("revision walk setup failed");
 	while ((commit = get_revision(&revs)) != NULL) {
 		struct object *o = &(commit->object);
-		if (repo_in_merge_bases(repo, b, commit))
+		int ret = repo_in_merge_bases(repo, b, commit);
+		if (ret < 0)
+			return ret;
+		if (ret)
 			add_object_array(o, NULL, &merges);
 	}
 	reset_revision_walk();
@@ -1159,9 +1162,14 @@ static int find_first_merges(struct repository *repo,
 		contains_another = 0;
 		for (j = 0; j < merges.nr; j++) {
 			struct commit *m2 = (struct commit *) merges.objects[j].item;
-			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
-				contains_another = 1;
-				break;
+			if (i != j) {
+				int ret = repo_in_merge_bases(repo, m2, m1);
+				if (ret < 0)
+					return ret;
+				if (ret > 0) {
+					contains_another = 1;
+					break;
+				}
 			}
 		}
 
@@ -1197,7 +1205,7 @@ static int merge_submodule(struct merge_options *opt,
 			   const struct object_id *b)
 {
 	struct repository subrepo;
-	int ret = 0;
+	int ret = 0, ret2;
 	struct commit *commit_base, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
@@ -1234,14 +1242,29 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* check whether both changes are forward */
-	if (!repo_in_merge_bases(&subrepo, commit_base, commit_a) ||
-	    !repo_in_merge_bases(&subrepo, commit_base, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_a);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2)
+		ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_b);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (!ret2) {
 		output(opt, 1, _("Failed to merge submodule %s (commits don't follow merge-base)"), path);
 		goto cleanup;
 	}
 
 	/* Case #1: a is contained in b or vice versa */
-	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2) {
 		oidcpy(result, b);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
@@ -1254,7 +1277,12 @@ static int merge_submodule(struct merge_options *opt,
 		ret = 1;
 		goto cleanup;
 	}
-	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2) {
 		oidcpy(result, a);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
@@ -1402,6 +1430,8 @@ static int merge_mode_and_contents(struct merge_options *opt,
 							&o->oid,
 							&a->oid,
 							&b->oid);
+			if (result->clean < 0)
+				return -1;
 		} else if (S_ISLNK(a->mode)) {
 			switch (opt->recursive_variant) {
 			case MERGE_VARIANT_NORMAL:
diff --git a/shallow.c b/shallow.c
index ac728cdd778..cf4b95114b7 100644
--- a/shallow.c
+++ b/shallow.c
@@ -795,12 +795,16 @@ static void post_assign_shallow(struct shallow_info *info,
 		if (!*bitmap)
 			continue;
 		for (j = 0; j < bitmap_nr; j++)
-			if (bitmap[0][j] &&
-			    /* Step 7, reachability test at commit level */
-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits)) {
-				update_refstatus(ref_status, info->ref->nr, *bitmap);
-				dst++;
-				break;
+			if (bitmap[0][j]) {
+				/* Step 7, reachability test at commit level */
+				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits);
+				if (ret < 0)
+					exit(128);
+				if (!ret) {
+					update_refstatus(ref_status, info->ref->nr, *bitmap);
+					dst++;
+					break;
+				}
 			}
 	}
 	info->nr_ours = dst;
@@ -829,6 +833,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 							    commit,
 							    si->nr_commits,
 							    si->commits);
+		if (si->reachable[c] < 0)
+			exit(128);
 		si->need_reachability_test[c] = 0;
 	}
 	return si->reachable[c];
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 03/12] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 01/12] paint_down_to_common: plug a memory leak Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 02/12] Let `repo_in_merge_bases()` report missing commits Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 04/12] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Currently this function treats unrelated commit histories the same way
as commit histories with missing commit objects.

Typically, missing commit objects constitute a corrupt repository,
though, and should be reported as such. The next commits will make it
so, but there is one exception: In `git fetch --update-shallow` we
_expect_ commit objects to be missing, and we do want to treat the
now-incomplete commit histories as unrelated.

To allow for that, let's introduce an additional parameter that is
passed to `repo_in_merge_bases_many()` to trigger this behavior, and use
it in the two callers in `shallow.c`.

This does not change behavior in this commit, but prepares in an
easy-to-review way for the upcoming changes that will make the merge
base logic more stringent with regards to missing commit objects.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c        | 9 +++++----
 commit-reach.h        | 3 ++-
 remote.c              | 2 +-
 shallow.c             | 5 +++--
 t/helper/test-reach.c | 2 +-
 5 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index dab32eb470d..248a0f2b39d 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -467,7 +467,7 @@ int repo_is_descendant_of(struct repository *r,
 
 			other = with_commit->item;
 			with_commit = with_commit->next;
-			ret = repo_in_merge_bases_many(r, other, 1, &commit);
+			ret = repo_in_merge_bases_many(r, other, 1, &commit, 0);
 			if (ret)
 				return ret;
 		}
@@ -479,17 +479,18 @@ int repo_is_descendant_of(struct repository *r,
  * Is "commit" an ancestor of one of the "references"?
  */
 int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
-			     int nr_reference, struct commit **reference)
+			     int nr_reference, struct commit **reference,
+			     int ignore_missing_commits)
 {
 	struct commit_list *bases;
 	int ret = 0, i;
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
 	if (repo_parse_commit(r, commit))
-		return -1;
+		return ignore_missing_commits ? 0 : -1;
 	for (i = 0; i < nr_reference; i++) {
 		if (repo_parse_commit(r, reference[i]))
-			return -1;
+			return ignore_missing_commits ? 0 : -1;
 
 		generation = commit_graph_generation(reference[i]);
 		if (generation > max_generation)
diff --git a/commit-reach.h b/commit-reach.h
index 35c4da49481..68f81549a44 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -30,7 +30,8 @@ int repo_in_merge_bases(struct repository *r,
 			struct commit *reference);
 int repo_in_merge_bases_many(struct repository *r,
 			     struct commit *commit,
-			     int nr_reference, struct commit **reference);
+			     int nr_reference, struct commit **reference,
+			     int ignore_missing_commits);
 
 /*
  * Takes a list of commits and returns a new list where those
diff --git a/remote.c b/remote.c
index abb24822beb..763c80f4a7d 100644
--- a/remote.c
+++ b/remote.c
@@ -2675,7 +2675,7 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
 		if (MERGE_BASES_BATCH_SIZE < size)
 			size = MERGE_BASES_BATCH_SIZE;
 
-		if ((ret = repo_in_merge_bases_many(the_repository, commit, size, chunk)))
+		if ((ret = repo_in_merge_bases_many(the_repository, commit, size, chunk, 0)))
 			break;
 	}
 
diff --git a/shallow.c b/shallow.c
index cf4b95114b7..f71496f35c3 100644
--- a/shallow.c
+++ b/shallow.c
@@ -797,7 +797,7 @@ static void post_assign_shallow(struct shallow_info *info,
 		for (j = 0; j < bitmap_nr; j++)
 			if (bitmap[0][j]) {
 				/* Step 7, reachability test at commit level */
-				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits);
+				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1);
 				if (ret < 0)
 					exit(128);
 				if (!ret) {
@@ -832,7 +832,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 		si->reachable[c] = repo_in_merge_bases_many(the_repository,
 							    commit,
 							    si->nr_commits,
-							    si->commits);
+							    si->commits,
+							    1);
 		if (si->reachable[c] < 0)
 			exit(128);
 		si->need_reachability_test[c] = 0;
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 3e173399a00..aa816e168ea 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -113,7 +113,7 @@ int cmd__reach(int ac, const char **av)
 		       repo_in_merge_bases(the_repository, A, B));
 	else if (!strcmp(av[1], "in_merge_bases_many"))
 		printf("%s(A,X):%d\n", av[1],
-		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array));
+		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array, 0));
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 04/12] Prepare `paint_down_to_common()` for handling shallow commits
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (2 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 03/12] Prepare `repo_in_merge_bases_many()` to optionally expect " Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 05/12] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When `git fetch --update-shallow` needs to test for commit ancestry, it
can naturally run into a missing object (e.g. if it is a parent of a
shallow commit). To accommodate, the merge base logic will need to be
able to handle this situation gracefully.

Currently, that logic pretends that a missing parent commit is
equivalent to a a missing parent commit, and for the purpose of
`--update-shallow` that is exactly what we need it to do.

Therefore, let's introduce a flag to indicate when that is precisely the
logic we want.

We need a flag, and cannot rely on `is_repository_shallow()` to indicate
that situation, because that function would return 0: There may not
actually be a `shallow` file, as demonstrated e.g. by t5537.10 ("add new
shallow root with receive.updateshallow on") and t5538.4 ("add new
shallow root with receive.updateshallow on").

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 248a0f2b39d..1d1d8c989de 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -53,7 +53,8 @@ static int queue_has_nonstale(struct prio_queue *queue)
 static struct commit_list *paint_down_to_common(struct repository *r,
 						struct commit *one, int n,
 						struct commit **twos,
-						timestamp_t min_generation)
+						timestamp_t min_generation,
+						int ignore_missing_commits)
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *result = NULL;
@@ -106,6 +107,13 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 			if ((p->object.flags & flags) == flags)
 				continue;
 			if (repo_parse_commit(r, p)) {
+				/*
+				 * At this stage, we know that the commit is
+				 * missing: `repo_parse_commit()` uses
+				 * `OBJECT_INFO_DIE_IF_CORRUPT` and therefore
+				 * corrupt commits would already have been
+				 * dispatched with a `die()`.
+				 */
 				free_commit_list(result);
 				return NULL;
 			}
@@ -142,7 +150,7 @@ static struct commit_list *merge_bases_many(struct repository *r,
 			return NULL;
 	}
 
-	list = paint_down_to_common(r, one, n, twos, 0);
+	list = paint_down_to_common(r, one, n, twos, 0, 0);
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -213,7 +221,7 @@ static int remove_redundant_no_gen(struct repository *r,
 				min_generation = curr_generation;
 		}
 		common = paint_down_to_common(r, array[i], filled,
-					      work, min_generation);
+					      work, min_generation, 0);
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -503,7 +511,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 
 	bases = paint_down_to_common(r, commit,
 				     nr_reference, reference,
-				     generation);
+				     generation, ignore_missing_commits);
 	if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 05/12] commit-reach: start reporting errors in `paint_down_to_common()`
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (3 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 04/12] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 06/12] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

If a commit cannot be parsed, it is currently ignored when looking for
merge bases. That's undesirable as the operation can pretend success in
a corrupt repository, even though the command should fail with an error
message.

Let's start at the bottom of the stack by teaching the
`paint_down_to_common()` function to return an `int`: if negative, it
indicates fatal error, if 0 success.

This requires a couple of callers to be adjusted accordingly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 56 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 1d1d8c989de..dafe117036b 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -50,14 +50,14 @@ static int queue_has_nonstale(struct prio_queue *queue)
 }
 
 /* all input commits in one and twos[] must have been parsed! */
-static struct commit_list *paint_down_to_common(struct repository *r,
-						struct commit *one, int n,
-						struct commit **twos,
-						timestamp_t min_generation,
-						int ignore_missing_commits)
+static int paint_down_to_common(struct repository *r,
+				struct commit *one, int n,
+				struct commit **twos,
+				timestamp_t min_generation,
+				int ignore_missing_commits,
+				struct commit_list **result)
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
-	struct commit_list *result = NULL;
 	int i;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
 
@@ -66,8 +66,8 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 
 	one->object.flags |= PARENT1;
 	if (!n) {
-		commit_list_append(one, &result);
-		return result;
+		commit_list_append(one, result);
+		return 0;
 	}
 	prio_queue_put(&queue, one);
 
@@ -95,7 +95,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 		if (flags == (PARENT1 | PARENT2)) {
 			if (!(commit->object.flags & RESULT)) {
 				commit->object.flags |= RESULT;
-				commit_list_insert_by_date(commit, &result);
+				commit_list_insert_by_date(commit, result);
 			}
 			/* Mark parents of a found merge stale */
 			flags |= STALE;
@@ -114,8 +114,11 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 				 * corrupt commits would already have been
 				 * dispatched with a `die()`.
 				 */
-				free_commit_list(result);
-				return NULL;
+				free_commit_list(*result);
+				if (ignore_missing_commits)
+					return 0;
+				return error(_("could not parse commit %s"),
+					     oid_to_hex(&p->object.oid));
 			}
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
@@ -123,7 +126,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 	}
 
 	clear_prio_queue(&queue);
-	return result;
+	return 0;
 }
 
 static struct commit_list *merge_bases_many(struct repository *r,
@@ -150,7 +153,8 @@ static struct commit_list *merge_bases_many(struct repository *r,
 			return NULL;
 	}
 
-	list = paint_down_to_common(r, one, n, twos, 0, 0);
+	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0)
+		return NULL;
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -204,7 +208,7 @@ static int remove_redundant_no_gen(struct repository *r,
 	for (i = 0; i < cnt; i++)
 		repo_parse_commit(r, array[i]);
 	for (i = 0; i < cnt; i++) {
-		struct commit_list *common;
+		struct commit_list *common = NULL;
 		timestamp_t min_generation = commit_graph_generation(array[i]);
 
 		if (redundant[i])
@@ -220,8 +224,9 @@ static int remove_redundant_no_gen(struct repository *r,
 			if (curr_generation < min_generation)
 				min_generation = curr_generation;
 		}
-		common = paint_down_to_common(r, array[i], filled,
-					      work, min_generation, 0);
+		if (paint_down_to_common(r, array[i], filled,
+					 work, min_generation, 0, &common))
+			return -1;
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -421,6 +426,10 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	clear_commit_marks_many(n, twos, all_flags);
 
 	cnt = remove_redundant(r, rslt, cnt);
+	if (cnt < 0) {
+		free(rslt);
+		return NULL;
+	}
 	result = NULL;
 	for (i = 0; i < cnt; i++)
 		commit_list_insert_by_date(rslt[i], &result);
@@ -490,7 +499,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 			     int nr_reference, struct commit **reference,
 			     int ignore_missing_commits)
 {
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	int ret = 0, i;
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
@@ -509,10 +518,11 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 	if (generation > max_generation)
 		return ret;
 
-	bases = paint_down_to_common(r, commit,
-				     nr_reference, reference,
-				     generation, ignore_missing_commits);
-	if (commit->object.flags & PARENT2)
+	if (paint_down_to_common(r, commit,
+				 nr_reference, reference,
+				 generation, ignore_missing_commits, &bases))
+		ret = -1;
+	else if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
 	clear_commit_marks_many(nr_reference, reference, all_flags);
@@ -565,6 +575,10 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 		}
 	}
 	num_head = remove_redundant(the_repository, array, num_head);
+	if (num_head < 0) {
+		free(array);
+		return NULL;
+	}
 	for (i = 0; i < num_head; i++)
 		tail = &commit_list_insert(array[i], tail)->next;
 	free(array);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 06/12] merge_bases_many(): pass on errors from `paint_down_to_common()`
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (4 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 05/12] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 07/12] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `paint_down_to_common()` function was just taught to indicate
parsing errors, and now the `merge_bases_many()` function is aware of
that, too.

One tricky aspect is that `merge_bases_many()` parses commits of its
own, but wants to gracefully handle the scenario where NULL is passed as
a merge head, returning the empty list of merge bases. The way this was
handled involved calling `repo_parse_commit(NULL)` and relying on it to
return an error. This has to be done differently now so that we can
handle missing commits correctly by producing a fatal error.

Next step: adjust the caller of `merge_bases_many()`:
`get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index dafe117036b..c9969da8c6c 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -129,39 +129,47 @@ static int paint_down_to_common(struct repository *r,
 	return 0;
 }
 
-static struct commit_list *merge_bases_many(struct repository *r,
-					    struct commit *one, int n,
-					    struct commit **twos)
+static int merge_bases_many(struct repository *r,
+			    struct commit *one, int n,
+			    struct commit **twos,
+			    struct commit_list **result)
 {
 	struct commit_list *list = NULL;
-	struct commit_list *result = NULL;
 	int i;
 
 	for (i = 0; i < n; i++) {
-		if (one == twos[i])
+		if (one == twos[i]) {
 			/*
 			 * We do not mark this even with RESULT so we do not
 			 * have to clean it up.
 			 */
-			return commit_list_insert(one, &result);
+			*result = commit_list_insert(one, result);
+			return 0;
+		}
 	}
 
+	if (!one)
+		return 0;
 	if (repo_parse_commit(r, one))
-		return NULL;
+		return error(_("could not parse commit %s"),
+			     oid_to_hex(&one->object.oid));
 	for (i = 0; i < n; i++) {
+		if (!twos[i])
+			return 0;
 		if (repo_parse_commit(r, twos[i]))
-			return NULL;
+			return error(_("could not parse commit %s"),
+				     oid_to_hex(&twos[i]->object.oid));
 	}
 
 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0)
-		return NULL;
+		return -1;
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
 		if (!(commit->object.flags & STALE))
-			commit_list_insert_by_date(commit, &result);
+			commit_list_insert_by_date(commit, result);
 	}
-	return result;
+	return 0;
 }
 
 struct commit_list *get_octopus_merge_bases(struct commit_list *in)
@@ -399,10 +407,11 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 {
 	struct commit_list *list;
 	struct commit **rslt;
-	struct commit_list *result;
+	struct commit_list *result = NULL;
 	int cnt, i;
 
-	result = merge_bases_many(r, one, n, twos);
+	if (merge_bases_many(r, one, n, twos, &result) < 0)
+		return NULL;
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
 			return result;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 07/12] get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (5 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 06/12] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 08/12] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate
parsing errors, and now the `get_merge_bases_many_0()` function is aware
of that, too.

Next step: adjust the callers of `get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 57 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index c9969da8c6c..359853275a9 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -399,37 +399,38 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt
 	return remove_redundant_no_gen(r, array, cnt);
 }
 
-static struct commit_list *get_merge_bases_many_0(struct repository *r,
-						  struct commit *one,
-						  int n,
-						  struct commit **twos,
-						  int cleanup)
+static int get_merge_bases_many_0(struct repository *r,
+				  struct commit *one,
+				  int n,
+				  struct commit **twos,
+				  int cleanup,
+				  struct commit_list **result)
 {
 	struct commit_list *list;
 	struct commit **rslt;
-	struct commit_list *result = NULL;
 	int cnt, i;
 
-	if (merge_bases_many(r, one, n, twos, &result) < 0)
-		return NULL;
+	if (merge_bases_many(r, one, n, twos, result) < 0)
+		return -1;
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
-			return result;
+			return 0;
 	}
-	if (!result || !result->next) {
+	if (!*result || !(*result)->next) {
 		if (cleanup) {
 			clear_commit_marks(one, all_flags);
 			clear_commit_marks_many(n, twos, all_flags);
 		}
-		return result;
+		return 0;
 	}
 
 	/* There are more than one */
-	cnt = commit_list_count(result);
+	cnt = commit_list_count(*result);
 	CALLOC_ARRAY(rslt, cnt);
-	for (list = result, i = 0; list; list = list->next)
+	for (list = *result, i = 0; list; list = list->next)
 		rslt[i++] = list->item;
-	free_commit_list(result);
+	free_commit_list(*result);
+	*result = NULL;
 
 	clear_commit_marks(one, all_flags);
 	clear_commit_marks_many(n, twos, all_flags);
@@ -437,13 +438,12 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	cnt = remove_redundant(r, rslt, cnt);
 	if (cnt < 0) {
 		free(rslt);
-		return NULL;
+		return -1;
 	}
-	result = NULL;
 	for (i = 0; i < cnt; i++)
-		commit_list_insert_by_date(rslt[i], &result);
+		commit_list_insert_by_date(rslt[i], result);
 	free(rslt);
-	return result;
+	return 0;
 }
 
 struct commit_list *repo_get_merge_bases_many(struct repository *r,
@@ -451,7 +451,12 @@ struct commit_list *repo_get_merge_bases_many(struct repository *r,
 					      int n,
 					      struct commit **twos)
 {
-	return get_merge_bases_many_0(r, one, n, twos, 1);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, n, twos, 1, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
@@ -459,14 +464,24 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    int n,
 						    struct commit **twos)
 {
-	return get_merge_bases_many_0(r, one, n, twos, 0);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, n, twos, 0, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 struct commit_list *repo_get_merge_bases(struct repository *r,
 					 struct commit *one,
 					 struct commit *two)
 {
-	return get_merge_bases_many_0(r, one, 1, &two, 1);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, 1, &two, 1, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 /*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 08/12] repo_get_merge_bases(): pass on errors from `merge_bases_many()`
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (6 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 07/12] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 09/12] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `repo_get_merge_bases()` macro) is aware of that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next step: adjust the callers of `get_octopus_merge_bases()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/log.c                    | 10 +++++-----
 builtin/merge-tree.c             |  5 +++--
 builtin/merge.c                  | 20 ++++++++++++--------
 builtin/rebase.c                 |  8 +++++---
 builtin/rev-parse.c              |  5 +++--
 commit-reach.c                   | 23 +++++++++++------------
 commit-reach.h                   |  7 ++++---
 diff-lib.c                       |  5 +++--
 log-tree.c                       |  5 +++--
 merge-ort.c                      |  6 +++++-
 merge-recursive.c                |  4 +++-
 notes-merge.c                    |  3 ++-
 object-name.c                    |  5 +++--
 revision.c                       | 10 ++++++----
 sequencer.c                      |  8 ++++++--
 submodule.c                      |  7 ++++++-
 t/t4301-merge-tree-write-tree.sh | 12 ++++++++++++
 17 files changed, 92 insertions(+), 51 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 1705da71aca..befafd6ae04 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1702,11 +1702,11 @@ static struct commit *get_base_commit(const char *base_commit,
 	 */
 	while (rev_nr > 1) {
 		for (i = 0; i < rev_nr / 2; i++) {
-			struct commit_list *merge_base;
-			merge_base = repo_get_merge_bases(the_repository,
-							  rev[2 * i],
-							  rev[2 * i + 1]);
-			if (!merge_base || merge_base->next) {
+			struct commit_list *merge_base = NULL;
+			if (repo_get_merge_bases(the_repository,
+						 rev[2 * i],
+						 rev[2 * i + 1], &merge_base) < 0 ||
+			    !merge_base || merge_base->next) {
 				if (die_on_failure) {
 					die(_("failed to find exact merge base"));
 				} else {
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index a35e0452d66..76200250629 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -463,8 +463,9 @@ static int real_merge(struct merge_tree_options *o,
 		 * Get the merge bases, in reverse order; see comment above
 		 * merge_incore_recursive in merge-ort.h
 		 */
-		merge_bases = repo_get_merge_bases(the_repository, parent1,
-						   parent2);
+		if (repo_get_merge_bases(the_repository, parent1,
+					 parent2, &merge_bases) < 0)
+			exit(128);
 		if (!merge_bases && !o->allow_unrelated_histories)
 			die(_("refusing to merge unrelated histories"));
 		merge_bases = reverse_commit_list(merge_bases);
diff --git a/builtin/merge.c b/builtin/merge.c
index d748d46e135..ac9d58adc29 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1517,10 +1517,13 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 
 	if (!remoteheads)
 		; /* already up-to-date */
-	else if (!remoteheads->next)
-		common = repo_get_merge_bases(the_repository, head_commit,
-					      remoteheads->item);
-	else {
+	else if (!remoteheads->next) {
+		if (repo_get_merge_bases(the_repository, head_commit,
+					 remoteheads->item, &common) < 0) {
+			ret = 2;
+			goto done;
+		}
+	} else {
 		struct commit_list *list = remoteheads;
 		commit_list_insert(head_commit, &list);
 		common = get_octopus_merge_bases(list);
@@ -1631,7 +1634,7 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 		struct commit_list *j;
 
 		for (j = remoteheads; j; j = j->next) {
-			struct commit_list *common_one;
+			struct commit_list *common_one = NULL;
 			struct commit *common_item;
 
 			/*
@@ -1639,9 +1642,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 			 * merge_bases again, otherwise "git merge HEAD^
 			 * HEAD^^" would be missed.
 			 */
-			common_one = repo_get_merge_bases(the_repository,
-							  head_commit,
-							  j->item);
+			if (repo_get_merge_bases(the_repository, head_commit,
+						 j->item, &common_one) < 0)
+				exit(128);
+
 			common_item = common_one->item;
 			free_commit_list(common_one);
 			if (!oideq(&common_item->object.oid, &j->item->object.oid)) {
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 043c65dccd9..06a55fc7325 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -879,7 +879,8 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
 	if (!upstream)
 		goto done;
 
-	merge_bases = repo_get_merge_bases(the_repository, upstream, head);
+	if (repo_get_merge_bases(the_repository, upstream, head, &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases || merge_bases->next)
 		goto done;
 
@@ -898,8 +899,9 @@ static void fill_branch_base(struct rebase_options *options,
 {
 	struct commit_list *merge_bases = NULL;
 
-	merge_bases = repo_get_merge_bases(the_repository, options->onto,
-					   options->orig_head);
+	if (repo_get_merge_bases(the_repository, options->onto,
+				 options->orig_head, &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases || merge_bases->next)
 		oidcpy(branch_base, null_oid());
 	else
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index fde8861ca4e..c97d0f6144c 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -297,7 +297,7 @@ static int try_difference(const char *arg)
 		show_rev(NORMAL, &end_oid, end);
 		show_rev(symmetric ? NORMAL : REVERSED, &start_oid, start);
 		if (symmetric) {
-			struct commit_list *exclude;
+			struct commit_list *exclude = NULL;
 			struct commit *a, *b;
 			a = lookup_commit_reference(the_repository, &start_oid);
 			b = lookup_commit_reference(the_repository, &end_oid);
@@ -305,7 +305,8 @@ static int try_difference(const char *arg)
 				*dotdot = '.';
 				return 0;
 			}
-			exclude = repo_get_merge_bases(the_repository, a, b);
+			if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0)
+				exit(128);
 			while (exclude) {
 				struct commit *commit = pop_commit(&exclude);
 				show_rev(REVERSED, &commit->object.oid, NULL);
diff --git a/commit-reach.c b/commit-reach.c
index 359853275a9..3471c933f6c 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -185,9 +185,12 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 		struct commit_list *new_commits = NULL, *end = NULL;
 
 		for (j = ret; j; j = j->next) {
-			struct commit_list *bases;
-			bases = repo_get_merge_bases(the_repository, i->item,
-						     j->item);
+			struct commit_list *bases = NULL;
+			if (repo_get_merge_bases(the_repository, i->item,
+						 j->item, &bases) < 0) {
+				free_commit_list(bases);
+				return NULL;
+			}
 			if (!new_commits)
 				new_commits = bases;
 			else
@@ -472,16 +475,12 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 	return result;
 }
 
-struct commit_list *repo_get_merge_bases(struct repository *r,
-					 struct commit *one,
-					 struct commit *two)
+int repo_get_merge_bases(struct repository *r,
+			 struct commit *one,
+			 struct commit *two,
+			 struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, 1, &two, 1, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, 1, &two, 1, result);
 }
 
 /*
diff --git a/commit-reach.h b/commit-reach.h
index 68f81549a44..2c6fcdd34f6 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -9,9 +9,10 @@ struct ref_filter;
 struct object_id;
 struct object_array;
 
-struct commit_list *repo_get_merge_bases(struct repository *r,
-					 struct commit *rev1,
-					 struct commit *rev2);
+int repo_get_merge_bases(struct repository *r,
+			 struct commit *rev1,
+			 struct commit *rev2,
+			 struct commit_list **result);
 struct commit_list *repo_get_merge_bases_many(struct repository *r,
 					      struct commit *one, int n,
 					      struct commit **twos);
diff --git a/diff-lib.c b/diff-lib.c
index 0e9ec4f68af..498224ccce2 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -565,7 +565,7 @@ void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb)
 {
 	int i;
 	struct commit *mb_child[2] = {0};
-	struct commit_list *merge_bases;
+	struct commit_list *merge_bases = NULL;
 
 	for (i = 0; i < revs->pending.nr; i++) {
 		struct object *obj = revs->pending.objects[i].item;
@@ -592,7 +592,8 @@ void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb)
 		mb_child[1] = lookup_commit_reference(the_repository, &oid);
 	}
 
-	merge_bases = repo_get_merge_bases(the_repository, mb_child[0], mb_child[1]);
+	if (repo_get_merge_bases(the_repository, mb_child[0], mb_child[1], &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases)
 		die(_("no merge base found"));
 	if (merge_bases->next)
diff --git a/log-tree.c b/log-tree.c
index 504da6b519e..4f337766a39 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1010,7 +1010,7 @@ static int do_remerge_diff(struct rev_info *opt,
 			   struct object_id *oid)
 {
 	struct merge_options o;
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	struct merge_result res = {0};
 	struct pretty_print_context ctx = {0};
 	struct commit *parent1 = parents->item;
@@ -1035,7 +1035,8 @@ static int do_remerge_diff(struct rev_info *opt,
 	/* Parse the relevant commits and get the merge bases */
 	parse_commit_or_die(parent1);
 	parse_commit_or_die(parent2);
-	bases = repo_get_merge_bases(the_repository, parent1, parent2);
+	if (repo_get_merge_bases(the_repository, parent1, parent2, &bases) < 0)
+		exit(128);
 
 	/* Re-merge the parents */
 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
diff --git a/merge-ort.c b/merge-ort.c
index 64e76afe89f..88f23a37f5a 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -5060,7 +5060,11 @@ static void merge_ort_internal(struct merge_options *opt,
 	struct strbuf merge_base_abbrev = STRBUF_INIT;
 
 	if (!merge_bases) {
-		merge_bases = repo_get_merge_bases(the_repository, h1, h2);
+		if (repo_get_merge_bases(the_repository, h1, h2,
+					 &merge_bases) < 0) {
+			result->clean = -1;
+			return;
+		}
 		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
 		merge_bases = reverse_commit_list(merge_bases);
 	}
diff --git a/merge-recursive.c b/merge-recursive.c
index e3fe7803cbe..0bed0b97424 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3632,7 +3632,9 @@ static int merge_recursive_internal(struct merge_options *opt,
 	}
 
 	if (!merge_bases) {
-		merge_bases = repo_get_merge_bases(the_repository, h1, h2);
+		if (repo_get_merge_bases(the_repository, h1, h2,
+					 &merge_bases) < 0)
+			return -1;
 		merge_bases = reverse_commit_list(merge_bases);
 	}
 
diff --git a/notes-merge.c b/notes-merge.c
index 8799b522a55..51282934ae6 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -607,7 +607,8 @@ int notes_merge(struct notes_merge_options *o,
 	assert(local && remote);
 
 	/* Find merge bases */
-	bases = repo_get_merge_bases(the_repository, local, remote);
+	if (repo_get_merge_bases(the_repository, local, remote, &bases) < 0)
+		exit(128);
 	if (!bases) {
 		base_oid = null_oid();
 		base_tree_oid = the_hash_algo->empty_tree;
diff --git a/object-name.c b/object-name.c
index 0bfa29dbbfe..89c34f9b49e 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1481,7 +1481,7 @@ int repo_get_oid_mb(struct repository *r,
 		    struct object_id *oid)
 {
 	struct commit *one, *two;
-	struct commit_list *mbs;
+	struct commit_list *mbs = NULL;
 	struct object_id oid_tmp;
 	const char *dots;
 	int st;
@@ -1509,7 +1509,8 @@ int repo_get_oid_mb(struct repository *r,
 	two = lookup_commit_reference_gently(r, &oid_tmp, 0);
 	if (!two)
 		return -1;
-	mbs = repo_get_merge_bases(r, one, two);
+	if (repo_get_merge_bases(r, one, two, &mbs) < 0)
+		return -1;
 	if (!mbs || mbs->next)
 		st = -1;
 	else {
diff --git a/revision.c b/revision.c
index 00d5c29bfce..3492cc95159 100644
--- a/revision.c
+++ b/revision.c
@@ -1965,7 +1965,7 @@ static void add_pending_commit_list(struct rev_info *revs,
 
 static void prepare_show_merge(struct rev_info *revs)
 {
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	struct commit *head, *other;
 	struct object_id oid;
 	const char **prune = NULL;
@@ -1980,7 +1980,8 @@ static void prepare_show_merge(struct rev_info *revs)
 	other = lookup_commit_or_die(&oid, "MERGE_HEAD");
 	add_pending_object(revs, &head->object, "HEAD");
 	add_pending_object(revs, &other->object, "MERGE_HEAD");
-	bases = repo_get_merge_bases(the_repository, head, other);
+	if (repo_get_merge_bases(the_repository, head, other, &bases) < 0)
+		exit(128);
 	add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING | BOTTOM);
 	add_pending_commit_list(revs, bases, UNINTERESTING | BOTTOM);
 	free_commit_list(bases);
@@ -2068,14 +2069,15 @@ static int handle_dotdot_1(const char *arg, char *dotdot,
 	} else {
 		/* A...B -- find merge bases between the two */
 		struct commit *a, *b;
-		struct commit_list *exclude;
+		struct commit_list *exclude = NULL;
 
 		a = lookup_commit_reference(revs->repo, &a_obj->oid);
 		b = lookup_commit_reference(revs->repo, &b_obj->oid);
 		if (!a || !b)
 			return dotdot_missing(arg, dotdot, revs, symmetric);
 
-		exclude = repo_get_merge_bases(the_repository, a, b);
+		if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0)
+			return -1;
 		add_rev_cmdline_list(revs, exclude, REV_CMD_MERGE_BASE,
 				     flags_exclude);
 		add_pending_commit_list(revs, exclude, flags_exclude);
diff --git a/sequencer.c b/sequencer.c
index d584cac8ed9..4417f2f1956 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3913,7 +3913,7 @@ static int do_merge(struct repository *r,
 	int run_commit_flags = 0;
 	struct strbuf ref_name = STRBUF_INIT;
 	struct commit *head_commit, *merge_commit, *i;
-	struct commit_list *bases, *j;
+	struct commit_list *bases = NULL, *j;
 	struct commit_list *to_merge = NULL, **tail = &to_merge;
 	const char *strategy = !opts->xopts.nr &&
 		(!opts->strategy ||
@@ -4139,7 +4139,11 @@ static int do_merge(struct repository *r,
 	}
 
 	merge_commit = to_merge->item;
-	bases = repo_get_merge_bases(r, head_commit, merge_commit);
+	if (repo_get_merge_bases(r, head_commit, merge_commit, &bases) < 0) {
+		ret = -1;
+		goto leave_merge;
+	}
+
 	if (bases && oideq(&merge_commit->object.oid,
 			   &bases->item->object.oid)) {
 		ret = 0;
diff --git a/submodule.c b/submodule.c
index e603a19a876..04931a5474b 100644
--- a/submodule.c
+++ b/submodule.c
@@ -595,7 +595,12 @@ static void show_submodule_header(struct diff_options *o,
 	     (!is_null_oid(two) && !*right))
 		message = "(commits not present)";
 
-	*merge_bases = repo_get_merge_bases(sub, *left, *right);
+	*merge_bases = NULL;
+	if (repo_get_merge_bases(sub, *left, *right, merge_bases) < 0) {
+		message = "(corrupt repository)";
+		goto output_header;
+	}
+
 	if (*merge_bases) {
 		if ((*merge_bases)->item == *left)
 			fast_forward = 1;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index b2c8a43fce3..5d1e7aca4c8 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -945,4 +945,16 @@ test_expect_success 'check the input format when --stdin is passed' '
 	test_cmp expect actual
 '
 
+test_expect_success 'error out on missing commits as well' '
+	git init --bare missing-commit.git &&
+	git rev-list --objects side1 side3 >list-including-initial &&
+	grep -v ^$(git rev-parse side1^) <list-including-initial >list &&
+	git pack-objects missing-commit.git/objects/pack/missing-initial <list &&
+	side1=$(git rev-parse side1) &&
+	side3=$(git rev-parse side3) &&
+	test_must_fail git --git-dir=missing-commit.git \
+		merge-tree --allow-unrelated-histories $side1 $side3 >actual &&
+	test_must_be_empty actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 09/12] get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (7 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 08/12] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 10/12] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `get_merge_bases()` macro) is aware of that, too.

Naturally, the callers need to be adjusted now, too.

Next step: adjust `repo_get_merge_bases_many()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/merge-base.c |  5 +++--
 builtin/merge.c      |  6 +++++-
 builtin/pull.c       |  5 +++--
 commit-reach.c       | 20 +++++++++++---------
 commit-reach.h       |  2 +-
 5 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 0308fd73289..6faabfb6698 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -77,13 +77,14 @@ static int handle_independent(int count, const char **args)
 static int handle_octopus(int count, const char **args, int show_all)
 {
 	struct commit_list *revs = NULL;
-	struct commit_list *result, *rev;
+	struct commit_list *result = NULL, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = get_octopus_merge_bases(revs);
+	if (get_octopus_merge_bases(revs, &result) < 0)
+		return 128;
 	free_commit_list(revs);
 	reduce_heads_replace(&result);
 
diff --git a/builtin/merge.c b/builtin/merge.c
index ac9d58adc29..94c5b693972 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1526,7 +1526,11 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	} else {
 		struct commit_list *list = remoteheads;
 		commit_list_insert(head_commit, &list);
-		common = get_octopus_merge_bases(list);
+		if (get_octopus_merge_bases(list, &common) < 0) {
+			free(list);
+			ret = 2;
+			goto done;
+		}
 		free(list);
 	}
 
diff --git a/builtin/pull.c b/builtin/pull.c
index e6f2942c0c5..0c5a55f2f4d 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -820,7 +820,7 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 		const struct object_id *merge_head,
 		const struct object_id *fork_point)
 {
-	struct commit_list *revs = NULL, *result;
+	struct commit_list *revs = NULL, *result = NULL;
 
 	commit_list_insert(lookup_commit_reference(the_repository, curr_head),
 			   &revs);
@@ -830,7 +830,8 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 		commit_list_insert(lookup_commit_reference(the_repository, fork_point),
 				   &revs);
 
-	result = get_octopus_merge_bases(revs);
+	if (get_octopus_merge_bases(revs, &result) < 0)
+		exit(128);
 	free_commit_list(revs);
 	reduce_heads_replace(&result);
 
diff --git a/commit-reach.c b/commit-reach.c
index 3471c933f6c..1b618eb9cd1 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -172,24 +172,26 @@ static int merge_bases_many(struct repository *r,
 	return 0;
 }
 
-struct commit_list *get_octopus_merge_bases(struct commit_list *in)
+int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result)
 {
-	struct commit_list *i, *j, *k, *ret = NULL;
+	struct commit_list *i, *j, *k;
 
 	if (!in)
-		return ret;
+		return 0;
 
-	commit_list_insert(in->item, &ret);
+	commit_list_insert(in->item, result);
 
 	for (i = in->next; i; i = i->next) {
 		struct commit_list *new_commits = NULL, *end = NULL;
 
-		for (j = ret; j; j = j->next) {
+		for (j = *result; j; j = j->next) {
 			struct commit_list *bases = NULL;
 			if (repo_get_merge_bases(the_repository, i->item,
 						 j->item, &bases) < 0) {
 				free_commit_list(bases);
-				return NULL;
+				free_commit_list(*result);
+				*result = NULL;
+				return -1;
 			}
 			if (!new_commits)
 				new_commits = bases;
@@ -198,10 +200,10 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 			for (k = bases; k; k = k->next)
 				end = k;
 		}
-		free_commit_list(ret);
-		ret = new_commits;
+		free_commit_list(*result);
+		*result = new_commits;
 	}
-	return ret;
+	return 0;
 }
 
 static int remove_redundant_no_gen(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 2c6fcdd34f6..4690b6ecd0c 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -21,7 +21,7 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    struct commit *one, int n,
 						    struct commit **twos);
 
-struct commit_list *get_octopus_merge_bases(struct commit_list *in);
+int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
 
 int repo_is_descendant_of(struct repository *r,
 			  struct commit *commit,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 10/12] repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (8 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 09/12] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 11/12] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many()` function is aware of
that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next stop: `repo_get_merge_bases_dirty()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 bisect.c              |  7 ++++---
 builtin/log.c         | 13 +++++++------
 commit-reach.c        | 16 ++++++----------
 commit-reach.h        |  7 ++++---
 commit.c              |  7 ++++---
 t/helper/test-reach.c |  9 ++++++---
 6 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/bisect.c b/bisect.c
index 1be8e0a2711..2018466d69f 100644
--- a/bisect.c
+++ b/bisect.c
@@ -851,10 +851,11 @@ static void handle_skipped_merge_base(const struct object_id *mb)
 static enum bisect_error check_merge_bases(int rev_nr, struct commit **rev, int no_checkout)
 {
 	enum bisect_error res = BISECT_OK;
-	struct commit_list *result;
+	struct commit_list *result = NULL;
 
-	result = repo_get_merge_bases_many(the_repository, rev[0], rev_nr - 1,
-					   rev + 1);
+	if (repo_get_merge_bases_many(the_repository, rev[0], rev_nr - 1,
+				      rev + 1, &result) < 0)
+		exit(128);
 
 	for (; result; result = result->next) {
 		const struct object_id *mb = &result->item->object.oid;
diff --git a/builtin/log.c b/builtin/log.c
index befafd6ae04..c75790a7cec 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1656,7 +1656,7 @@ static struct commit *get_base_commit(const char *base_commit,
 		struct branch *curr_branch = branch_get(NULL);
 		const char *upstream = branch_get_upstream(curr_branch, NULL);
 		if (upstream) {
-			struct commit_list *base_list;
+			struct commit_list *base_list = NULL;
 			struct commit *commit;
 			struct object_id oid;
 
@@ -1667,11 +1667,12 @@ static struct commit *get_base_commit(const char *base_commit,
 					return NULL;
 			}
 			commit = lookup_commit_or_die(&oid, "upstream base");
-			base_list = repo_get_merge_bases_many(the_repository,
-							      commit, total,
-							      list);
-			/* There should be one and only one merge base. */
-			if (!base_list || base_list->next) {
+			if (repo_get_merge_bases_many(the_repository,
+						      commit, total,
+						      list,
+						      &base_list) < 0 ||
+			    /* There should be one and only one merge base. */
+			    !base_list || base_list->next) {
 				if (die_on_failure) {
 					die(_("could not find exact merge base"));
 				} else {
diff --git a/commit-reach.c b/commit-reach.c
index 1b618eb9cd1..f0006ab6422 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -451,17 +451,13 @@ static int get_merge_bases_many_0(struct repository *r,
 	return 0;
 }
 
-struct commit_list *repo_get_merge_bases_many(struct repository *r,
-					      struct commit *one,
-					      int n,
-					      struct commit **twos)
+int repo_get_merge_bases_many(struct repository *r,
+			      struct commit *one,
+			      int n,
+			      struct commit **twos,
+			      struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, n, twos, 1, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, n, twos, 1, result);
 }
 
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 4690b6ecd0c..458043f4d58 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -13,9 +13,10 @@ int repo_get_merge_bases(struct repository *r,
 			 struct commit *rev1,
 			 struct commit *rev2,
 			 struct commit_list **result);
-struct commit_list *repo_get_merge_bases_many(struct repository *r,
-					      struct commit *one, int n,
-					      struct commit **twos);
+int repo_get_merge_bases_many(struct repository *r,
+			      struct commit *one, int n,
+			      struct commit **twos,
+			      struct commit_list **result);
 /* To be used only when object flags after this call no longer matter */
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    struct commit *one, int n,
diff --git a/commit.c b/commit.c
index 8405d7c3fce..00add5d81c6 100644
--- a/commit.c
+++ b/commit.c
@@ -1054,7 +1054,7 @@ struct commit *get_fork_point(const char *refname, struct commit *commit)
 {
 	struct object_id oid;
 	struct rev_collect revs;
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	int i;
 	struct commit *ret = NULL;
 	char *full_refname;
@@ -1079,8 +1079,9 @@ struct commit *get_fork_point(const char *refname, struct commit *commit)
 	for (i = 0; i < revs.nr; i++)
 		revs.commit[i]->object.flags &= ~TMP_MARK;
 
-	bases = repo_get_merge_bases_many(the_repository, commit, revs.nr,
-					  revs.commit);
+	if (repo_get_merge_bases_many(the_repository, commit, revs.nr,
+				      revs.commit, &bases) < 0)
+		exit(128);
 
 	/*
 	 * There should be one and only one merge base, when we found
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index aa816e168ea..84ee9da8681 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -117,9 +117,12 @@ int cmd__reach(int ac, const char **av)
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
-		struct commit_list *list = repo_get_merge_bases_many(the_repository,
-								     A, X_nr,
-								     X_array);
+		struct commit_list *list = NULL;
+		if (repo_get_merge_bases_many(the_repository,
+					      A, X_nr,
+					      X_array,
+					      &list) < 0)
+			exit(128);
 		printf("%s(A,X):\n", av[1]);
 		print_sorted_commit_ids(list);
 	} else if (!strcmp(av[1], "reduce_heads")) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 11/12] repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (9 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 10/12] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13  8:41 ` [PATCH 12/12] paint_down_to_common(): special-case shallow/partial clones Johannes Schindelin via GitGitGadget
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many_dirty()` function is
aware of that, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/merge-base.c |  7 ++++---
 commit-reach.c       | 16 ++++++----------
 commit-reach.h       |  7 ++++---
 3 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 6faabfb6698..2b6af1bc35b 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -13,10 +13,11 @@
 
 static int show_merge_base(struct commit **rev, int rev_nr, int show_all)
 {
-	struct commit_list *result, *r;
+	struct commit_list *result = NULL, *r;
 
-	result = repo_get_merge_bases_many_dirty(the_repository, rev[0],
-						 rev_nr - 1, rev + 1);
+	if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
+					    rev_nr - 1, rev + 1, &result) < 0)
+		return -1;
 
 	if (!result)
 		return 1;
diff --git a/commit-reach.c b/commit-reach.c
index f0006ab6422..25b39c302a8 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -460,17 +460,13 @@ int repo_get_merge_bases_many(struct repository *r,
 	return get_merge_bases_many_0(r, one, n, twos, 1, result);
 }
 
-struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
-						    struct commit *one,
-						    int n,
-						    struct commit **twos)
+int repo_get_merge_bases_many_dirty(struct repository *r,
+				    struct commit *one,
+				    int n,
+				    struct commit **twos,
+				    struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, n, twos, 0, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, n, twos, 0, result);
 }
 
 int repo_get_merge_bases(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 458043f4d58..bf63cc468fd 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -18,9 +18,10 @@ int repo_get_merge_bases_many(struct repository *r,
 			      struct commit **twos,
 			      struct commit_list **result);
 /* To be used only when object flags after this call no longer matter */
-struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
-						    struct commit *one, int n,
-						    struct commit **twos);
+int repo_get_merge_bases_many_dirty(struct repository *r,
+				    struct commit *one, int n,
+				    struct commit **twos,
+				    struct commit_list **result);
 
 int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 12/12] paint_down_to_common(): special-case shallow/partial clones
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (10 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 11/12] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
@ 2024-02-13  8:41 ` Johannes Schindelin via GitGitGadget
  2024-02-13 18:37 ` [PATCH 00/12] The merge-base logic vs missing commit objects Junio C Hamano
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
  13 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-13  8:41 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In shallow/partial clones, we _expect_ commits to be missing. Let's
teach the merge-base logic to ignore those and simply go ahead and
treat the involved commit histories as cut off at that point.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/commit-reach.c b/commit-reach.c
index 25b39c302a8..4af60c2501d 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -10,6 +10,8 @@
 #include "tag.h"
 #include "commit-reach.h"
 #include "ewah/ewok.h"
+#include "shallow.h"
+#include "promisor-remote.h"
 
 /* Remember to update object flag allocation in object.h */
 #define PARENT1		(1u<<16)
@@ -115,7 +117,9 @@ static int paint_down_to_common(struct repository *r,
 				 * dispatched with a `die()`.
 				 */
 				free_commit_list(*result);
-				if (ignore_missing_commits)
+				if (ignore_missing_commits ||
+				    is_repository_shallow(r) ||
+				    repo_has_promisor_remote(r))
 					return 0;
 				return error(_("could not parse commit %s"),
 					     oid_to_hex(&p->object.oid));
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 00/12] The merge-base logic vs missing commit objects
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (11 preceding siblings ...)
  2024-02-13  8:41 ` [PATCH 12/12] paint_down_to_common(): special-case shallow/partial clones Johannes Schindelin via GitGitGadget
@ 2024-02-13 18:37 ` Junio C Hamano
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
  13 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2024-02-13 18:37 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> I am deeply sorry both about the length of this patch series as well as
> having to send it in the -rc phase.

You do not have to ;-), but thanks.

Also outside the scope of -rc, there were some untied loose ends in
the missing tree thing before we can move to a new topic, if I
recall correctly.

Speaking of -rc, the project would be helped by your expertise a lot
if you gave a quick Ack on the make_relative() thing [*].


[Reference]

 * f6628636 (unit-tests: do show relative file paths on non-Windows,
   too, 2024-02-11)
   https://lore.kernel.org/git/xmqqle7r9enn.fsf_-_@gitster.g/

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 02/12] Let `repo_in_merge_bases()` report missing commits
  2024-02-13  8:41 ` [PATCH 02/12] Let `repo_in_merge_bases()` report missing commits Johannes Schindelin via GitGitGadget
@ 2024-02-15  9:33   ` Patrick Steinhardt
  0 siblings, 0 replies; 70+ messages in thread
From: Patrick Steinhardt @ 2024-02-15  9:33 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin

[-- Attachment #1: Type: text/plain, Size: 11211 bytes --]

On Tue, Feb 13, 2024 at 08:41:38AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
[snip]
> diff --git a/http-push.c b/http-push.c
> index a704f490fdb..85fa2f457d4 100644
> --- a/http-push.c
> +++ b/http-push.c
> @@ -1576,8 +1576,11 @@ static int verify_merge_base(struct object_id *head_oid, struct ref *remote)
>  	struct commit *head = lookup_commit_or_die(head_oid, "HEAD");
>  	struct commit *branch = lookup_commit_or_die(&remote->old_oid,
>  						     remote->name);
> +	int i = repo_in_merge_bases(the_repository, branch, head);
>  
> -	return repo_in_merge_bases(the_repository, branch, head);
> +	if (i < 0)
> +		exit(128);
> +	return i;

Nit: it's a bit unusual that we use `i` here instead of `ret`. Not worth
a reroll on its own though.

>  }
>  
>  static int delete_remote_branch(const char *pattern, int force)
> diff --git a/merge-ort.c b/merge-ort.c
> index 6491070d965..64e76afe89f 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -544,6 +544,7 @@ enum conflict_and_info_types {
>  	CONFLICT_SUBMODULE_HISTORY_NOT_AVAILABLE,
>  	CONFLICT_SUBMODULE_MAY_HAVE_REWINDS,
>  	CONFLICT_SUBMODULE_NULL_MERGE_BASE,
> +	CONFLICT_SUBMODULE_CORRUPT,
>  
>  	/* Keep this entry _last_ in the list */
>  	NB_CONFLICT_TYPES,
> @@ -596,7 +597,9 @@ static const char *type_short_descriptions[] = {
>  	[CONFLICT_SUBMODULE_MAY_HAVE_REWINDS] =
>  		"CONFLICT (submodule may have rewinds)",
>  	[CONFLICT_SUBMODULE_NULL_MERGE_BASE] =
> -		"CONFLICT (submodule lacks merge base)"
> +		"CONFLICT (submodule lacks merge base)",
> +	[CONFLICT_SUBMODULE_CORRUPT] =
> +		"CONFLICT (submodule corrupt)"
>  };
>  
>  struct logical_conflict_info {
> @@ -1710,7 +1713,11 @@ static int find_first_merges(struct repository *repo,
>  		die("revision walk setup failed");
>  	while ((commit = get_revision(&revs)) != NULL) {
>  		struct object *o = &(commit->object);
> -		if (repo_in_merge_bases(repo, b, commit))
> +		int ret = repo_in_merge_bases(repo, b, commit);
> +
> +		if (ret < 0)
> +			return ret;

This is leaking both `merges` and `revs`.

> +		if (ret > 0)
>  			add_object_array(o, NULL, &merges);
>  	}
>  	reset_revision_walk();
> @@ -1725,9 +1732,14 @@ static int find_first_merges(struct repository *repo,
>  		contains_another = 0;
>  		for (j = 0; j < merges.nr; j++) {
>  			struct commit *m2 = (struct commit *) merges.objects[j].item;
> -			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
> -				contains_another = 1;
> -				break;
> +			if (i != j) {
> +				int ret = repo_in_merge_bases(repo, m2, m1);
> +				if (ret < 0)
> +					return ret;
> +				if (ret > 0) {
> +					contains_another = 1;
> +					break;
> +				}
>  			}
>  		}
>  
> @@ -1749,7 +1761,7 @@ static int merge_submodule(struct merge_options *opt,
>  {
>  	struct repository subrepo;
>  	struct strbuf sb = STRBUF_INIT;
> -	int ret = 0;
> +	int ret = 0, ret2;
>  	struct commit *commit_o, *commit_a, *commit_b;
>  	int parent_count;
>  	struct object_array merges;
> @@ -1796,8 +1808,26 @@ static int merge_submodule(struct merge_options *opt,
>  	}
>  
>  	/* check whether both changes are forward */
> -	if (!repo_in_merge_bases(&subrepo, commit_o, commit_a) ||
> -	    !repo_in_merge_bases(&subrepo, commit_o, commit_b)) {
> +	ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_a);
> +	if (ret2 < 1) {
> +		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
> +			 path, NULL, NULL, NULL,
> +			 _("Failed to merge submodule %s "
> +			   "(repository corrupt)"),
> +			 path);
> +		goto cleanup;
> +	}

Is `if (ret2 < 1)` intentional? I doubt it is, because it would also
trigger for `ret2 == 0`, which we try to explicitly handle in the next
line. So the following condition `if (!ret2)` cannot ever be true.

> +	if (!ret2)
> +		ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_b);
> +	if (ret2 < 1) {
> +		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
> +			 path, NULL, NULL, NULL,
> +			 _("Failed to merge submodule %s "
> +			   "(repository corrupt)"),
> +			 path);
> +		goto cleanup;
> +	}

Same here.

> +	if (!ret2) {
>  		path_msg(opt, CONFLICT_SUBMODULE_MAY_HAVE_REWINDS, 0,
>  			 path, NULL, NULL, NULL,
>  			 _("Failed to merge submodule %s "
> @@ -1807,7 +1837,16 @@ static int merge_submodule(struct merge_options *opt,
>  	}
>  
>  	/* Case #1: a is contained in b or vice versa */
> -	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
> +	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
> +	if (ret2 < 0) {
> +		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
> +			 path, NULL, NULL, NULL,
> +			 _("Failed to merge submodule %s "
> +			   "(repository corrupt)"),
> +			 path);
> +		goto cleanup;
> +	}
> +	if (ret2 > 0) {
>  		oidcpy(result, b);
>  		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
>  			 path, NULL, NULL, NULL,
> @@ -1816,7 +1855,16 @@ static int merge_submodule(struct merge_options *opt,
>  		ret = 1;
>  		goto cleanup;
>  	}
> -	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
> +	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
> +	if (ret2 < 0) {
> +		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
> +			 path, NULL, NULL, NULL,
> +			 _("Failed to merge submodule %s "
> +			   "(repository corrupt)"),
> +			 path);
> +		goto cleanup;
> +	}
> +	if (ret2 > 0) {
>  		oidcpy(result, a);
>  		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
>  			 path, NULL, NULL, NULL,
> @@ -1841,6 +1889,13 @@ static int merge_submodule(struct merge_options *opt,
>  	parent_count = find_first_merges(&subrepo, path, commit_a, commit_b,
>  					 &merges);
>  	switch (parent_count) {
> +	case -1:
> +		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
> +			 path, NULL, NULL, NULL,
> +			 _("Failed to merge submodule %s "
> +			   "(repository corrupt)"),
> +			 path);
> +		break;

I was wondering whether it is safe to `break` here because we do end up
calling `object_array_clear()` on `merges` which we didn't initialize in
this function. But `find_first_merges()` always initializes the array
with zeroes, so this is fine.

>  	case 0:
>  		path_msg(opt, CONFLICT_SUBMODULE_FAILED_TO_MERGE, 0,
>  			 path, NULL, NULL, NULL,
> diff --git a/merge-recursive.c b/merge-recursive.c
> index e3beb0801b1..e3fe7803cbe 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1144,7 +1144,10 @@ static int find_first_merges(struct repository *repo,
>  		die("revision walk setup failed");
>  	while ((commit = get_revision(&revs)) != NULL) {
>  		struct object *o = &(commit->object);
> -		if (repo_in_merge_bases(repo, b, commit))
> +		int ret = repo_in_merge_bases(repo, b, commit);
> +		if (ret < 0)
> +			return ret;

We also leak `merges` and `revs` here.

Patrick

> +		if (ret)
>  			add_object_array(o, NULL, &merges);
>  	}
>  	reset_revision_walk();
> @@ -1159,9 +1162,14 @@ static int find_first_merges(struct repository *repo,
>  		contains_another = 0;
>  		for (j = 0; j < merges.nr; j++) {
>  			struct commit *m2 = (struct commit *) merges.objects[j].item;
> -			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
> -				contains_another = 1;
> -				break;
> +			if (i != j) {
> +				int ret = repo_in_merge_bases(repo, m2, m1);
> +				if (ret < 0)
> +					return ret;
> +				if (ret > 0) {
> +					contains_another = 1;
> +					break;
> +				}
>  			}
>  		}
>  
> @@ -1197,7 +1205,7 @@ static int merge_submodule(struct merge_options *opt,
>  			   const struct object_id *b)
>  {
>  	struct repository subrepo;
> -	int ret = 0;
> +	int ret = 0, ret2;
>  	struct commit *commit_base, *commit_a, *commit_b;
>  	int parent_count;
>  	struct object_array merges;
> @@ -1234,14 +1242,29 @@ static int merge_submodule(struct merge_options *opt,
>  	}
>  
>  	/* check whether both changes are forward */
> -	if (!repo_in_merge_bases(&subrepo, commit_base, commit_a) ||
> -	    !repo_in_merge_bases(&subrepo, commit_base, commit_b)) {
> +	ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_a);
> +	if (ret2 < 0) {
> +		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
> +		goto cleanup;
> +	}
> +	if (ret2)
> +		ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_b);
> +	if (ret2 < 0) {
> +		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
> +		goto cleanup;
> +	}
> +	if (!ret2) {
>  		output(opt, 1, _("Failed to merge submodule %s (commits don't follow merge-base)"), path);
>  		goto cleanup;
>  	}
>  
>  	/* Case #1: a is contained in b or vice versa */
> -	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
> +	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
> +	if (ret2 < 0) {
> +		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
> +		goto cleanup;
> +	}
> +	if (ret2) {
>  		oidcpy(result, b);
>  		if (show(opt, 3)) {
>  			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
> @@ -1254,7 +1277,12 @@ static int merge_submodule(struct merge_options *opt,
>  		ret = 1;
>  		goto cleanup;
>  	}
> -	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
> +	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
> +	if (ret2 < 0) {
> +		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
> +		goto cleanup;
> +	}
> +	if (ret2) {
>  		oidcpy(result, a);
>  		if (show(opt, 3)) {
>  			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
> @@ -1402,6 +1430,8 @@ static int merge_mode_and_contents(struct merge_options *opt,
>  							&o->oid,
>  							&a->oid,
>  							&b->oid);
> +			if (result->clean < 0)
> +				return -1;
>  		} else if (S_ISLNK(a->mode)) {
>  			switch (opt->recursive_variant) {
>  			case MERGE_VARIANT_NORMAL:
> diff --git a/shallow.c b/shallow.c
> index ac728cdd778..cf4b95114b7 100644
> --- a/shallow.c
> +++ b/shallow.c
> @@ -795,12 +795,16 @@ static void post_assign_shallow(struct shallow_info *info,
>  		if (!*bitmap)
>  			continue;
>  		for (j = 0; j < bitmap_nr; j++)
> -			if (bitmap[0][j] &&
> -			    /* Step 7, reachability test at commit level */
> -			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits)) {
> -				update_refstatus(ref_status, info->ref->nr, *bitmap);
> -				dst++;
> -				break;
> +			if (bitmap[0][j]) {
> +				/* Step 7, reachability test at commit level */
> +				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits);
> +				if (ret < 0)
> +					exit(128);
> +				if (!ret) {
> +					update_refstatus(ref_status, info->ref->nr, *bitmap);
> +					dst++;
> +					break;
> +				}
>  			}
>  	}
>  	info->nr_ours = dst;
> @@ -829,6 +833,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
>  							    commit,
>  							    si->nr_commits,
>  							    si->commits);
> +		if (si->reachable[c] < 0)
> +			exit(128);
>  		si->need_reachability_test[c] = 0;
>  	}
>  	return si->reachable[c];
> -- 
> gitgitgadget
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v2 00/11] The merge-base logic vs missing commit objects
  2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                   ` (12 preceding siblings ...)
  2024-02-13 18:37 ` [PATCH 00/12] The merge-base logic vs missing commit objects Junio C Hamano
@ 2024-02-22 13:21 ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
                     ` (11 more replies)
  13 siblings, 12 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin

This patch series is in the same spirit as what I proposed in
https://lore.kernel.org/git/pull.1651.git.1707212981.gitgitgadget@gmail.com/,
where I tackled missing tree objects. This here patch series tackles missing
commit objects instead: Git's merge-base logic handles these missing commit
objects as if there had not been any commit object at all, i.e. if two
commits' merge base commit is missing, they will be treated as if they had
no common commit history at all, which is a bug. Those commit objects should
not be missing, of course, i.e. this is only a problem in corrupt
repositories.

This patch series is a bit more tricky than the "missing tree objects" one,
though: The function signature of quite a few functions need to be changed
to allow callers to discern between "missing commit object" vs "no
merge-base found".

And of course it gets even more tricky because in shallow and partial clones
we expect commit objects to be missing, and that must not be treated like an
error but the existing logic is actually desirable in those scenarios.

I am deeply sorry both about the length of this patch series as well as
having to lean so heavily on reviews on the Git mailing list.

Changes since v1:

 * Addressed a lot of memory leaks.
 * Reordered patch 2 and 3 to render the commit message's comment about
   unchanged behavior true.
 * Fixed the incorrectly converted condition in the merge_submodule()
   function.
 * The last patch ("paint_down_to_common(): special-case shallow/partial
   clones") was dropped because it is not actually necessary, and the
   explanation for that was added to the commit message of "Prepare
   paint_down_to_common() for handling shallow commits".
 * An inconsistently-named variable i was renamed to be consistent with the
   other variables called ret.

Johannes Schindelin (11):
  paint_down_to_common: plug two memory leaks
  Prepare `repo_in_merge_bases_many()` to optionally expect missing
    commits
  Start reporting missing commits in `repo_in_merge_bases_many()`
  Prepare `paint_down_to_common()` for handling shallow commits
  commit-reach: start reporting errors in `paint_down_to_common()`
  merge_bases_many(): pass on errors from `paint_down_to_common()`
  get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases(): pass on errors from `merge_bases_many()`
  get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases_many_dirty(): pass on errors from
    `merge_bases_many()`

 bisect.c                         |   7 +-
 builtin/branch.c                 |  12 +-
 builtin/fast-import.c            |   6 +-
 builtin/fetch.c                  |   2 +
 builtin/log.c                    |  30 +++--
 builtin/merge-base.c             |  23 +++-
 builtin/merge-tree.c             |   5 +-
 builtin/merge.c                  |  26 ++--
 builtin/pull.c                   |   9 +-
 builtin/rebase.c                 |   8 +-
 builtin/receive-pack.c           |   6 +-
 builtin/rev-parse.c              |   5 +-
 commit-reach.c                   | 209 ++++++++++++++++++++-----------
 commit-reach.h                   |  26 ++--
 commit.c                         |   7 +-
 diff-lib.c                       |   5 +-
 http-push.c                      |   5 +-
 log-tree.c                       |   5 +-
 merge-ort.c                      |  87 +++++++++++--
 merge-recursive.c                |  58 +++++++--
 notes-merge.c                    |   3 +-
 object-name.c                    |   7 +-
 remote.c                         |   2 +-
 revision.c                       |  12 +-
 sequencer.c                      |   8 +-
 shallow.c                        |  21 ++--
 submodule.c                      |   7 +-
 t/helper/test-reach.c            |  11 +-
 t/t4301-merge-tree-write-tree.sh |  12 ++
 29 files changed, 441 insertions(+), 183 deletions(-)


base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1657%2Fdscho%2Fmerge-base-and-missing-objects-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1657/dscho/merge-base-and-missing-objects-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1657

Range-diff vs v1:

  1:  818eef0a644 !  1:  6e4e409cd43 paint_down_to_common: plug a memory leak
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    paint_down_to_common: plug a memory leak
     +    paint_down_to_common: plug two memory leaks
      
          When a commit is missing, we return early (currently pretending that no
          merge basis could be found in that case). At that stage, it is possible
          that a merge base could have been found already, and added to the
          `result`, which is now leaked.
      
     -    Let's release it, to address that memory leak.
     +    The priority queue has a similar issue: There might still be a commit in
     +    that queue.
     +
     +    Let's release both, to address the potential memory leaks.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
     @@ commit-reach.c: static struct commit_list *paint_down_to_common(struct repositor
       				continue;
      -			if (repo_parse_commit(r, p))
      +			if (repo_parse_commit(r, p)) {
     ++				clear_prio_queue(&queue);
      +				free_commit_list(result);
       				return NULL;
      +			}
  3:  8395c3efbc3 !  2:  5c16becfb9b Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
     @@ commit-reach.c: int repo_is_descendant_of(struct repository *r,
       
       			other = with_commit->item;
       			with_commit = with_commit->next;
     --			ret = repo_in_merge_bases_many(r, other, 1, &commit);
     -+			ret = repo_in_merge_bases_many(r, other, 1, &commit, 0);
     - 			if (ret)
     - 				return ret;
     +-			if (repo_in_merge_bases_many(r, other, 1, &commit))
     ++			if (repo_in_merge_bases_many(r, other, 1, &commit, 0))
     + 				return 1;
       		}
     + 		return 0;
      @@ commit-reach.c: int repo_is_descendant_of(struct repository *r,
        * Is "commit" an ancestor of one of the "references"?
        */
     @@ commit-reach.c: int repo_is_descendant_of(struct repository *r,
       {
       	struct commit_list *bases;
       	int ret = 0, i;
     - 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
     - 
     - 	if (repo_parse_commit(r, commit))
     --		return -1;
     -+		return ignore_missing_commits ? 0 : -1;
     - 	for (i = 0; i < nr_reference; i++) {
     - 		if (repo_parse_commit(r, reference[i]))
     --			return -1;
     -+			return ignore_missing_commits ? 0 : -1;
     - 
     - 		generation = commit_graph_generation(reference[i]);
     - 		if (generation > max_generation)
      
       ## commit-reach.h ##
      @@ commit-reach.h: int repo_in_merge_bases(struct repository *r,
     @@ remote.c: static int is_reachable_in_reflog(const char *local, const struct ref
       ## shallow.c ##
      @@ shallow.c: static void post_assign_shallow(struct shallow_info *info,
       		for (j = 0; j < bitmap_nr; j++)
     - 			if (bitmap[0][j]) {
     - 				/* Step 7, reachability test at commit level */
     --				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits);
     -+				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1);
     - 				if (ret < 0)
     - 					exit(128);
     - 				if (!ret) {
     + 			if (bitmap[0][j] &&
     + 			    /* Step 7, reachability test at commit level */
     +-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits)) {
     ++			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1)) {
     + 				update_refstatus(ref_status, info->ref->nr, *bitmap);
     + 				dst++;
     + 				break;
      @@ shallow.c: int delayed_reachability_test(struct shallow_info *si, int c)
       		si->reachable[c] = repo_in_merge_bases_many(the_repository,
       							    commit,
     @@ shallow.c: int delayed_reachability_test(struct shallow_info *si, int c)
      -							    si->commits);
      +							    si->commits,
      +							    1);
     - 		if (si->reachable[c] < 0)
     - 			exit(128);
       		si->need_reachability_test[c] = 0;
     + 	}
     + 	return si->reachable[c];
      
       ## t/helper/test-reach.c ##
      @@ t/helper/test-reach.c: int cmd__reach(int ac, const char **av)
  2:  5f0af7fc0b9 !  3:  4dd214f91d4 Let `repo_in_merge_bases()` report missing commits
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    Let `repo_in_merge_bases()` report missing commits
     +    Start reporting missing commits in `repo_in_merge_bases_many()`
      
          Some functions in Git's source code follow the convention that returning
          a negative value indicates a fatal error, e.g. repository corruption.
     @@ Commit message
          Also adjust the callers of `repo_in_merge_bases()` to handle such
          negative return values.
      
     +    Note: As of this patch, errors are returned only if any of the specified
     +    merge heads is missing. Over the course of the next patches, missing
     +    commits will also be reported by the `paint_down_to_common()` function,
     +    which is called by `repo_in_merge_bases_many()`, and those errors will
     +    be properly propagated back to the caller at that stage.
     +
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## builtin/branch.c ##
     @@ commit-reach.c: int repo_is_descendant_of(struct repository *r,
       
       			other = with_commit->item;
       			with_commit = with_commit->next;
     --			if (repo_in_merge_bases_many(r, other, 1, &commit))
     +-			if (repo_in_merge_bases_many(r, other, 1, &commit, 0))
      -				return 1;
     -+			ret = repo_in_merge_bases_many(r, other, 1, &commit);
     ++			ret = repo_in_merge_bases_many(r, other, 1, &commit, 0);
      +			if (ret)
      +				return ret;
       		}
     @@ commit-reach.c: int repo_in_merge_bases_many(struct repository *r, struct commit
       
       	if (repo_parse_commit(r, commit))
      -		return ret;
     -+		return -1;
     ++		return ignore_missing_commits ? 0 : -1;
       	for (i = 0; i < nr_reference; i++) {
       		if (repo_parse_commit(r, reference[i]))
      -			return ret;
     -+			return -1;
     ++			return ignore_missing_commits ? 0 : -1;
       
       		generation = commit_graph_generation(reference[i]);
       		if (generation > max_generation)
     @@ http-push.c: static int verify_merge_base(struct object_id *head_oid, struct ref
       	struct commit *head = lookup_commit_or_die(head_oid, "HEAD");
       	struct commit *branch = lookup_commit_or_die(&remote->old_oid,
       						     remote->name);
     -+	int i = repo_in_merge_bases(the_repository, branch, head);
     ++	int ret = repo_in_merge_bases(the_repository, branch, head);
       
      -	return repo_in_merge_bases(the_repository, branch, head);
     -+	if (i < 0)
     ++	if (ret < 0)
      +		exit(128);
     -+	return i;
     ++	return ret;
       }
       
       static int delete_remote_branch(const char *pattern, int force)
     @@ merge-ort.c: static int find_first_merges(struct repository *repo,
      -		if (repo_in_merge_bases(repo, b, commit))
      +		int ret = repo_in_merge_bases(repo, b, commit);
      +
     -+		if (ret < 0)
     ++		if (ret < 0) {
     ++			object_array_clear(&merges);
     ++			release_revisions(&revs);
      +			return ret;
     ++		}
      +		if (ret > 0)
       			add_object_array(o, NULL, &merges);
       	}
     @@ merge-ort.c: static int find_first_merges(struct repository *repo,
      -				break;
      +			if (i != j) {
      +				int ret = repo_in_merge_bases(repo, m2, m1);
     -+				if (ret < 0)
     ++				if (ret < 0) {
     ++					object_array_clear(&merges);
     ++					release_revisions(&revs);
      +					return ret;
     ++				}
      +				if (ret > 0) {
      +					contains_another = 1;
      +					break;
     @@ merge-ort.c: static int merge_submodule(struct merge_options *opt,
      -	if (!repo_in_merge_bases(&subrepo, commit_o, commit_a) ||
      -	    !repo_in_merge_bases(&subrepo, commit_o, commit_b)) {
      +	ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_a);
     -+	if (ret2 < 1) {
     ++	if (ret2 < 0) {
      +		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
      +			 path, NULL, NULL, NULL,
      +			 _("Failed to merge submodule %s "
     @@ merge-ort.c: static int merge_submodule(struct merge_options *opt,
      +			 path);
      +		goto cleanup;
      +	}
     -+	if (!ret2)
     ++	if (ret2 > 0)
      +		ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_b);
     -+	if (ret2 < 1) {
     ++	if (ret2 < 0) {
      +		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
      +			 path, NULL, NULL, NULL,
      +			 _("Failed to merge submodule %s "
     @@ merge-recursive.c: static int find_first_merges(struct repository *repo,
       		struct object *o = &(commit->object);
      -		if (repo_in_merge_bases(repo, b, commit))
      +		int ret = repo_in_merge_bases(repo, b, commit);
     -+		if (ret < 0)
     ++		if (ret < 0) {
     ++			object_array_clear(&merges);
     ++			release_revisions(&revs);
      +			return ret;
     ++		}
      +		if (ret)
       			add_object_array(o, NULL, &merges);
       	}
     @@ merge-recursive.c: static int find_first_merges(struct repository *repo,
      -				break;
      +			if (i != j) {
      +				int ret = repo_in_merge_bases(repo, m2, m1);
     -+				if (ret < 0)
     ++				if (ret < 0) {
     ++					object_array_clear(&merges);
     ++					release_revisions(&revs);
      +					return ret;
     ++				}
      +				if (ret > 0) {
      +					contains_another = 1;
      +					break;
     @@ merge-recursive.c: static int merge_submodule(struct merge_options *opt,
      +		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
      +		goto cleanup;
      +	}
     -+	if (ret2)
     ++	if (ret2 > 0)
      +		ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_b);
      +	if (ret2 < 0) {
      +		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
     @@ shallow.c: static void post_assign_shallow(struct shallow_info *info,
       		for (j = 0; j < bitmap_nr; j++)
      -			if (bitmap[0][j] &&
      -			    /* Step 7, reachability test at commit level */
     --			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits)) {
     +-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1)) {
      -				update_refstatus(ref_status, info->ref->nr, *bitmap);
      -				dst++;
      -				break;
      +			if (bitmap[0][j]) {
      +				/* Step 7, reachability test at commit level */
     -+				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits);
     ++				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1);
      +				if (ret < 0)
      +					exit(128);
      +				if (!ret) {
     @@ shallow.c: static void post_assign_shallow(struct shallow_info *info,
       	}
       	info->nr_ours = dst;
      @@ shallow.c: int delayed_reachability_test(struct shallow_info *si, int c)
     - 							    commit,
       							    si->nr_commits,
     - 							    si->commits);
     + 							    si->commits,
     + 							    1);
      +		if (si->reachable[c] < 0)
      +			exit(128);
       		si->need_reachability_test[c] = 0;
  4:  7477b18adeb !  4:  53bdeddb4cb Prepare `paint_down_to_common()` for handling shallow commits
     @@ Commit message
          able to handle this situation gracefully.
      
          Currently, that logic pretends that a missing parent commit is
     -    equivalent to a a missing parent commit, and for the purpose of
     +    equivalent to a missing parent commit, and for the purpose of
          `--update-shallow` that is exactly what we need it to do.
      
          Therefore, let's introduce a flag to indicate when that is precisely the
     @@ Commit message
          shallow root with receive.updateshallow on") and t5538.4 ("add new
          shallow root with receive.updateshallow on").
      
     +    Note: shallow commits' parents are set to `NULL` internally already,
     +    therefore there is no need to special-case shallow repositories here, as
     +    the merge-base logic will not try to access parent commits of shallow
     +    commits.
     +
     +    Likewise, partial clones aren't an issue either: If a commit is missing
     +    during the revision walk in the merge-base logic, it is fetched via
     +    `promisor_remote_get_direct()`. And not only the single missing commit
     +    object: Due to the way the "promised" objects are fetched (in
     +    `fetch_objects()` in `promisor-remote.c`, using `fetch
     +    --filter=blob:none`), there is no actual way to fetch a single commit
     +    object, as the remote side will pass that commit OID to `pack-objects
     +    --revs [...]` which in turn passes it to `rev-list` which interprets
     +    this as a commit _range_ instead of a single object. Therefore, in
     +    partial clones (unless they are shallow in addition), all commits
     +    reachable from a commit that is in the local object database are also
     +    present in that local database.
     +
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
       ## commit-reach.c ##
     @@ commit-reach.c: static int queue_has_nonstale(struct prio_queue *queue)
       	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
       	struct commit_list *result = NULL;
      @@ commit-reach.c: static struct commit_list *paint_down_to_common(struct repository *r,
     - 			if ((p->object.flags & flags) == flags)
     - 				continue;
       			if (repo_parse_commit(r, p)) {
     + 				clear_prio_queue(&queue);
     + 				free_commit_list(result);
      +				/*
      +				 * At this stage, we know that the commit is
      +				 * missing: `repo_parse_commit()` uses
     @@ commit-reach.c: static struct commit_list *paint_down_to_common(struct repositor
      +				 * corrupt commits would already have been
      +				 * dispatched with a `die()`.
      +				 */
     - 				free_commit_list(result);
       				return NULL;
       			}
     + 			p->object.flags |= flags;
      @@ commit-reach.c: static struct commit_list *merge_bases_many(struct repository *r,
       			return NULL;
       	}
  5:  f8aedb89f1c !  5:  ec3ebf0ed17 commit-reach: start reporting errors in `paint_down_to_common()`
     @@ commit-reach.c: static struct commit_list *paint_down_to_common(struct repositor
       			}
       			/* Mark parents of a found merge stale */
       			flags |= STALE;
     +@@ commit-reach.c: static struct commit_list *paint_down_to_common(struct repository *r,
     + 				continue;
     + 			if (repo_parse_commit(r, p)) {
     + 				clear_prio_queue(&queue);
     +-				free_commit_list(result);
     ++				free_commit_list(*result);
     ++				*result = NULL;
     + 				/*
     + 				 * At this stage, we know that the commit is
     + 				 * missing: `repo_parse_commit()` uses
      @@ commit-reach.c: static struct commit_list *paint_down_to_common(struct repository *r,
       				 * corrupt commits would already have been
       				 * dispatched with a `die()`.
       				 */
     --				free_commit_list(result);
      -				return NULL;
     -+				free_commit_list(*result);
      +				if (ignore_missing_commits)
      +					return 0;
      +				return error(_("could not parse commit %s"),
     @@ commit-reach.c: static struct commit_list *merge_bases_many(struct repository *r
       	}
       
      -	list = paint_down_to_common(r, one, n, twos, 0, 0);
     -+	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0)
     ++	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
     ++		free_commit_list(list);
      +		return NULL;
     ++	}
       
       	while (list) {
       		struct commit *commit = pop_commit(&list);
     @@ commit-reach.c: static int remove_redundant_no_gen(struct repository *r,
      -		common = paint_down_to_common(r, array[i], filled,
      -					      work, min_generation, 0);
      +		if (paint_down_to_common(r, array[i], filled,
     -+					 work, min_generation, 0, &common))
     ++					 work, min_generation, 0, &common)) {
     ++			clear_commit_marks(array[i], all_flags);
     ++			clear_commit_marks_many(filled, work, all_flags);
     ++			free_commit_list(common);
     ++			free(work);
     ++			free(redundant);
     ++			free(filled_index);
      +			return -1;
     ++		}
       		if (array[i]->object.flags & PARENT2)
       			redundant[i] = 1;
       		for (j = 0; j < filled; j++)
  6:  0aca08a2cb5 !  6:  05756fbf71a merge_bases_many(): pass on errors from `paint_down_to_common()`
     @@ commit-reach.c: static int paint_down_to_common(struct repository *r,
      +				     oid_to_hex(&twos[i]->object.oid));
       	}
       
     - 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0)
     + 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
     + 		free_commit_list(list);
      -		return NULL;
      +		return -1;
     + 	}
       
       	while (list) {
       		struct commit *commit = pop_commit(&list);
  7:  03734cc09da =  7:  e3d37a326e5 get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
  8:  43c33e2eac4 !  8:  9ca504525b9 repo_get_merge_bases(): pass on errors from `merge_bases_many()`
     @@ object-name.c: int repo_get_oid_mb(struct repository *r,
       	if (!two)
       		return -1;
      -	mbs = repo_get_merge_bases(r, one, two);
     -+	if (repo_get_merge_bases(r, one, two, &mbs) < 0)
     ++	if (repo_get_merge_bases(r, one, two, &mbs) < 0) {
     ++		free_commit_list(mbs);
      +		return -1;
     ++	}
       	if (!mbs || mbs->next)
       		st = -1;
       	else {
     @@ revision.c: static int handle_dotdot_1(const char *arg, char *dotdot,
       			return dotdot_missing(arg, dotdot, revs, symmetric);
       
      -		exclude = repo_get_merge_bases(the_repository, a, b);
     -+		if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0)
     ++		if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0) {
     ++			free_commit_list(exclude);
      +			return -1;
     ++		}
       		add_rev_cmdline_list(revs, exclude, REV_CMD_MERGE_BASE,
       				     flags_exclude);
       		add_pending_commit_list(revs, exclude, flags_exclude);
  9:  7fbf660e371 !  9:  b11879edb73 get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
     @@ builtin/merge-base.c: static int handle_independent(int count, const char **args
       		commit_list_insert(get_commit_reference(args[i]), &revs);
       
      -	result = get_octopus_merge_bases(revs);
     -+	if (get_octopus_merge_bases(revs, &result) < 0)
     ++	if (get_octopus_merge_bases(revs, &result) < 0) {
     ++		free_commit_list(revs);
     ++		free_commit_list(result);
      +		return 128;
     ++	}
       	free_commit_list(revs);
       	reduce_heads_replace(&result);
       
 10:  55757c3a35d = 10:  602a7383f72 repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
 11:  e7fcc96196c ! 11:  96850ed2d69 repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`
     @@ builtin/merge-base.c
      -	result = repo_get_merge_bases_many_dirty(the_repository, rev[0],
      -						 rev_nr - 1, rev + 1);
      +	if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
     -+					    rev_nr - 1, rev + 1, &result) < 0)
     ++					    rev_nr - 1, rev + 1, &result) < 0) {
     ++		free_commit_list(result);
      +		return -1;
     ++	}
       
       	if (!result)
       		return 1;
 12:  33894600ae7 <  -:  ----------- paint_down_to_common(): special-case shallow/partial clones

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v2 01/11] paint_down_to_common: plug two memory leaks
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits Johannes Schindelin via GitGitGadget
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When a commit is missing, we return early (currently pretending that no
merge basis could be found in that case). At that stage, it is possible
that a merge base could have been found already, and added to the
`result`, which is now leaked.

The priority queue has a similar issue: There might still be a commit in
that queue.

Let's release both, to address the potential memory leaks.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/commit-reach.c b/commit-reach.c
index a868a575ea1..7ea916f9ebd 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -105,8 +105,11 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 			parents = parents->next;
 			if ((p->object.flags & flags) == flags)
 				continue;
-			if (repo_parse_commit(r, p))
+			if (repo_parse_commit(r, p)) {
+				clear_prio_queue(&queue);
+				free_commit_list(result);
 				return NULL;
+			}
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
 		}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Currently this function treats unrelated commit histories the same way
as commit histories with missing commit objects.

Typically, missing commit objects constitute a corrupt repository,
though, and should be reported as such. The next commits will make it
so, but there is one exception: In `git fetch --update-shallow` we
_expect_ commit objects to be missing, and we do want to treat the
now-incomplete commit histories as unrelated.

To allow for that, let's introduce an additional parameter that is
passed to `repo_in_merge_bases_many()` to trigger this behavior, and use
it in the two callers in `shallow.c`.

This does not change behavior in this commit, but prepares in an
easy-to-review way for the upcoming changes that will make the merge
base logic more stringent with regards to missing commit objects.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c        | 5 +++--
 commit-reach.h        | 3 ++-
 remote.c              | 2 +-
 shallow.c             | 5 +++--
 t/helper/test-reach.c | 2 +-
 5 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 7ea916f9ebd..ba2df8b8f3a 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -467,7 +467,7 @@ int repo_is_descendant_of(struct repository *r,
 
 			other = with_commit->item;
 			with_commit = with_commit->next;
-			if (repo_in_merge_bases_many(r, other, 1, &commit))
+			if (repo_in_merge_bases_many(r, other, 1, &commit, 0))
 				return 1;
 		}
 		return 0;
@@ -478,7 +478,8 @@ int repo_is_descendant_of(struct repository *r,
  * Is "commit" an ancestor of one of the "references"?
  */
 int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
-			     int nr_reference, struct commit **reference)
+			     int nr_reference, struct commit **reference,
+			     int ignore_missing_commits)
 {
 	struct commit_list *bases;
 	int ret = 0, i;
diff --git a/commit-reach.h b/commit-reach.h
index 35c4da49481..68f81549a44 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -30,7 +30,8 @@ int repo_in_merge_bases(struct repository *r,
 			struct commit *reference);
 int repo_in_merge_bases_many(struct repository *r,
 			     struct commit *commit,
-			     int nr_reference, struct commit **reference);
+			     int nr_reference, struct commit **reference,
+			     int ignore_missing_commits);
 
 /*
  * Takes a list of commits and returns a new list where those
diff --git a/remote.c b/remote.c
index abb24822beb..763c80f4a7d 100644
--- a/remote.c
+++ b/remote.c
@@ -2675,7 +2675,7 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
 		if (MERGE_BASES_BATCH_SIZE < size)
 			size = MERGE_BASES_BATCH_SIZE;
 
-		if ((ret = repo_in_merge_bases_many(the_repository, commit, size, chunk)))
+		if ((ret = repo_in_merge_bases_many(the_repository, commit, size, chunk, 0)))
 			break;
 	}
 
diff --git a/shallow.c b/shallow.c
index ac728cdd778..dfcc1f86a7f 100644
--- a/shallow.c
+++ b/shallow.c
@@ -797,7 +797,7 @@ static void post_assign_shallow(struct shallow_info *info,
 		for (j = 0; j < bitmap_nr; j++)
 			if (bitmap[0][j] &&
 			    /* Step 7, reachability test at commit level */
-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits)) {
+			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1)) {
 				update_refstatus(ref_status, info->ref->nr, *bitmap);
 				dst++;
 				break;
@@ -828,7 +828,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 		si->reachable[c] = repo_in_merge_bases_many(the_repository,
 							    commit,
 							    si->nr_commits,
-							    si->commits);
+							    si->commits,
+							    1);
 		si->need_reachability_test[c] = 0;
 	}
 	return si->reachable[c];
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 3e173399a00..aa816e168ea 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -113,7 +113,7 @@ int cmd__reach(int ac, const char **av)
 		       repo_in_merge_bases(the_repository, A, B));
 	else if (!strcmp(av[1], "in_merge_bases_many"))
 		printf("%s(A,X):%d\n", av[1],
-		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array));
+		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array, 0));
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()`
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-24  0:33     ` Junio C Hamano
  2024-03-01  6:56     ` Jeff King
  2024-02-22 13:21   ` [PATCH v2 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
                     ` (8 subsequent siblings)
  11 siblings, 2 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Some functions in Git's source code follow the convention that returning
a negative value indicates a fatal error, e.g. repository corruption.

Let's use this convention in `repo_in_merge_bases()` to report when one
of the specified commits is missing (i.e. when `repo_parse_commit()`
reports an error).

Also adjust the callers of `repo_in_merge_bases()` to handle such
negative return values.

Note: As of this patch, errors are returned only if any of the specified
merge heads is missing. Over the course of the next patches, missing
commits will also be reported by the `paint_down_to_common()` function,
which is called by `repo_in_merge_bases_many()`, and those errors will
be properly propagated back to the caller at that stage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/branch.c       | 12 +++++--
 builtin/fast-import.c  |  6 +++-
 builtin/fetch.c        |  2 ++
 builtin/log.c          |  7 ++--
 builtin/merge-base.c   |  6 +++-
 builtin/pull.c         |  4 +++
 builtin/receive-pack.c |  6 +++-
 commit-reach.c         | 12 ++++---
 http-push.c            |  5 ++-
 merge-ort.c            | 81 ++++++++++++++++++++++++++++++++++++------
 merge-recursive.c      | 54 +++++++++++++++++++++++-----
 shallow.c              | 18 ++++++----
 12 files changed, 175 insertions(+), 38 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index e7ee9bd0f15..7f9e79237f3 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -161,6 +161,8 @@ static int branch_merged(int kind, const char *name,
 
 	merged = reference_rev ? repo_in_merge_bases(the_repository, rev,
 						     reference_rev) : 0;
+	if (merged < 0)
+		exit(128);
 
 	/*
 	 * After the safety valve is fully redefined to "check with
@@ -169,9 +171,13 @@ static int branch_merged(int kind, const char *name,
 	 * any of the following code, but during the transition period,
 	 * a gentle reminder is in order.
 	 */
-	if ((head_rev != reference_rev) &&
-	    (head_rev ? repo_in_merge_bases(the_repository, rev, head_rev) : 0) != merged) {
-		if (merged)
+	if (head_rev != reference_rev) {
+		int expect = head_rev ? repo_in_merge_bases(the_repository, rev, head_rev) : 0;
+		if (expect < 0)
+			exit(128);
+		if (expect == merged)
+			; /* okay */
+		else if (merged)
 			warning(_("deleting branch '%s' that has been merged to\n"
 				"         '%s', but not yet merged to HEAD"),
 				name, reference_name);
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 444f41cf8ca..14c2efa88fc 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1625,6 +1625,7 @@ static int update_branch(struct branch *b)
 		oidclr(&old_oid);
 	if (!force_update && !is_null_oid(&old_oid)) {
 		struct commit *old_cmit, *new_cmit;
+		int ret;
 
 		old_cmit = lookup_commit_reference_gently(the_repository,
 							  &old_oid, 0);
@@ -1633,7 +1634,10 @@ static int update_branch(struct branch *b)
 		if (!old_cmit || !new_cmit)
 			return error("Branch %s is missing commits.", b->name);
 
-		if (!repo_in_merge_bases(the_repository, old_cmit, new_cmit)) {
+		ret = repo_in_merge_bases(the_repository, old_cmit, new_cmit);
+		if (ret < 0)
+			exit(128);
+		if (!ret) {
 			warning("Not updating %s"
 				" (new tip %s does not contain %s)",
 				b->name, oid_to_hex(&b->oid),
diff --git a/builtin/fetch.c b/builtin/fetch.c
index fd134ba74d9..0584a1f8b64 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -978,6 +978,8 @@ static int update_local_ref(struct ref *ref,
 		uint64_t t_before = getnanotime();
 		fast_forward = repo_in_merge_bases(the_repository, current,
 						   updated);
+		if (fast_forward < 0)
+			exit(128);
 		forced_updates_ms += (getnanotime() - t_before) / 1000000;
 	} else {
 		fast_forward = 1;
diff --git a/builtin/log.c b/builtin/log.c
index ba775d7b5cf..1705da71aca 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1623,7 +1623,7 @@ static struct commit *get_base_commit(const char *base_commit,
 {
 	struct commit *base = NULL;
 	struct commit **rev;
-	int i = 0, rev_nr = 0, auto_select, die_on_failure;
+	int i = 0, rev_nr = 0, auto_select, die_on_failure, ret;
 
 	switch (auto_base) {
 	case AUTO_BASE_NEVER:
@@ -1723,7 +1723,10 @@ static struct commit *get_base_commit(const char *base_commit,
 		rev_nr = DIV_ROUND_UP(rev_nr, 2);
 	}
 
-	if (!repo_in_merge_bases(the_repository, base, rev[0])) {
+	ret = repo_in_merge_bases(the_repository, base, rev[0]);
+	if (ret < 0)
+		exit(128);
+	if (!ret) {
 		if (die_on_failure) {
 			die(_("base commit should be the ancestor of revision list"));
 		} else {
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index e68b7fe45d7..0308fd73289 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -103,12 +103,16 @@ static int handle_octopus(int count, const char **args, int show_all)
 static int handle_is_ancestor(int argc, const char **argv)
 {
 	struct commit *one, *two;
+	int ret;
 
 	if (argc != 2)
 		die("--is-ancestor takes exactly two commits");
 	one = get_commit_reference(argv[0]);
 	two = get_commit_reference(argv[1]);
-	if (repo_in_merge_bases(the_repository, one, two))
+	ret = repo_in_merge_bases(the_repository, one, two);
+	if (ret < 0)
+		exit(128);
+	if (ret)
 		return 0;
 	else
 		return 1;
diff --git a/builtin/pull.c b/builtin/pull.c
index be2b2c9ebc9..e6f2942c0c5 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -931,6 +931,8 @@ static int get_can_ff(struct object_id *orig_head,
 	merge_head = lookup_commit_reference(the_repository, orig_merge_head);
 	ret = repo_is_descendant_of(the_repository, merge_head, list);
 	free_commit_list(list);
+	if (ret < 0)
+		exit(128);
 	return ret;
 }
 
@@ -955,6 +957,8 @@ static int already_up_to_date(struct object_id *orig_head,
 		commit_list_insert(theirs, &list);
 		ok = repo_is_descendant_of(the_repository, ours, list);
 		free_commit_list(list);
+		if (ok < 0)
+			exit(128);
 		if (!ok)
 			return 0;
 	}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8c4f0cb90a9..956fea6293e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1546,6 +1546,7 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 	    starts_with(name, "refs/heads/")) {
 		struct object *old_object, *new_object;
 		struct commit *old_commit, *new_commit;
+		int ret2;
 
 		old_object = parse_object(the_repository, old_oid);
 		new_object = parse_object(the_repository, new_oid);
@@ -1559,7 +1560,10 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 		}
 		old_commit = (struct commit *)old_object;
 		new_commit = (struct commit *)new_object;
-		if (!repo_in_merge_bases(the_repository, old_commit, new_commit)) {
+		ret2 = repo_in_merge_bases(the_repository, old_commit, new_commit);
+		if (ret2 < 0)
+			exit(128);
+		if (!ret2) {
 			rp_error("denying non-fast-forward %s"
 				 " (you should pull first)", name);
 			ret = "non-fast-forward";
diff --git a/commit-reach.c b/commit-reach.c
index ba2df8b8f3a..5ff71d72d51 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -464,11 +464,13 @@ int repo_is_descendant_of(struct repository *r,
 	} else {
 		while (with_commit) {
 			struct commit *other;
+			int ret;
 
 			other = with_commit->item;
 			with_commit = with_commit->next;
-			if (repo_in_merge_bases_many(r, other, 1, &commit, 0))
-				return 1;
+			ret = repo_in_merge_bases_many(r, other, 1, &commit, 0);
+			if (ret)
+				return ret;
 		}
 		return 0;
 	}
@@ -486,10 +488,10 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
 	if (repo_parse_commit(r, commit))
-		return ret;
+		return ignore_missing_commits ? 0 : -1;
 	for (i = 0; i < nr_reference; i++) {
 		if (repo_parse_commit(r, reference[i]))
-			return ret;
+			return ignore_missing_commits ? 0 : -1;
 
 		generation = commit_graph_generation(reference[i]);
 		if (generation > max_generation)
@@ -598,6 +600,8 @@ int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid)
 	commit_list_insert(old_commit, &old_commit_list);
 	ret = repo_is_descendant_of(the_repository,
 				    new_commit, old_commit_list);
+	if (ret < 0)
+		exit(128);
 	free_commit_list(old_commit_list);
 	return ret;
 }
diff --git a/http-push.c b/http-push.c
index a704f490fdb..24c16a4f5ff 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1576,8 +1576,11 @@ static int verify_merge_base(struct object_id *head_oid, struct ref *remote)
 	struct commit *head = lookup_commit_or_die(head_oid, "HEAD");
 	struct commit *branch = lookup_commit_or_die(&remote->old_oid,
 						     remote->name);
+	int ret = repo_in_merge_bases(the_repository, branch, head);
 
-	return repo_in_merge_bases(the_repository, branch, head);
+	if (ret < 0)
+		exit(128);
+	return ret;
 }
 
 static int delete_remote_branch(const char *pattern, int force)
diff --git a/merge-ort.c b/merge-ort.c
index 6491070d965..9f3af46333a 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -544,6 +544,7 @@ enum conflict_and_info_types {
 	CONFLICT_SUBMODULE_HISTORY_NOT_AVAILABLE,
 	CONFLICT_SUBMODULE_MAY_HAVE_REWINDS,
 	CONFLICT_SUBMODULE_NULL_MERGE_BASE,
+	CONFLICT_SUBMODULE_CORRUPT,
 
 	/* Keep this entry _last_ in the list */
 	NB_CONFLICT_TYPES,
@@ -596,7 +597,9 @@ static const char *type_short_descriptions[] = {
 	[CONFLICT_SUBMODULE_MAY_HAVE_REWINDS] =
 		"CONFLICT (submodule may have rewinds)",
 	[CONFLICT_SUBMODULE_NULL_MERGE_BASE] =
-		"CONFLICT (submodule lacks merge base)"
+		"CONFLICT (submodule lacks merge base)",
+	[CONFLICT_SUBMODULE_CORRUPT] =
+		"CONFLICT (submodule corrupt)"
 };
 
 struct logical_conflict_info {
@@ -1710,7 +1713,14 @@ static int find_first_merges(struct repository *repo,
 		die("revision walk setup failed");
 	while ((commit = get_revision(&revs)) != NULL) {
 		struct object *o = &(commit->object);
-		if (repo_in_merge_bases(repo, b, commit))
+		int ret = repo_in_merge_bases(repo, b, commit);
+
+		if (ret < 0) {
+			object_array_clear(&merges);
+			release_revisions(&revs);
+			return ret;
+		}
+		if (ret > 0)
 			add_object_array(o, NULL, &merges);
 	}
 	reset_revision_walk();
@@ -1725,9 +1735,17 @@ static int find_first_merges(struct repository *repo,
 		contains_another = 0;
 		for (j = 0; j < merges.nr; j++) {
 			struct commit *m2 = (struct commit *) merges.objects[j].item;
-			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
-				contains_another = 1;
-				break;
+			if (i != j) {
+				int ret = repo_in_merge_bases(repo, m2, m1);
+				if (ret < 0) {
+					object_array_clear(&merges);
+					release_revisions(&revs);
+					return ret;
+				}
+				if (ret > 0) {
+					contains_another = 1;
+					break;
+				}
 			}
 		}
 
@@ -1749,7 +1767,7 @@ static int merge_submodule(struct merge_options *opt,
 {
 	struct repository subrepo;
 	struct strbuf sb = STRBUF_INIT;
-	int ret = 0;
+	int ret = 0, ret2;
 	struct commit *commit_o, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
@@ -1796,8 +1814,26 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* check whether both changes are forward */
-	if (!repo_in_merge_bases(&subrepo, commit_o, commit_a) ||
-	    !repo_in_merge_bases(&subrepo, commit_o, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_a);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0)
+		ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_b);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (!ret2) {
 		path_msg(opt, CONFLICT_SUBMODULE_MAY_HAVE_REWINDS, 0,
 			 path, NULL, NULL, NULL,
 			 _("Failed to merge submodule %s "
@@ -1807,7 +1843,16 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* Case #1: a is contained in b or vice versa */
-	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0) {
 		oidcpy(result, b);
 		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
 			 path, NULL, NULL, NULL,
@@ -1816,7 +1861,16 @@ static int merge_submodule(struct merge_options *opt,
 		ret = 1;
 		goto cleanup;
 	}
-	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0) {
 		oidcpy(result, a);
 		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
 			 path, NULL, NULL, NULL,
@@ -1841,6 +1895,13 @@ static int merge_submodule(struct merge_options *opt,
 	parent_count = find_first_merges(&subrepo, path, commit_a, commit_b,
 					 &merges);
 	switch (parent_count) {
+	case -1:
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		break;
 	case 0:
 		path_msg(opt, CONFLICT_SUBMODULE_FAILED_TO_MERGE, 0,
 			 path, NULL, NULL, NULL,
diff --git a/merge-recursive.c b/merge-recursive.c
index e3beb0801b1..0d931cc14ad 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1144,7 +1144,13 @@ static int find_first_merges(struct repository *repo,
 		die("revision walk setup failed");
 	while ((commit = get_revision(&revs)) != NULL) {
 		struct object *o = &(commit->object);
-		if (repo_in_merge_bases(repo, b, commit))
+		int ret = repo_in_merge_bases(repo, b, commit);
+		if (ret < 0) {
+			object_array_clear(&merges);
+			release_revisions(&revs);
+			return ret;
+		}
+		if (ret)
 			add_object_array(o, NULL, &merges);
 	}
 	reset_revision_walk();
@@ -1159,9 +1165,17 @@ static int find_first_merges(struct repository *repo,
 		contains_another = 0;
 		for (j = 0; j < merges.nr; j++) {
 			struct commit *m2 = (struct commit *) merges.objects[j].item;
-			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
-				contains_another = 1;
-				break;
+			if (i != j) {
+				int ret = repo_in_merge_bases(repo, m2, m1);
+				if (ret < 0) {
+					object_array_clear(&merges);
+					release_revisions(&revs);
+					return ret;
+				}
+				if (ret > 0) {
+					contains_another = 1;
+					break;
+				}
 			}
 		}
 
@@ -1197,7 +1211,7 @@ static int merge_submodule(struct merge_options *opt,
 			   const struct object_id *b)
 {
 	struct repository subrepo;
-	int ret = 0;
+	int ret = 0, ret2;
 	struct commit *commit_base, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
@@ -1234,14 +1248,29 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* check whether both changes are forward */
-	if (!repo_in_merge_bases(&subrepo, commit_base, commit_a) ||
-	    !repo_in_merge_bases(&subrepo, commit_base, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_a);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2 > 0)
+		ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_b);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (!ret2) {
 		output(opt, 1, _("Failed to merge submodule %s (commits don't follow merge-base)"), path);
 		goto cleanup;
 	}
 
 	/* Case #1: a is contained in b or vice versa */
-	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2) {
 		oidcpy(result, b);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
@@ -1254,7 +1283,12 @@ static int merge_submodule(struct merge_options *opt,
 		ret = 1;
 		goto cleanup;
 	}
-	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2) {
 		oidcpy(result, a);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
@@ -1402,6 +1436,8 @@ static int merge_mode_and_contents(struct merge_options *opt,
 							&o->oid,
 							&a->oid,
 							&b->oid);
+			if (result->clean < 0)
+				return -1;
 		} else if (S_ISLNK(a->mode)) {
 			switch (opt->recursive_variant) {
 			case MERGE_VARIANT_NORMAL:
diff --git a/shallow.c b/shallow.c
index dfcc1f86a7f..f71496f35c3 100644
--- a/shallow.c
+++ b/shallow.c
@@ -795,12 +795,16 @@ static void post_assign_shallow(struct shallow_info *info,
 		if (!*bitmap)
 			continue;
 		for (j = 0; j < bitmap_nr; j++)
-			if (bitmap[0][j] &&
-			    /* Step 7, reachability test at commit level */
-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1)) {
-				update_refstatus(ref_status, info->ref->nr, *bitmap);
-				dst++;
-				break;
+			if (bitmap[0][j]) {
+				/* Step 7, reachability test at commit level */
+				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1);
+				if (ret < 0)
+					exit(128);
+				if (!ret) {
+					update_refstatus(ref_status, info->ref->nr, *bitmap);
+					dst++;
+					break;
+				}
 			}
 	}
 	info->nr_ours = dst;
@@ -830,6 +834,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 							    si->nr_commits,
 							    si->commits,
 							    1);
+		if (si->reachable[c] < 0)
+			exit(128);
 		si->need_reachability_test[c] = 0;
 	}
 	return si->reachable[c];
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 04/11] Prepare `paint_down_to_common()` for handling shallow commits
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (2 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When `git fetch --update-shallow` needs to test for commit ancestry, it
can naturally run into a missing object (e.g. if it is a parent of a
shallow commit). To accommodate, the merge base logic will need to be
able to handle this situation gracefully.

Currently, that logic pretends that a missing parent commit is
equivalent to a missing parent commit, and for the purpose of
`--update-shallow` that is exactly what we need it to do.

Therefore, let's introduce a flag to indicate when that is precisely the
logic we want.

We need a flag, and cannot rely on `is_repository_shallow()` to indicate
that situation, because that function would return 0: There may not
actually be a `shallow` file, as demonstrated e.g. by t5537.10 ("add new
shallow root with receive.updateshallow on") and t5538.4 ("add new
shallow root with receive.updateshallow on").

Note: shallow commits' parents are set to `NULL` internally already,
therefore there is no need to special-case shallow repositories here, as
the merge-base logic will not try to access parent commits of shallow
commits.

Likewise, partial clones aren't an issue either: If a commit is missing
during the revision walk in the merge-base logic, it is fetched via
`promisor_remote_get_direct()`. And not only the single missing commit
object: Due to the way the "promised" objects are fetched (in
`fetch_objects()` in `promisor-remote.c`, using `fetch
--filter=blob:none`), there is no actual way to fetch a single commit
object, as the remote side will pass that commit OID to `pack-objects
--revs [...]` which in turn passes it to `rev-list` which interprets
this as a commit _range_ instead of a single object. Therefore, in
partial clones (unless they are shallow in addition), all commits
reachable from a commit that is in the local object database are also
present in that local database.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 5ff71d72d51..7112b10eeea 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -53,7 +53,8 @@ static int queue_has_nonstale(struct prio_queue *queue)
 static struct commit_list *paint_down_to_common(struct repository *r,
 						struct commit *one, int n,
 						struct commit **twos,
-						timestamp_t min_generation)
+						timestamp_t min_generation,
+						int ignore_missing_commits)
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *result = NULL;
@@ -108,6 +109,13 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 			if (repo_parse_commit(r, p)) {
 				clear_prio_queue(&queue);
 				free_commit_list(result);
+				/*
+				 * At this stage, we know that the commit is
+				 * missing: `repo_parse_commit()` uses
+				 * `OBJECT_INFO_DIE_IF_CORRUPT` and therefore
+				 * corrupt commits would already have been
+				 * dispatched with a `die()`.
+				 */
 				return NULL;
 			}
 			p->object.flags |= flags;
@@ -143,7 +151,7 @@ static struct commit_list *merge_bases_many(struct repository *r,
 			return NULL;
 	}
 
-	list = paint_down_to_common(r, one, n, twos, 0);
+	list = paint_down_to_common(r, one, n, twos, 0, 0);
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -214,7 +222,7 @@ static int remove_redundant_no_gen(struct repository *r,
 				min_generation = curr_generation;
 		}
 		common = paint_down_to_common(r, array[i], filled,
-					      work, min_generation);
+					      work, min_generation, 0);
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -504,7 +512,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 
 	bases = paint_down_to_common(r, commit,
 				     nr_reference, reference,
-				     generation);
+				     generation, ignore_missing_commits);
 	if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 05/11] commit-reach: start reporting errors in `paint_down_to_common()`
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (3 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

If a commit cannot be parsed, it is currently ignored when looking for
merge bases. That's undesirable as the operation can pretend success in
a corrupt repository, even though the command should fail with an error
message.

Let's start at the bottom of the stack by teaching the
`paint_down_to_common()` function to return an `int`: if negative, it
indicates fatal error, if 0 success.

This requires a couple of callers to be adjusted accordingly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 66 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 21 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 7112b10eeea..9148a7dcbc0 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -50,14 +50,14 @@ static int queue_has_nonstale(struct prio_queue *queue)
 }
 
 /* all input commits in one and twos[] must have been parsed! */
-static struct commit_list *paint_down_to_common(struct repository *r,
-						struct commit *one, int n,
-						struct commit **twos,
-						timestamp_t min_generation,
-						int ignore_missing_commits)
+static int paint_down_to_common(struct repository *r,
+				struct commit *one, int n,
+				struct commit **twos,
+				timestamp_t min_generation,
+				int ignore_missing_commits,
+				struct commit_list **result)
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
-	struct commit_list *result = NULL;
 	int i;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
 
@@ -66,8 +66,8 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 
 	one->object.flags |= PARENT1;
 	if (!n) {
-		commit_list_append(one, &result);
-		return result;
+		commit_list_append(one, result);
+		return 0;
 	}
 	prio_queue_put(&queue, one);
 
@@ -95,7 +95,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 		if (flags == (PARENT1 | PARENT2)) {
 			if (!(commit->object.flags & RESULT)) {
 				commit->object.flags |= RESULT;
-				commit_list_insert_by_date(commit, &result);
+				commit_list_insert_by_date(commit, result);
 			}
 			/* Mark parents of a found merge stale */
 			flags |= STALE;
@@ -108,7 +108,8 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 				continue;
 			if (repo_parse_commit(r, p)) {
 				clear_prio_queue(&queue);
-				free_commit_list(result);
+				free_commit_list(*result);
+				*result = NULL;
 				/*
 				 * At this stage, we know that the commit is
 				 * missing: `repo_parse_commit()` uses
@@ -116,7 +117,10 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 				 * corrupt commits would already have been
 				 * dispatched with a `die()`.
 				 */
-				return NULL;
+				if (ignore_missing_commits)
+					return 0;
+				return error(_("could not parse commit %s"),
+					     oid_to_hex(&p->object.oid));
 			}
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
@@ -124,7 +128,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 	}
 
 	clear_prio_queue(&queue);
-	return result;
+	return 0;
 }
 
 static struct commit_list *merge_bases_many(struct repository *r,
@@ -151,7 +155,10 @@ static struct commit_list *merge_bases_many(struct repository *r,
 			return NULL;
 	}
 
-	list = paint_down_to_common(r, one, n, twos, 0, 0);
+	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
+		free_commit_list(list);
+		return NULL;
+	}
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -205,7 +212,7 @@ static int remove_redundant_no_gen(struct repository *r,
 	for (i = 0; i < cnt; i++)
 		repo_parse_commit(r, array[i]);
 	for (i = 0; i < cnt; i++) {
-		struct commit_list *common;
+		struct commit_list *common = NULL;
 		timestamp_t min_generation = commit_graph_generation(array[i]);
 
 		if (redundant[i])
@@ -221,8 +228,16 @@ static int remove_redundant_no_gen(struct repository *r,
 			if (curr_generation < min_generation)
 				min_generation = curr_generation;
 		}
-		common = paint_down_to_common(r, array[i], filled,
-					      work, min_generation, 0);
+		if (paint_down_to_common(r, array[i], filled,
+					 work, min_generation, 0, &common)) {
+			clear_commit_marks(array[i], all_flags);
+			clear_commit_marks_many(filled, work, all_flags);
+			free_commit_list(common);
+			free(work);
+			free(redundant);
+			free(filled_index);
+			return -1;
+		}
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -422,6 +437,10 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	clear_commit_marks_many(n, twos, all_flags);
 
 	cnt = remove_redundant(r, rslt, cnt);
+	if (cnt < 0) {
+		free(rslt);
+		return NULL;
+	}
 	result = NULL;
 	for (i = 0; i < cnt; i++)
 		commit_list_insert_by_date(rslt[i], &result);
@@ -491,7 +510,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 			     int nr_reference, struct commit **reference,
 			     int ignore_missing_commits)
 {
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	int ret = 0, i;
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
@@ -510,10 +529,11 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 	if (generation > max_generation)
 		return ret;
 
-	bases = paint_down_to_common(r, commit,
-				     nr_reference, reference,
-				     generation, ignore_missing_commits);
-	if (commit->object.flags & PARENT2)
+	if (paint_down_to_common(r, commit,
+				 nr_reference, reference,
+				 generation, ignore_missing_commits, &bases))
+		ret = -1;
+	else if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
 	clear_commit_marks_many(nr_reference, reference, all_flags);
@@ -566,6 +586,10 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 		}
 	}
 	num_head = remove_redundant(the_repository, array, num_head);
+	if (num_head < 0) {
+		free(array);
+		return NULL;
+	}
 	for (i = 0; i < num_head; i++)
 		tail = &commit_list_insert(array[i], tail)->next;
 	free(array);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()`
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (4 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `paint_down_to_common()` function was just taught to indicate
parsing errors, and now the `merge_bases_many()` function is aware of
that, too.

One tricky aspect is that `merge_bases_many()` parses commits of its
own, but wants to gracefully handle the scenario where NULL is passed as
a merge head, returning the empty list of merge bases. The way this was
handled involved calling `repo_parse_commit(NULL)` and relying on it to
return an error. This has to be done differently now so that we can
handle missing commits correctly by producing a fatal error.

Next step: adjust the caller of `merge_bases_many()`:
`get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 9148a7dcbc0..2c74583c8e0 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -131,41 +131,49 @@ static int paint_down_to_common(struct repository *r,
 	return 0;
 }
 
-static struct commit_list *merge_bases_many(struct repository *r,
-					    struct commit *one, int n,
-					    struct commit **twos)
+static int merge_bases_many(struct repository *r,
+			    struct commit *one, int n,
+			    struct commit **twos,
+			    struct commit_list **result)
 {
 	struct commit_list *list = NULL;
-	struct commit_list *result = NULL;
 	int i;
 
 	for (i = 0; i < n; i++) {
-		if (one == twos[i])
+		if (one == twos[i]) {
 			/*
 			 * We do not mark this even with RESULT so we do not
 			 * have to clean it up.
 			 */
-			return commit_list_insert(one, &result);
+			*result = commit_list_insert(one, result);
+			return 0;
+		}
 	}
 
+	if (!one)
+		return 0;
 	if (repo_parse_commit(r, one))
-		return NULL;
+		return error(_("could not parse commit %s"),
+			     oid_to_hex(&one->object.oid));
 	for (i = 0; i < n; i++) {
+		if (!twos[i])
+			return 0;
 		if (repo_parse_commit(r, twos[i]))
-			return NULL;
+			return error(_("could not parse commit %s"),
+				     oid_to_hex(&twos[i]->object.oid));
 	}
 
 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
 		free_commit_list(list);
-		return NULL;
+		return -1;
 	}
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
 		if (!(commit->object.flags & STALE))
-			commit_list_insert_by_date(commit, &result);
+			commit_list_insert_by_date(commit, result);
 	}
-	return result;
+	return 0;
 }
 
 struct commit_list *get_octopus_merge_bases(struct commit_list *in)
@@ -410,10 +418,11 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 {
 	struct commit_list *list;
 	struct commit **rslt;
-	struct commit_list *result;
+	struct commit_list *result = NULL;
 	int cnt, i;
 
-	result = merge_bases_many(r, one, n, twos);
+	if (merge_bases_many(r, one, n, twos, &result) < 0)
+		return NULL;
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
 			return result;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (5 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 08/11] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate
parsing errors, and now the `get_merge_bases_many_0()` function is aware
of that, too.

Next step: adjust the callers of `get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 57 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 2c74583c8e0..5fa0abc7d1e 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -410,37 +410,38 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt
 	return remove_redundant_no_gen(r, array, cnt);
 }
 
-static struct commit_list *get_merge_bases_many_0(struct repository *r,
-						  struct commit *one,
-						  int n,
-						  struct commit **twos,
-						  int cleanup)
+static int get_merge_bases_many_0(struct repository *r,
+				  struct commit *one,
+				  int n,
+				  struct commit **twos,
+				  int cleanup,
+				  struct commit_list **result)
 {
 	struct commit_list *list;
 	struct commit **rslt;
-	struct commit_list *result = NULL;
 	int cnt, i;
 
-	if (merge_bases_many(r, one, n, twos, &result) < 0)
-		return NULL;
+	if (merge_bases_many(r, one, n, twos, result) < 0)
+		return -1;
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
-			return result;
+			return 0;
 	}
-	if (!result || !result->next) {
+	if (!*result || !(*result)->next) {
 		if (cleanup) {
 			clear_commit_marks(one, all_flags);
 			clear_commit_marks_many(n, twos, all_flags);
 		}
-		return result;
+		return 0;
 	}
 
 	/* There are more than one */
-	cnt = commit_list_count(result);
+	cnt = commit_list_count(*result);
 	CALLOC_ARRAY(rslt, cnt);
-	for (list = result, i = 0; list; list = list->next)
+	for (list = *result, i = 0; list; list = list->next)
 		rslt[i++] = list->item;
-	free_commit_list(result);
+	free_commit_list(*result);
+	*result = NULL;
 
 	clear_commit_marks(one, all_flags);
 	clear_commit_marks_many(n, twos, all_flags);
@@ -448,13 +449,12 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	cnt = remove_redundant(r, rslt, cnt);
 	if (cnt < 0) {
 		free(rslt);
-		return NULL;
+		return -1;
 	}
-	result = NULL;
 	for (i = 0; i < cnt; i++)
-		commit_list_insert_by_date(rslt[i], &result);
+		commit_list_insert_by_date(rslt[i], result);
 	free(rslt);
-	return result;
+	return 0;
 }
 
 struct commit_list *repo_get_merge_bases_many(struct repository *r,
@@ -462,7 +462,12 @@ struct commit_list *repo_get_merge_bases_many(struct repository *r,
 					      int n,
 					      struct commit **twos)
 {
-	return get_merge_bases_many_0(r, one, n, twos, 1);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, n, twos, 1, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
@@ -470,14 +475,24 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    int n,
 						    struct commit **twos)
 {
-	return get_merge_bases_many_0(r, one, n, twos, 0);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, n, twos, 0, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 struct commit_list *repo_get_merge_bases(struct repository *r,
 					 struct commit *one,
 					 struct commit *two)
 {
-	return get_merge_bases_many_0(r, one, 1, &two, 1);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, 1, &two, 1, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 /*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 08/11] repo_get_merge_bases(): pass on errors from `merge_bases_many()`
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (6 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 09/11] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `repo_get_merge_bases()` macro) is aware of that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next step: adjust the callers of `get_octopus_merge_bases()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/log.c                    | 10 +++++-----
 builtin/merge-tree.c             |  5 +++--
 builtin/merge.c                  | 20 ++++++++++++--------
 builtin/rebase.c                 |  8 +++++---
 builtin/rev-parse.c              |  5 +++--
 commit-reach.c                   | 23 +++++++++++------------
 commit-reach.h                   |  7 ++++---
 diff-lib.c                       |  5 +++--
 log-tree.c                       |  5 +++--
 merge-ort.c                      |  6 +++++-
 merge-recursive.c                |  4 +++-
 notes-merge.c                    |  3 ++-
 object-name.c                    |  7 +++++--
 revision.c                       | 12 ++++++++----
 sequencer.c                      |  8 ++++++--
 submodule.c                      |  7 ++++++-
 t/t4301-merge-tree-write-tree.sh | 12 ++++++++++++
 17 files changed, 96 insertions(+), 51 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 1705da71aca..befafd6ae04 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1702,11 +1702,11 @@ static struct commit *get_base_commit(const char *base_commit,
 	 */
 	while (rev_nr > 1) {
 		for (i = 0; i < rev_nr / 2; i++) {
-			struct commit_list *merge_base;
-			merge_base = repo_get_merge_bases(the_repository,
-							  rev[2 * i],
-							  rev[2 * i + 1]);
-			if (!merge_base || merge_base->next) {
+			struct commit_list *merge_base = NULL;
+			if (repo_get_merge_bases(the_repository,
+						 rev[2 * i],
+						 rev[2 * i + 1], &merge_base) < 0 ||
+			    !merge_base || merge_base->next) {
 				if (die_on_failure) {
 					die(_("failed to find exact merge base"));
 				} else {
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index a35e0452d66..76200250629 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -463,8 +463,9 @@ static int real_merge(struct merge_tree_options *o,
 		 * Get the merge bases, in reverse order; see comment above
 		 * merge_incore_recursive in merge-ort.h
 		 */
-		merge_bases = repo_get_merge_bases(the_repository, parent1,
-						   parent2);
+		if (repo_get_merge_bases(the_repository, parent1,
+					 parent2, &merge_bases) < 0)
+			exit(128);
 		if (!merge_bases && !o->allow_unrelated_histories)
 			die(_("refusing to merge unrelated histories"));
 		merge_bases = reverse_commit_list(merge_bases);
diff --git a/builtin/merge.c b/builtin/merge.c
index d748d46e135..ac9d58adc29 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1517,10 +1517,13 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 
 	if (!remoteheads)
 		; /* already up-to-date */
-	else if (!remoteheads->next)
-		common = repo_get_merge_bases(the_repository, head_commit,
-					      remoteheads->item);
-	else {
+	else if (!remoteheads->next) {
+		if (repo_get_merge_bases(the_repository, head_commit,
+					 remoteheads->item, &common) < 0) {
+			ret = 2;
+			goto done;
+		}
+	} else {
 		struct commit_list *list = remoteheads;
 		commit_list_insert(head_commit, &list);
 		common = get_octopus_merge_bases(list);
@@ -1631,7 +1634,7 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 		struct commit_list *j;
 
 		for (j = remoteheads; j; j = j->next) {
-			struct commit_list *common_one;
+			struct commit_list *common_one = NULL;
 			struct commit *common_item;
 
 			/*
@@ -1639,9 +1642,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 			 * merge_bases again, otherwise "git merge HEAD^
 			 * HEAD^^" would be missed.
 			 */
-			common_one = repo_get_merge_bases(the_repository,
-							  head_commit,
-							  j->item);
+			if (repo_get_merge_bases(the_repository, head_commit,
+						 j->item, &common_one) < 0)
+				exit(128);
+
 			common_item = common_one->item;
 			free_commit_list(common_one);
 			if (!oideq(&common_item->object.oid, &j->item->object.oid)) {
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 043c65dccd9..06a55fc7325 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -879,7 +879,8 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
 	if (!upstream)
 		goto done;
 
-	merge_bases = repo_get_merge_bases(the_repository, upstream, head);
+	if (repo_get_merge_bases(the_repository, upstream, head, &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases || merge_bases->next)
 		goto done;
 
@@ -898,8 +899,9 @@ static void fill_branch_base(struct rebase_options *options,
 {
 	struct commit_list *merge_bases = NULL;
 
-	merge_bases = repo_get_merge_bases(the_repository, options->onto,
-					   options->orig_head);
+	if (repo_get_merge_bases(the_repository, options->onto,
+				 options->orig_head, &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases || merge_bases->next)
 		oidcpy(branch_base, null_oid());
 	else
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index fde8861ca4e..c97d0f6144c 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -297,7 +297,7 @@ static int try_difference(const char *arg)
 		show_rev(NORMAL, &end_oid, end);
 		show_rev(symmetric ? NORMAL : REVERSED, &start_oid, start);
 		if (symmetric) {
-			struct commit_list *exclude;
+			struct commit_list *exclude = NULL;
 			struct commit *a, *b;
 			a = lookup_commit_reference(the_repository, &start_oid);
 			b = lookup_commit_reference(the_repository, &end_oid);
@@ -305,7 +305,8 @@ static int try_difference(const char *arg)
 				*dotdot = '.';
 				return 0;
 			}
-			exclude = repo_get_merge_bases(the_repository, a, b);
+			if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0)
+				exit(128);
 			while (exclude) {
 				struct commit *commit = pop_commit(&exclude);
 				show_rev(REVERSED, &commit->object.oid, NULL);
diff --git a/commit-reach.c b/commit-reach.c
index 5fa0abc7d1e..10e625ff51b 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -189,9 +189,12 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 		struct commit_list *new_commits = NULL, *end = NULL;
 
 		for (j = ret; j; j = j->next) {
-			struct commit_list *bases;
-			bases = repo_get_merge_bases(the_repository, i->item,
-						     j->item);
+			struct commit_list *bases = NULL;
+			if (repo_get_merge_bases(the_repository, i->item,
+						 j->item, &bases) < 0) {
+				free_commit_list(bases);
+				return NULL;
+			}
 			if (!new_commits)
 				new_commits = bases;
 			else
@@ -483,16 +486,12 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 	return result;
 }
 
-struct commit_list *repo_get_merge_bases(struct repository *r,
-					 struct commit *one,
-					 struct commit *two)
+int repo_get_merge_bases(struct repository *r,
+			 struct commit *one,
+			 struct commit *two,
+			 struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, 1, &two, 1, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, 1, &two, 1, result);
 }
 
 /*
diff --git a/commit-reach.h b/commit-reach.h
index 68f81549a44..2c6fcdd34f6 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -9,9 +9,10 @@ struct ref_filter;
 struct object_id;
 struct object_array;
 
-struct commit_list *repo_get_merge_bases(struct repository *r,
-					 struct commit *rev1,
-					 struct commit *rev2);
+int repo_get_merge_bases(struct repository *r,
+			 struct commit *rev1,
+			 struct commit *rev2,
+			 struct commit_list **result);
 struct commit_list *repo_get_merge_bases_many(struct repository *r,
 					      struct commit *one, int n,
 					      struct commit **twos);
diff --git a/diff-lib.c b/diff-lib.c
index 0e9ec4f68af..498224ccce2 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -565,7 +565,7 @@ void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb)
 {
 	int i;
 	struct commit *mb_child[2] = {0};
-	struct commit_list *merge_bases;
+	struct commit_list *merge_bases = NULL;
 
 	for (i = 0; i < revs->pending.nr; i++) {
 		struct object *obj = revs->pending.objects[i].item;
@@ -592,7 +592,8 @@ void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb)
 		mb_child[1] = lookup_commit_reference(the_repository, &oid);
 	}
 
-	merge_bases = repo_get_merge_bases(the_repository, mb_child[0], mb_child[1]);
+	if (repo_get_merge_bases(the_repository, mb_child[0], mb_child[1], &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases)
 		die(_("no merge base found"));
 	if (merge_bases->next)
diff --git a/log-tree.c b/log-tree.c
index 504da6b519e..4f337766a39 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1010,7 +1010,7 @@ static int do_remerge_diff(struct rev_info *opt,
 			   struct object_id *oid)
 {
 	struct merge_options o;
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	struct merge_result res = {0};
 	struct pretty_print_context ctx = {0};
 	struct commit *parent1 = parents->item;
@@ -1035,7 +1035,8 @@ static int do_remerge_diff(struct rev_info *opt,
 	/* Parse the relevant commits and get the merge bases */
 	parse_commit_or_die(parent1);
 	parse_commit_or_die(parent2);
-	bases = repo_get_merge_bases(the_repository, parent1, parent2);
+	if (repo_get_merge_bases(the_repository, parent1, parent2, &bases) < 0)
+		exit(128);
 
 	/* Re-merge the parents */
 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
diff --git a/merge-ort.c b/merge-ort.c
index 9f3af46333a..90d8495ca1f 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -5066,7 +5066,11 @@ static void merge_ort_internal(struct merge_options *opt,
 	struct strbuf merge_base_abbrev = STRBUF_INIT;
 
 	if (!merge_bases) {
-		merge_bases = repo_get_merge_bases(the_repository, h1, h2);
+		if (repo_get_merge_bases(the_repository, h1, h2,
+					 &merge_bases) < 0) {
+			result->clean = -1;
+			return;
+		}
 		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
 		merge_bases = reverse_commit_list(merge_bases);
 	}
diff --git a/merge-recursive.c b/merge-recursive.c
index 0d931cc14ad..d609373d960 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3638,7 +3638,9 @@ static int merge_recursive_internal(struct merge_options *opt,
 	}
 
 	if (!merge_bases) {
-		merge_bases = repo_get_merge_bases(the_repository, h1, h2);
+		if (repo_get_merge_bases(the_repository, h1, h2,
+					 &merge_bases) < 0)
+			return -1;
 		merge_bases = reverse_commit_list(merge_bases);
 	}
 
diff --git a/notes-merge.c b/notes-merge.c
index 8799b522a55..51282934ae6 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -607,7 +607,8 @@ int notes_merge(struct notes_merge_options *o,
 	assert(local && remote);
 
 	/* Find merge bases */
-	bases = repo_get_merge_bases(the_repository, local, remote);
+	if (repo_get_merge_bases(the_repository, local, remote, &bases) < 0)
+		exit(128);
 	if (!bases) {
 		base_oid = null_oid();
 		base_tree_oid = the_hash_algo->empty_tree;
diff --git a/object-name.c b/object-name.c
index 0bfa29dbbfe..63bec6d9a2b 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1481,7 +1481,7 @@ int repo_get_oid_mb(struct repository *r,
 		    struct object_id *oid)
 {
 	struct commit *one, *two;
-	struct commit_list *mbs;
+	struct commit_list *mbs = NULL;
 	struct object_id oid_tmp;
 	const char *dots;
 	int st;
@@ -1509,7 +1509,10 @@ int repo_get_oid_mb(struct repository *r,
 	two = lookup_commit_reference_gently(r, &oid_tmp, 0);
 	if (!two)
 		return -1;
-	mbs = repo_get_merge_bases(r, one, two);
+	if (repo_get_merge_bases(r, one, two, &mbs) < 0) {
+		free_commit_list(mbs);
+		return -1;
+	}
 	if (!mbs || mbs->next)
 		st = -1;
 	else {
diff --git a/revision.c b/revision.c
index 00d5c29bfce..eb0d550842f 100644
--- a/revision.c
+++ b/revision.c
@@ -1965,7 +1965,7 @@ static void add_pending_commit_list(struct rev_info *revs,
 
 static void prepare_show_merge(struct rev_info *revs)
 {
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	struct commit *head, *other;
 	struct object_id oid;
 	const char **prune = NULL;
@@ -1980,7 +1980,8 @@ static void prepare_show_merge(struct rev_info *revs)
 	other = lookup_commit_or_die(&oid, "MERGE_HEAD");
 	add_pending_object(revs, &head->object, "HEAD");
 	add_pending_object(revs, &other->object, "MERGE_HEAD");
-	bases = repo_get_merge_bases(the_repository, head, other);
+	if (repo_get_merge_bases(the_repository, head, other, &bases) < 0)
+		exit(128);
 	add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING | BOTTOM);
 	add_pending_commit_list(revs, bases, UNINTERESTING | BOTTOM);
 	free_commit_list(bases);
@@ -2068,14 +2069,17 @@ static int handle_dotdot_1(const char *arg, char *dotdot,
 	} else {
 		/* A...B -- find merge bases between the two */
 		struct commit *a, *b;
-		struct commit_list *exclude;
+		struct commit_list *exclude = NULL;
 
 		a = lookup_commit_reference(revs->repo, &a_obj->oid);
 		b = lookup_commit_reference(revs->repo, &b_obj->oid);
 		if (!a || !b)
 			return dotdot_missing(arg, dotdot, revs, symmetric);
 
-		exclude = repo_get_merge_bases(the_repository, a, b);
+		if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0) {
+			free_commit_list(exclude);
+			return -1;
+		}
 		add_rev_cmdline_list(revs, exclude, REV_CMD_MERGE_BASE,
 				     flags_exclude);
 		add_pending_commit_list(revs, exclude, flags_exclude);
diff --git a/sequencer.c b/sequencer.c
index d584cac8ed9..4417f2f1956 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3913,7 +3913,7 @@ static int do_merge(struct repository *r,
 	int run_commit_flags = 0;
 	struct strbuf ref_name = STRBUF_INIT;
 	struct commit *head_commit, *merge_commit, *i;
-	struct commit_list *bases, *j;
+	struct commit_list *bases = NULL, *j;
 	struct commit_list *to_merge = NULL, **tail = &to_merge;
 	const char *strategy = !opts->xopts.nr &&
 		(!opts->strategy ||
@@ -4139,7 +4139,11 @@ static int do_merge(struct repository *r,
 	}
 
 	merge_commit = to_merge->item;
-	bases = repo_get_merge_bases(r, head_commit, merge_commit);
+	if (repo_get_merge_bases(r, head_commit, merge_commit, &bases) < 0) {
+		ret = -1;
+		goto leave_merge;
+	}
+
 	if (bases && oideq(&merge_commit->object.oid,
 			   &bases->item->object.oid)) {
 		ret = 0;
diff --git a/submodule.c b/submodule.c
index e603a19a876..04931a5474b 100644
--- a/submodule.c
+++ b/submodule.c
@@ -595,7 +595,12 @@ static void show_submodule_header(struct diff_options *o,
 	     (!is_null_oid(two) && !*right))
 		message = "(commits not present)";
 
-	*merge_bases = repo_get_merge_bases(sub, *left, *right);
+	*merge_bases = NULL;
+	if (repo_get_merge_bases(sub, *left, *right, merge_bases) < 0) {
+		message = "(corrupt repository)";
+		goto output_header;
+	}
+
 	if (*merge_bases) {
 		if ((*merge_bases)->item == *left)
 			fast_forward = 1;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index b2c8a43fce3..5d1e7aca4c8 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -945,4 +945,16 @@ test_expect_success 'check the input format when --stdin is passed' '
 	test_cmp expect actual
 '
 
+test_expect_success 'error out on missing commits as well' '
+	git init --bare missing-commit.git &&
+	git rev-list --objects side1 side3 >list-including-initial &&
+	grep -v ^$(git rev-parse side1^) <list-including-initial >list &&
+	git pack-objects missing-commit.git/objects/pack/missing-initial <list &&
+	side1=$(git rev-parse side1) &&
+	side3=$(git rev-parse side3) &&
+	test_must_fail git --git-dir=missing-commit.git \
+		merge-tree --allow-unrelated-histories $side1 $side3 >actual &&
+	test_must_be_empty actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 09/11] get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (7 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 08/11] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 10/11] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `get_merge_bases()` macro) is aware of that, too.

Naturally, the callers need to be adjusted now, too.

Next step: adjust `repo_get_merge_bases_many()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/merge-base.c |  8 ++++++--
 builtin/merge.c      |  6 +++++-
 builtin/pull.c       |  5 +++--
 commit-reach.c       | 20 +++++++++++---------
 commit-reach.h       |  2 +-
 5 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 0308fd73289..2edffc5487e 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -77,13 +77,17 @@ static int handle_independent(int count, const char **args)
 static int handle_octopus(int count, const char **args, int show_all)
 {
 	struct commit_list *revs = NULL;
-	struct commit_list *result, *rev;
+	struct commit_list *result = NULL, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = get_octopus_merge_bases(revs);
+	if (get_octopus_merge_bases(revs, &result) < 0) {
+		free_commit_list(revs);
+		free_commit_list(result);
+		return 128;
+	}
 	free_commit_list(revs);
 	reduce_heads_replace(&result);
 
diff --git a/builtin/merge.c b/builtin/merge.c
index ac9d58adc29..94c5b693972 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1526,7 +1526,11 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	} else {
 		struct commit_list *list = remoteheads;
 		commit_list_insert(head_commit, &list);
-		common = get_octopus_merge_bases(list);
+		if (get_octopus_merge_bases(list, &common) < 0) {
+			free(list);
+			ret = 2;
+			goto done;
+		}
 		free(list);
 	}
 
diff --git a/builtin/pull.c b/builtin/pull.c
index e6f2942c0c5..0c5a55f2f4d 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -820,7 +820,7 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 		const struct object_id *merge_head,
 		const struct object_id *fork_point)
 {
-	struct commit_list *revs = NULL, *result;
+	struct commit_list *revs = NULL, *result = NULL;
 
 	commit_list_insert(lookup_commit_reference(the_repository, curr_head),
 			   &revs);
@@ -830,7 +830,8 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 		commit_list_insert(lookup_commit_reference(the_repository, fork_point),
 				   &revs);
 
-	result = get_octopus_merge_bases(revs);
+	if (get_octopus_merge_bases(revs, &result) < 0)
+		exit(128);
 	free_commit_list(revs);
 	reduce_heads_replace(&result);
 
diff --git a/commit-reach.c b/commit-reach.c
index 10e625ff51b..fa21a8f2f6b 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -176,24 +176,26 @@ static int merge_bases_many(struct repository *r,
 	return 0;
 }
 
-struct commit_list *get_octopus_merge_bases(struct commit_list *in)
+int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result)
 {
-	struct commit_list *i, *j, *k, *ret = NULL;
+	struct commit_list *i, *j, *k;
 
 	if (!in)
-		return ret;
+		return 0;
 
-	commit_list_insert(in->item, &ret);
+	commit_list_insert(in->item, result);
 
 	for (i = in->next; i; i = i->next) {
 		struct commit_list *new_commits = NULL, *end = NULL;
 
-		for (j = ret; j; j = j->next) {
+		for (j = *result; j; j = j->next) {
 			struct commit_list *bases = NULL;
 			if (repo_get_merge_bases(the_repository, i->item,
 						 j->item, &bases) < 0) {
 				free_commit_list(bases);
-				return NULL;
+				free_commit_list(*result);
+				*result = NULL;
+				return -1;
 			}
 			if (!new_commits)
 				new_commits = bases;
@@ -202,10 +204,10 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 			for (k = bases; k; k = k->next)
 				end = k;
 		}
-		free_commit_list(ret);
-		ret = new_commits;
+		free_commit_list(*result);
+		*result = new_commits;
 	}
-	return ret;
+	return 0;
 }
 
 static int remove_redundant_no_gen(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 2c6fcdd34f6..4690b6ecd0c 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -21,7 +21,7 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    struct commit *one, int n,
 						    struct commit **twos);
 
-struct commit_list *get_octopus_merge_bases(struct commit_list *in);
+int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
 
 int repo_is_descendant_of(struct repository *r,
 			  struct commit *commit,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 10/11] repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (8 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 09/11] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-22 13:21   ` [PATCH v2 11/11] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many()` function is aware of
that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next stop: `repo_get_merge_bases_dirty()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 bisect.c              |  7 ++++---
 builtin/log.c         | 13 +++++++------
 commit-reach.c        | 16 ++++++----------
 commit-reach.h        |  7 ++++---
 commit.c              |  7 ++++---
 t/helper/test-reach.c |  9 ++++++---
 6 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/bisect.c b/bisect.c
index 1be8e0a2711..2018466d69f 100644
--- a/bisect.c
+++ b/bisect.c
@@ -851,10 +851,11 @@ static void handle_skipped_merge_base(const struct object_id *mb)
 static enum bisect_error check_merge_bases(int rev_nr, struct commit **rev, int no_checkout)
 {
 	enum bisect_error res = BISECT_OK;
-	struct commit_list *result;
+	struct commit_list *result = NULL;
 
-	result = repo_get_merge_bases_many(the_repository, rev[0], rev_nr - 1,
-					   rev + 1);
+	if (repo_get_merge_bases_many(the_repository, rev[0], rev_nr - 1,
+				      rev + 1, &result) < 0)
+		exit(128);
 
 	for (; result; result = result->next) {
 		const struct object_id *mb = &result->item->object.oid;
diff --git a/builtin/log.c b/builtin/log.c
index befafd6ae04..c75790a7cec 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1656,7 +1656,7 @@ static struct commit *get_base_commit(const char *base_commit,
 		struct branch *curr_branch = branch_get(NULL);
 		const char *upstream = branch_get_upstream(curr_branch, NULL);
 		if (upstream) {
-			struct commit_list *base_list;
+			struct commit_list *base_list = NULL;
 			struct commit *commit;
 			struct object_id oid;
 
@@ -1667,11 +1667,12 @@ static struct commit *get_base_commit(const char *base_commit,
 					return NULL;
 			}
 			commit = lookup_commit_or_die(&oid, "upstream base");
-			base_list = repo_get_merge_bases_many(the_repository,
-							      commit, total,
-							      list);
-			/* There should be one and only one merge base. */
-			if (!base_list || base_list->next) {
+			if (repo_get_merge_bases_many(the_repository,
+						      commit, total,
+						      list,
+						      &base_list) < 0 ||
+			    /* There should be one and only one merge base. */
+			    !base_list || base_list->next) {
 				if (die_on_failure) {
 					die(_("could not find exact merge base"));
 				} else {
diff --git a/commit-reach.c b/commit-reach.c
index fa21a8f2f6b..954a05399f1 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -462,17 +462,13 @@ static int get_merge_bases_many_0(struct repository *r,
 	return 0;
 }
 
-struct commit_list *repo_get_merge_bases_many(struct repository *r,
-					      struct commit *one,
-					      int n,
-					      struct commit **twos)
+int repo_get_merge_bases_many(struct repository *r,
+			      struct commit *one,
+			      int n,
+			      struct commit **twos,
+			      struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, n, twos, 1, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, n, twos, 1, result);
 }
 
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 4690b6ecd0c..458043f4d58 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -13,9 +13,10 @@ int repo_get_merge_bases(struct repository *r,
 			 struct commit *rev1,
 			 struct commit *rev2,
 			 struct commit_list **result);
-struct commit_list *repo_get_merge_bases_many(struct repository *r,
-					      struct commit *one, int n,
-					      struct commit **twos);
+int repo_get_merge_bases_many(struct repository *r,
+			      struct commit *one, int n,
+			      struct commit **twos,
+			      struct commit_list **result);
 /* To be used only when object flags after this call no longer matter */
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    struct commit *one, int n,
diff --git a/commit.c b/commit.c
index 8405d7c3fce..00add5d81c6 100644
--- a/commit.c
+++ b/commit.c
@@ -1054,7 +1054,7 @@ struct commit *get_fork_point(const char *refname, struct commit *commit)
 {
 	struct object_id oid;
 	struct rev_collect revs;
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	int i;
 	struct commit *ret = NULL;
 	char *full_refname;
@@ -1079,8 +1079,9 @@ struct commit *get_fork_point(const char *refname, struct commit *commit)
 	for (i = 0; i < revs.nr; i++)
 		revs.commit[i]->object.flags &= ~TMP_MARK;
 
-	bases = repo_get_merge_bases_many(the_repository, commit, revs.nr,
-					  revs.commit);
+	if (repo_get_merge_bases_many(the_repository, commit, revs.nr,
+				      revs.commit, &bases) < 0)
+		exit(128);
 
 	/*
 	 * There should be one and only one merge base, when we found
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index aa816e168ea..84ee9da8681 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -117,9 +117,12 @@ int cmd__reach(int ac, const char **av)
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
-		struct commit_list *list = repo_get_merge_bases_many(the_repository,
-								     A, X_nr,
-								     X_array);
+		struct commit_list *list = NULL;
+		if (repo_get_merge_bases_many(the_repository,
+					      A, X_nr,
+					      X_array,
+					      &list) < 0)
+			exit(128);
 		printf("%s(A,X):\n", av[1]);
 		print_sorted_commit_ids(list);
 	} else if (!strcmp(av[1], "reduce_heads")) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 11/11] repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (9 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 10/11] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
@ 2024-02-22 13:21   ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-22 13:21 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many_dirty()` function is
aware of that, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/merge-base.c |  9 ++++++---
 commit-reach.c       | 16 ++++++----------
 commit-reach.h       |  7 ++++---
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 2edffc5487e..a8a1ca53968 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -13,10 +13,13 @@
 
 static int show_merge_base(struct commit **rev, int rev_nr, int show_all)
 {
-	struct commit_list *result, *r;
+	struct commit_list *result = NULL, *r;
 
-	result = repo_get_merge_bases_many_dirty(the_repository, rev[0],
-						 rev_nr - 1, rev + 1);
+	if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
+					    rev_nr - 1, rev + 1, &result) < 0) {
+		free_commit_list(result);
+		return -1;
+	}
 
 	if (!result)
 		return 1;
diff --git a/commit-reach.c b/commit-reach.c
index 954a05399f1..2c69cb83d6f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -471,17 +471,13 @@ int repo_get_merge_bases_many(struct repository *r,
 	return get_merge_bases_many_0(r, one, n, twos, 1, result);
 }
 
-struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
-						    struct commit *one,
-						    int n,
-						    struct commit **twos)
+int repo_get_merge_bases_many_dirty(struct repository *r,
+				    struct commit *one,
+				    int n,
+				    struct commit **twos,
+				    struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, n, twos, 0, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, n, twos, 0, result);
 }
 
 int repo_get_merge_bases(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 458043f4d58..bf63cc468fd 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -18,9 +18,10 @@ int repo_get_merge_bases_many(struct repository *r,
 			      struct commit **twos,
 			      struct commit_list **result);
 /* To be used only when object flags after this call no longer matter */
-struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
-						    struct commit *one, int n,
-						    struct commit **twos);
+int repo_get_merge_bases_many_dirty(struct repository *r,
+				    struct commit *one, int n,
+				    struct commit **twos,
+				    struct commit_list **result);
 
 int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
 
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()`
  2024-02-22 13:21   ` [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
@ 2024-02-24  0:33     ` Junio C Hamano
  2024-02-26  9:34       ` Johannes Schindelin
  2024-03-01  6:56     ` Jeff King
  1 sibling, 1 reply; 70+ messages in thread
From: Junio C Hamano @ 2024-02-24  0:33 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Patrick Steinhardt, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Some functions in Git's source code follow the convention that returning
> a negative value indicates a fatal error, e.g. repository corruption.
>
> Let's use this convention in `repo_in_merge_bases()` to report when one
> of the specified commits is missing (i.e. when `repo_parse_commit()`
> reports an error).
>
> Also adjust the callers of `repo_in_merge_bases()` to handle such
> negative return values.

All of the above makes sense, but I have to wonder if this hunk
should rather want to be part of the previous step:

> @@ -486,10 +488,10 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
>  	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
>  
>  	if (repo_parse_commit(r, commit))
> -		return ret;
> +		return ignore_missing_commits ? 0 : -1;
>  	for (i = 0; i < nr_reference; i++) {
>  		if (repo_parse_commit(r, reference[i]))
> -			return ret;
> +			return ignore_missing_commits ? 0 : -1;
>  
>  		generation = commit_graph_generation(reference[i]);
>  		if (generation > max_generation)

as this hunk is not about many callers of repo_in_merge_bases() that
ignored the return values, which are all fixed by this patch, but
about returning that error signal back to the caller.

Yes, I know you wrote in [02/11] that it does not change the
behaviour, and if you move this hunk to [02/11], it might change the
behaviour, but that is changing for the better.  Besides, adding a
parameter "ignore_missing" to the function only to be ignored until
the next patch feels rather incomplete.

The other changes in this patch about its primary theme, fixing the
callers that used to ignore return values of repo_in_merge_bases(),
all looked sensible.  This hunk somehow stood out like a sore thumb
to me.



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()`
  2024-02-24  0:33     ` Junio C Hamano
@ 2024-02-26  9:34       ` Johannes Schindelin
  2024-02-26 20:01         ` Junio C Hamano
  0 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin @ 2024-02-26  9:34 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, Patrick Steinhardt

Hi Junio,

On Fri, 23 Feb 2024, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > Some functions in Git's source code follow the convention that returning
> > a negative value indicates a fatal error, e.g. repository corruption.
> >
> > Let's use this convention in `repo_in_merge_bases()` to report when one
> > of the specified commits is missing (i.e. when `repo_parse_commit()`
> > reports an error).
> >
> > Also adjust the callers of `repo_in_merge_bases()` to handle such
> > negative return values.
>
> All of the above makes sense, but I have to wonder if this hunk
> should rather want to be part of the previous step:
>
> > @@ -486,10 +488,10 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
> >  	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
> >
> >  	if (repo_parse_commit(r, commit))
> > -		return ret;
> > +		return ignore_missing_commits ? 0 : -1;
> >  	for (i = 0; i < nr_reference; i++) {
> >  		if (repo_parse_commit(r, reference[i]))
> > -			return ret;
> > +			return ignore_missing_commits ? 0 : -1;
> >
> >  		generation = commit_graph_generation(reference[i]);
> >  		if (generation > max_generation)
>
> as this hunk is not about many callers of repo_in_merge_bases() that
> ignored the return values, which are all fixed by this patch, but
> about returning that error signal back to the caller.
>
> Yes, I know you wrote in [02/11] that it does not change the
> behaviour, and if you move this hunk to [02/11], it might change the
> behaviour, but that is changing for the better.

I wanted 2/11 to be trivial to review, and therefore specifically wanted
behavior not to change just yet: At least when I review patches, this
information helps me assess the correctness of the patch because I have a
different pair of glasses on, so to say.

> Besides, adding a parameter "ignore_missing" to the function only to be
> ignored until the next patch feels rather incomplete.

By that reasoning, the entire patch series should be squashed into a
single patch, as the missing commits will only be handled properly if all
11 patches are applied ;-)

Seriously again, I designed this patch series in a way where it builds up
incrementally, adding preparations here and there, in as easily reviewable
a shape as I could, until the final patch wraps everything in a bow.

I did this mostly to be able to convince myself of the correctness of the
patches because I sense such a vast opportunity for bugs to creep in.

> The other changes in this patch about its primary theme, fixing the
> callers that used to ignore return values of repo_in_merge_bases(),
> all looked sensible.  This hunk somehow stood out like a sore thumb
> to me.

I have an idea. How about pulling out this hunk into its own patch? And
insert it between 2/11 and 3/11? That would probably make most sense, as
it would make the patch series still (relatively) easy to review, and it
would not conflate the purpose of this hunk with the rest of 3/11's hunks.

What do you think?

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()`
  2024-02-26  9:34       ` Johannes Schindelin
@ 2024-02-26 20:01         ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2024-02-26 20:01 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Patrick Steinhardt

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I have an idea. How about pulling out this hunk into its own patch? And
> insert it between 2/11 and 3/11? That would probably make most sense, as
> it would make the patch series still (relatively) easy to review, and it
> would not conflate the purpose of this hunk with the rest of 3/11's hunks.
>
> What do you think?

It does not solve the primary issue that [2/11] feels incomplete and
makes readers wonder what the new parameter that is unused is about.
So unless you are squashing that small bit into [2/11], I do not think
such a churn is worth that much.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 00/11] The merge-base logic vs missing commit objects
  2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
                     ` (10 preceding siblings ...)
  2024-02-22 13:21   ` [PATCH v2 11/11] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28   ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
                       ` (11 more replies)
  11 siblings, 12 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin

This patch series is in the same spirit as what I proposed in
https://lore.kernel.org/git/pull.1651.git.1707212981.gitgitgadget@gmail.com/,
where I tackled missing tree objects. This here patch series tackles missing
commit objects instead: Git's merge-base logic handles these missing commit
objects as if there had not been any commit object at all, i.e. if two
commits' merge base commit is missing, they will be treated as if they had
no common commit history at all, which is a bug. Those commit objects should
not be missing, of course, i.e. this is only a problem in corrupt
repositories.

This patch series is a bit more tricky than the "missing tree objects" one,
though: The function signature of quite a few functions need to be changed
to allow callers to discern between "missing commit object" vs "no
merge-base found".

And of course it gets even more tricky because in shallow and partial clones
we expect commit objects to be missing, and that must not be treated like an
error but the existing logic is actually desirable in those scenarios.

I am deeply sorry both about the length of this patch series as well as
having to lean so heavily on reviews on the Git mailing list.

Changes since v2:

 * Moved a hunk from 3/11 to 2/11 that lets the repo_in_merge_bases_many()
   function return an error if non-existing commits have been passed to it,
   unless the ignore_missing_commits parameter is non-zero.
 * The end result is tree-same.

Changes since v1:

 * Addressed a lot of memory leaks.
 * Reordered patch 2 and 3 to render the commit message's comment about
   unchanged behavior true.
 * Fixed the incorrectly converted condition in the merge_submodule()
   function.
 * The last patch ("paint_down_to_common(): special-case shallow/partial
   clones") was dropped because it is not actually necessary, and the
   explanation for that was added to the commit message of "Prepare
   paint_down_to_common() for handling shallow commits".
 * An inconsistently-named variable i was renamed to be consistent with the
   other variables called ret.

Johannes Schindelin (11):
  paint_down_to_common: plug two memory leaks
  Prepare `repo_in_merge_bases_many()` to optionally expect missing
    commits
  Start reporting missing commits in `repo_in_merge_bases_many()`
  Prepare `paint_down_to_common()` for handling shallow commits
  commit-reach: start reporting errors in `paint_down_to_common()`
  merge_bases_many(): pass on errors from `paint_down_to_common()`
  get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases(): pass on errors from `merge_bases_many()`
  get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
  repo_get_merge_bases_many_dirty(): pass on errors from
    `merge_bases_many()`

 bisect.c                         |   7 +-
 builtin/branch.c                 |  12 +-
 builtin/fast-import.c            |   6 +-
 builtin/fetch.c                  |   2 +
 builtin/log.c                    |  30 +++--
 builtin/merge-base.c             |  23 +++-
 builtin/merge-tree.c             |   5 +-
 builtin/merge.c                  |  26 ++--
 builtin/pull.c                   |   9 +-
 builtin/rebase.c                 |   8 +-
 builtin/receive-pack.c           |   6 +-
 builtin/rev-parse.c              |   5 +-
 commit-reach.c                   | 209 ++++++++++++++++++++-----------
 commit-reach.h                   |  26 ++--
 commit.c                         |   7 +-
 diff-lib.c                       |   5 +-
 http-push.c                      |   5 +-
 log-tree.c                       |   5 +-
 merge-ort.c                      |  87 +++++++++++--
 merge-recursive.c                |  58 +++++++--
 notes-merge.c                    |   3 +-
 object-name.c                    |   7 +-
 remote.c                         |   2 +-
 revision.c                       |  12 +-
 sequencer.c                      |   8 +-
 shallow.c                        |  21 ++--
 submodule.c                      |   7 +-
 t/helper/test-reach.c            |  11 +-
 t/t4301-merge-tree-write-tree.sh |  12 ++
 29 files changed, 441 insertions(+), 183 deletions(-)


base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1657%2Fdscho%2Fmerge-base-and-missing-objects-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1657/dscho/merge-base-and-missing-objects-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1657

Range-diff vs v2:

  1:  6e4e409cd43 =  1:  6e4e409cd43 paint_down_to_common: plug two memory leaks
  2:  5c16becfb9b !  2:  f68d77c6123 Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
     @@ Commit message
          passed to `repo_in_merge_bases_many()` to trigger this behavior, and use
          it in the two callers in `shallow.c`.
      
     -    This does not change behavior in this commit, but prepares in an
     -    easy-to-review way for the upcoming changes that will make the merge
     -    base logic more stringent with regards to missing commit objects.
     +    This commit changes behavior slightly: unless called from the
     +    `shallow.c` functions that set the `ignore_missing_commits` bit, any
     +    non-existing tip commit that is passed to `repo_in_merge_bases_many()`
     +    will now result in an error.
     +
     +    Note: When encountering missing commits while traversing the commit
     +    history in search for merge bases, with this commit there won't be a
     +    change in behavior just yet, their children will still be interpreted as
     +    root commits. This bug will get fixed by follow-up commits.
      
          Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      
     @@ commit-reach.c: int repo_is_descendant_of(struct repository *r,
       {
       	struct commit_list *bases;
       	int ret = 0, i;
     + 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
     + 
     + 	if (repo_parse_commit(r, commit))
     +-		return ret;
     ++		return ignore_missing_commits ? 0 : -1;
     + 	for (i = 0; i < nr_reference; i++) {
     + 		if (repo_parse_commit(r, reference[i]))
     +-			return ret;
     ++			return ignore_missing_commits ? 0 : -1;
     + 
     + 		generation = commit_graph_generation(reference[i]);
     + 		if (generation > max_generation)
      
       ## commit-reach.h ##
      @@ commit-reach.h: int repo_in_merge_bases(struct repository *r,
  3:  4dd214f91d4 !  3:  0aaa224b5db Start reporting missing commits in `repo_in_merge_bases_many()`
     @@ commit-reach.c: int repo_is_descendant_of(struct repository *r,
       		}
       		return 0;
       	}
     -@@ commit-reach.c: int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
     - 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
     - 
     - 	if (repo_parse_commit(r, commit))
     --		return ret;
     -+		return ignore_missing_commits ? 0 : -1;
     - 	for (i = 0; i < nr_reference; i++) {
     - 		if (repo_parse_commit(r, reference[i]))
     --			return ret;
     -+			return ignore_missing_commits ? 0 : -1;
     - 
     - 		generation = commit_graph_generation(reference[i]);
     - 		if (generation > max_generation)
      @@ commit-reach.c: int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid)
       	commit_list_insert(old_commit, &old_commit_list);
       	ret = repo_is_descendant_of(the_repository,
  4:  53bdeddb4cb =  4:  84e7fbc07e0 Prepare `paint_down_to_common()` for handling shallow commits
  5:  ec3ebf0ed17 =  5:  85332b58c37 commit-reach: start reporting errors in `paint_down_to_common()`
  6:  05756fbf71a =  6:  2ae6a54dd59 merge_bases_many(): pass on errors from `paint_down_to_common()`
  7:  e3d37a326e5 =  7:  4321795102d get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
  8:  9ca504525b9 =  8:  35545c4b777 repo_get_merge_bases(): pass on errors from `merge_bases_many()`
  9:  b11879edb73 =  9:  a963058d2ba get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
 10:  602a7383f72 = 10:  c3bed9c8500 repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
 11:  96850ed2d69 = 11:  bdbf47ae505 repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v3 01/11] paint_down_to_common: plug two memory leaks
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits Johannes Schindelin via GitGitGadget
                       ` (10 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When a commit is missing, we return early (currently pretending that no
merge basis could be found in that case). At that stage, it is possible
that a merge base could have been found already, and added to the
`result`, which is now leaked.

The priority queue has a similar issue: There might still be a commit in
that queue.

Let's release both, to address the potential memory leaks.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/commit-reach.c b/commit-reach.c
index a868a575ea1..7ea916f9ebd 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -105,8 +105,11 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 			parents = parents->next;
 			if ((p->object.flags & flags) == flags)
 				continue;
-			if (repo_parse_commit(r, p))
+			if (repo_parse_commit(r, p)) {
+				clear_prio_queue(&queue);
+				free_commit_list(result);
 				return NULL;
+			}
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
 		}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
                       ` (9 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Currently this function treats unrelated commit histories the same way
as commit histories with missing commit objects.

Typically, missing commit objects constitute a corrupt repository,
though, and should be reported as such. The next commits will make it
so, but there is one exception: In `git fetch --update-shallow` we
_expect_ commit objects to be missing, and we do want to treat the
now-incomplete commit histories as unrelated.

To allow for that, let's introduce an additional parameter that is
passed to `repo_in_merge_bases_many()` to trigger this behavior, and use
it in the two callers in `shallow.c`.

This commit changes behavior slightly: unless called from the
`shallow.c` functions that set the `ignore_missing_commits` bit, any
non-existing tip commit that is passed to `repo_in_merge_bases_many()`
will now result in an error.

Note: When encountering missing commits while traversing the commit
history in search for merge bases, with this commit there won't be a
change in behavior just yet, their children will still be interpreted as
root commits. This bug will get fixed by follow-up commits.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c        | 9 +++++----
 commit-reach.h        | 3 ++-
 remote.c              | 2 +-
 shallow.c             | 5 +++--
 t/helper/test-reach.c | 2 +-
 5 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 7ea916f9ebd..5c1b5256598 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -467,7 +467,7 @@ int repo_is_descendant_of(struct repository *r,
 
 			other = with_commit->item;
 			with_commit = with_commit->next;
-			if (repo_in_merge_bases_many(r, other, 1, &commit))
+			if (repo_in_merge_bases_many(r, other, 1, &commit, 0))
 				return 1;
 		}
 		return 0;
@@ -478,17 +478,18 @@ int repo_is_descendant_of(struct repository *r,
  * Is "commit" an ancestor of one of the "references"?
  */
 int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
-			     int nr_reference, struct commit **reference)
+			     int nr_reference, struct commit **reference,
+			     int ignore_missing_commits)
 {
 	struct commit_list *bases;
 	int ret = 0, i;
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
 	if (repo_parse_commit(r, commit))
-		return ret;
+		return ignore_missing_commits ? 0 : -1;
 	for (i = 0; i < nr_reference; i++) {
 		if (repo_parse_commit(r, reference[i]))
-			return ret;
+			return ignore_missing_commits ? 0 : -1;
 
 		generation = commit_graph_generation(reference[i]);
 		if (generation > max_generation)
diff --git a/commit-reach.h b/commit-reach.h
index 35c4da49481..68f81549a44 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -30,7 +30,8 @@ int repo_in_merge_bases(struct repository *r,
 			struct commit *reference);
 int repo_in_merge_bases_many(struct repository *r,
 			     struct commit *commit,
-			     int nr_reference, struct commit **reference);
+			     int nr_reference, struct commit **reference,
+			     int ignore_missing_commits);
 
 /*
  * Takes a list of commits and returns a new list where those
diff --git a/remote.c b/remote.c
index abb24822beb..763c80f4a7d 100644
--- a/remote.c
+++ b/remote.c
@@ -2675,7 +2675,7 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
 		if (MERGE_BASES_BATCH_SIZE < size)
 			size = MERGE_BASES_BATCH_SIZE;
 
-		if ((ret = repo_in_merge_bases_many(the_repository, commit, size, chunk)))
+		if ((ret = repo_in_merge_bases_many(the_repository, commit, size, chunk, 0)))
 			break;
 	}
 
diff --git a/shallow.c b/shallow.c
index ac728cdd778..dfcc1f86a7f 100644
--- a/shallow.c
+++ b/shallow.c
@@ -797,7 +797,7 @@ static void post_assign_shallow(struct shallow_info *info,
 		for (j = 0; j < bitmap_nr; j++)
 			if (bitmap[0][j] &&
 			    /* Step 7, reachability test at commit level */
-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits)) {
+			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1)) {
 				update_refstatus(ref_status, info->ref->nr, *bitmap);
 				dst++;
 				break;
@@ -828,7 +828,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 		si->reachable[c] = repo_in_merge_bases_many(the_repository,
 							    commit,
 							    si->nr_commits,
-							    si->commits);
+							    si->commits,
+							    1);
 		si->need_reachability_test[c] = 0;
 	}
 	return si->reachable[c];
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 3e173399a00..aa816e168ea 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -113,7 +113,7 @@ int cmd__reach(int ac, const char **av)
 		       repo_in_merge_bases(the_repository, A, B));
 	else if (!strcmp(av[1], "in_merge_bases_many"))
 		printf("%s(A,X):%d\n", av[1],
-		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array));
+		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array, 0));
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 03/11] Start reporting missing commits in `repo_in_merge_bases_many()`
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
                       ` (8 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Some functions in Git's source code follow the convention that returning
a negative value indicates a fatal error, e.g. repository corruption.

Let's use this convention in `repo_in_merge_bases()` to report when one
of the specified commits is missing (i.e. when `repo_parse_commit()`
reports an error).

Also adjust the callers of `repo_in_merge_bases()` to handle such
negative return values.

Note: As of this patch, errors are returned only if any of the specified
merge heads is missing. Over the course of the next patches, missing
commits will also be reported by the `paint_down_to_common()` function,
which is called by `repo_in_merge_bases_many()`, and those errors will
be properly propagated back to the caller at that stage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/branch.c       | 12 +++++--
 builtin/fast-import.c  |  6 +++-
 builtin/fetch.c        |  2 ++
 builtin/log.c          |  7 ++--
 builtin/merge-base.c   |  6 +++-
 builtin/pull.c         |  4 +++
 builtin/receive-pack.c |  6 +++-
 commit-reach.c         |  8 +++--
 http-push.c            |  5 ++-
 merge-ort.c            | 81 ++++++++++++++++++++++++++++++++++++------
 merge-recursive.c      | 54 +++++++++++++++++++++++-----
 shallow.c              | 18 ++++++----
 12 files changed, 173 insertions(+), 36 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index e7ee9bd0f15..7f9e79237f3 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -161,6 +161,8 @@ static int branch_merged(int kind, const char *name,
 
 	merged = reference_rev ? repo_in_merge_bases(the_repository, rev,
 						     reference_rev) : 0;
+	if (merged < 0)
+		exit(128);
 
 	/*
 	 * After the safety valve is fully redefined to "check with
@@ -169,9 +171,13 @@ static int branch_merged(int kind, const char *name,
 	 * any of the following code, but during the transition period,
 	 * a gentle reminder is in order.
 	 */
-	if ((head_rev != reference_rev) &&
-	    (head_rev ? repo_in_merge_bases(the_repository, rev, head_rev) : 0) != merged) {
-		if (merged)
+	if (head_rev != reference_rev) {
+		int expect = head_rev ? repo_in_merge_bases(the_repository, rev, head_rev) : 0;
+		if (expect < 0)
+			exit(128);
+		if (expect == merged)
+			; /* okay */
+		else if (merged)
 			warning(_("deleting branch '%s' that has been merged to\n"
 				"         '%s', but not yet merged to HEAD"),
 				name, reference_name);
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 444f41cf8ca..14c2efa88fc 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1625,6 +1625,7 @@ static int update_branch(struct branch *b)
 		oidclr(&old_oid);
 	if (!force_update && !is_null_oid(&old_oid)) {
 		struct commit *old_cmit, *new_cmit;
+		int ret;
 
 		old_cmit = lookup_commit_reference_gently(the_repository,
 							  &old_oid, 0);
@@ -1633,7 +1634,10 @@ static int update_branch(struct branch *b)
 		if (!old_cmit || !new_cmit)
 			return error("Branch %s is missing commits.", b->name);
 
-		if (!repo_in_merge_bases(the_repository, old_cmit, new_cmit)) {
+		ret = repo_in_merge_bases(the_repository, old_cmit, new_cmit);
+		if (ret < 0)
+			exit(128);
+		if (!ret) {
 			warning("Not updating %s"
 				" (new tip %s does not contain %s)",
 				b->name, oid_to_hex(&b->oid),
diff --git a/builtin/fetch.c b/builtin/fetch.c
index fd134ba74d9..0584a1f8b64 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -978,6 +978,8 @@ static int update_local_ref(struct ref *ref,
 		uint64_t t_before = getnanotime();
 		fast_forward = repo_in_merge_bases(the_repository, current,
 						   updated);
+		if (fast_forward < 0)
+			exit(128);
 		forced_updates_ms += (getnanotime() - t_before) / 1000000;
 	} else {
 		fast_forward = 1;
diff --git a/builtin/log.c b/builtin/log.c
index ba775d7b5cf..1705da71aca 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1623,7 +1623,7 @@ static struct commit *get_base_commit(const char *base_commit,
 {
 	struct commit *base = NULL;
 	struct commit **rev;
-	int i = 0, rev_nr = 0, auto_select, die_on_failure;
+	int i = 0, rev_nr = 0, auto_select, die_on_failure, ret;
 
 	switch (auto_base) {
 	case AUTO_BASE_NEVER:
@@ -1723,7 +1723,10 @@ static struct commit *get_base_commit(const char *base_commit,
 		rev_nr = DIV_ROUND_UP(rev_nr, 2);
 	}
 
-	if (!repo_in_merge_bases(the_repository, base, rev[0])) {
+	ret = repo_in_merge_bases(the_repository, base, rev[0]);
+	if (ret < 0)
+		exit(128);
+	if (!ret) {
 		if (die_on_failure) {
 			die(_("base commit should be the ancestor of revision list"));
 		} else {
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index e68b7fe45d7..0308fd73289 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -103,12 +103,16 @@ static int handle_octopus(int count, const char **args, int show_all)
 static int handle_is_ancestor(int argc, const char **argv)
 {
 	struct commit *one, *two;
+	int ret;
 
 	if (argc != 2)
 		die("--is-ancestor takes exactly two commits");
 	one = get_commit_reference(argv[0]);
 	two = get_commit_reference(argv[1]);
-	if (repo_in_merge_bases(the_repository, one, two))
+	ret = repo_in_merge_bases(the_repository, one, two);
+	if (ret < 0)
+		exit(128);
+	if (ret)
 		return 0;
 	else
 		return 1;
diff --git a/builtin/pull.c b/builtin/pull.c
index be2b2c9ebc9..e6f2942c0c5 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -931,6 +931,8 @@ static int get_can_ff(struct object_id *orig_head,
 	merge_head = lookup_commit_reference(the_repository, orig_merge_head);
 	ret = repo_is_descendant_of(the_repository, merge_head, list);
 	free_commit_list(list);
+	if (ret < 0)
+		exit(128);
 	return ret;
 }
 
@@ -955,6 +957,8 @@ static int already_up_to_date(struct object_id *orig_head,
 		commit_list_insert(theirs, &list);
 		ok = repo_is_descendant_of(the_repository, ours, list);
 		free_commit_list(list);
+		if (ok < 0)
+			exit(128);
 		if (!ok)
 			return 0;
 	}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8c4f0cb90a9..956fea6293e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1546,6 +1546,7 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 	    starts_with(name, "refs/heads/")) {
 		struct object *old_object, *new_object;
 		struct commit *old_commit, *new_commit;
+		int ret2;
 
 		old_object = parse_object(the_repository, old_oid);
 		new_object = parse_object(the_repository, new_oid);
@@ -1559,7 +1560,10 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 		}
 		old_commit = (struct commit *)old_object;
 		new_commit = (struct commit *)new_object;
-		if (!repo_in_merge_bases(the_repository, old_commit, new_commit)) {
+		ret2 = repo_in_merge_bases(the_repository, old_commit, new_commit);
+		if (ret2 < 0)
+			exit(128);
+		if (!ret2) {
 			rp_error("denying non-fast-forward %s"
 				 " (you should pull first)", name);
 			ret = "non-fast-forward";
diff --git a/commit-reach.c b/commit-reach.c
index 5c1b5256598..5ff71d72d51 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -464,11 +464,13 @@ int repo_is_descendant_of(struct repository *r,
 	} else {
 		while (with_commit) {
 			struct commit *other;
+			int ret;
 
 			other = with_commit->item;
 			with_commit = with_commit->next;
-			if (repo_in_merge_bases_many(r, other, 1, &commit, 0))
-				return 1;
+			ret = repo_in_merge_bases_many(r, other, 1, &commit, 0);
+			if (ret)
+				return ret;
 		}
 		return 0;
 	}
@@ -598,6 +600,8 @@ int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid)
 	commit_list_insert(old_commit, &old_commit_list);
 	ret = repo_is_descendant_of(the_repository,
 				    new_commit, old_commit_list);
+	if (ret < 0)
+		exit(128);
 	free_commit_list(old_commit_list);
 	return ret;
 }
diff --git a/http-push.c b/http-push.c
index a704f490fdb..24c16a4f5ff 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1576,8 +1576,11 @@ static int verify_merge_base(struct object_id *head_oid, struct ref *remote)
 	struct commit *head = lookup_commit_or_die(head_oid, "HEAD");
 	struct commit *branch = lookup_commit_or_die(&remote->old_oid,
 						     remote->name);
+	int ret = repo_in_merge_bases(the_repository, branch, head);
 
-	return repo_in_merge_bases(the_repository, branch, head);
+	if (ret < 0)
+		exit(128);
+	return ret;
 }
 
 static int delete_remote_branch(const char *pattern, int force)
diff --git a/merge-ort.c b/merge-ort.c
index 6491070d965..9f3af46333a 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -544,6 +544,7 @@ enum conflict_and_info_types {
 	CONFLICT_SUBMODULE_HISTORY_NOT_AVAILABLE,
 	CONFLICT_SUBMODULE_MAY_HAVE_REWINDS,
 	CONFLICT_SUBMODULE_NULL_MERGE_BASE,
+	CONFLICT_SUBMODULE_CORRUPT,
 
 	/* Keep this entry _last_ in the list */
 	NB_CONFLICT_TYPES,
@@ -596,7 +597,9 @@ static const char *type_short_descriptions[] = {
 	[CONFLICT_SUBMODULE_MAY_HAVE_REWINDS] =
 		"CONFLICT (submodule may have rewinds)",
 	[CONFLICT_SUBMODULE_NULL_MERGE_BASE] =
-		"CONFLICT (submodule lacks merge base)"
+		"CONFLICT (submodule lacks merge base)",
+	[CONFLICT_SUBMODULE_CORRUPT] =
+		"CONFLICT (submodule corrupt)"
 };
 
 struct logical_conflict_info {
@@ -1710,7 +1713,14 @@ static int find_first_merges(struct repository *repo,
 		die("revision walk setup failed");
 	while ((commit = get_revision(&revs)) != NULL) {
 		struct object *o = &(commit->object);
-		if (repo_in_merge_bases(repo, b, commit))
+		int ret = repo_in_merge_bases(repo, b, commit);
+
+		if (ret < 0) {
+			object_array_clear(&merges);
+			release_revisions(&revs);
+			return ret;
+		}
+		if (ret > 0)
 			add_object_array(o, NULL, &merges);
 	}
 	reset_revision_walk();
@@ -1725,9 +1735,17 @@ static int find_first_merges(struct repository *repo,
 		contains_another = 0;
 		for (j = 0; j < merges.nr; j++) {
 			struct commit *m2 = (struct commit *) merges.objects[j].item;
-			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
-				contains_another = 1;
-				break;
+			if (i != j) {
+				int ret = repo_in_merge_bases(repo, m2, m1);
+				if (ret < 0) {
+					object_array_clear(&merges);
+					release_revisions(&revs);
+					return ret;
+				}
+				if (ret > 0) {
+					contains_another = 1;
+					break;
+				}
 			}
 		}
 
@@ -1749,7 +1767,7 @@ static int merge_submodule(struct merge_options *opt,
 {
 	struct repository subrepo;
 	struct strbuf sb = STRBUF_INIT;
-	int ret = 0;
+	int ret = 0, ret2;
 	struct commit *commit_o, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
@@ -1796,8 +1814,26 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* check whether both changes are forward */
-	if (!repo_in_merge_bases(&subrepo, commit_o, commit_a) ||
-	    !repo_in_merge_bases(&subrepo, commit_o, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_a);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0)
+		ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_b);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (!ret2) {
 		path_msg(opt, CONFLICT_SUBMODULE_MAY_HAVE_REWINDS, 0,
 			 path, NULL, NULL, NULL,
 			 _("Failed to merge submodule %s "
@@ -1807,7 +1843,16 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* Case #1: a is contained in b or vice versa */
-	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0) {
 		oidcpy(result, b);
 		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
 			 path, NULL, NULL, NULL,
@@ -1816,7 +1861,16 @@ static int merge_submodule(struct merge_options *opt,
 		ret = 1;
 		goto cleanup;
 	}
-	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0) {
 		oidcpy(result, a);
 		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
 			 path, NULL, NULL, NULL,
@@ -1841,6 +1895,13 @@ static int merge_submodule(struct merge_options *opt,
 	parent_count = find_first_merges(&subrepo, path, commit_a, commit_b,
 					 &merges);
 	switch (parent_count) {
+	case -1:
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		break;
 	case 0:
 		path_msg(opt, CONFLICT_SUBMODULE_FAILED_TO_MERGE, 0,
 			 path, NULL, NULL, NULL,
diff --git a/merge-recursive.c b/merge-recursive.c
index e3beb0801b1..0d931cc14ad 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1144,7 +1144,13 @@ static int find_first_merges(struct repository *repo,
 		die("revision walk setup failed");
 	while ((commit = get_revision(&revs)) != NULL) {
 		struct object *o = &(commit->object);
-		if (repo_in_merge_bases(repo, b, commit))
+		int ret = repo_in_merge_bases(repo, b, commit);
+		if (ret < 0) {
+			object_array_clear(&merges);
+			release_revisions(&revs);
+			return ret;
+		}
+		if (ret)
 			add_object_array(o, NULL, &merges);
 	}
 	reset_revision_walk();
@@ -1159,9 +1165,17 @@ static int find_first_merges(struct repository *repo,
 		contains_another = 0;
 		for (j = 0; j < merges.nr; j++) {
 			struct commit *m2 = (struct commit *) merges.objects[j].item;
-			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
-				contains_another = 1;
-				break;
+			if (i != j) {
+				int ret = repo_in_merge_bases(repo, m2, m1);
+				if (ret < 0) {
+					object_array_clear(&merges);
+					release_revisions(&revs);
+					return ret;
+				}
+				if (ret > 0) {
+					contains_another = 1;
+					break;
+				}
 			}
 		}
 
@@ -1197,7 +1211,7 @@ static int merge_submodule(struct merge_options *opt,
 			   const struct object_id *b)
 {
 	struct repository subrepo;
-	int ret = 0;
+	int ret = 0, ret2;
 	struct commit *commit_base, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
@@ -1234,14 +1248,29 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* check whether both changes are forward */
-	if (!repo_in_merge_bases(&subrepo, commit_base, commit_a) ||
-	    !repo_in_merge_bases(&subrepo, commit_base, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_a);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2 > 0)
+		ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_b);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (!ret2) {
 		output(opt, 1, _("Failed to merge submodule %s (commits don't follow merge-base)"), path);
 		goto cleanup;
 	}
 
 	/* Case #1: a is contained in b or vice versa */
-	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2) {
 		oidcpy(result, b);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
@@ -1254,7 +1283,12 @@ static int merge_submodule(struct merge_options *opt,
 		ret = 1;
 		goto cleanup;
 	}
-	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2) {
 		oidcpy(result, a);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
@@ -1402,6 +1436,8 @@ static int merge_mode_and_contents(struct merge_options *opt,
 							&o->oid,
 							&a->oid,
 							&b->oid);
+			if (result->clean < 0)
+				return -1;
 		} else if (S_ISLNK(a->mode)) {
 			switch (opt->recursive_variant) {
 			case MERGE_VARIANT_NORMAL:
diff --git a/shallow.c b/shallow.c
index dfcc1f86a7f..f71496f35c3 100644
--- a/shallow.c
+++ b/shallow.c
@@ -795,12 +795,16 @@ static void post_assign_shallow(struct shallow_info *info,
 		if (!*bitmap)
 			continue;
 		for (j = 0; j < bitmap_nr; j++)
-			if (bitmap[0][j] &&
-			    /* Step 7, reachability test at commit level */
-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1)) {
-				update_refstatus(ref_status, info->ref->nr, *bitmap);
-				dst++;
-				break;
+			if (bitmap[0][j]) {
+				/* Step 7, reachability test at commit level */
+				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1);
+				if (ret < 0)
+					exit(128);
+				if (!ret) {
+					update_refstatus(ref_status, info->ref->nr, *bitmap);
+					dst++;
+					break;
+				}
 			}
 	}
 	info->nr_ours = dst;
@@ -830,6 +834,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 							    si->nr_commits,
 							    si->commits,
 							    1);
+		if (si->reachable[c] < 0)
+			exit(128);
 		si->need_reachability_test[c] = 0;
 	}
 	return si->reachable[c];
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (2 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 14:46       ` Dirk Gouders
  2024-02-27 13:28     ` [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
                       ` (7 subsequent siblings)
  11 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When `git fetch --update-shallow` needs to test for commit ancestry, it
can naturally run into a missing object (e.g. if it is a parent of a
shallow commit). To accommodate, the merge base logic will need to be
able to handle this situation gracefully.

Currently, that logic pretends that a missing parent commit is
equivalent to a missing parent commit, and for the purpose of
`--update-shallow` that is exactly what we need it to do.

Therefore, let's introduce a flag to indicate when that is precisely the
logic we want.

We need a flag, and cannot rely on `is_repository_shallow()` to indicate
that situation, because that function would return 0: There may not
actually be a `shallow` file, as demonstrated e.g. by t5537.10 ("add new
shallow root with receive.updateshallow on") and t5538.4 ("add new
shallow root with receive.updateshallow on").

Note: shallow commits' parents are set to `NULL` internally already,
therefore there is no need to special-case shallow repositories here, as
the merge-base logic will not try to access parent commits of shallow
commits.

Likewise, partial clones aren't an issue either: If a commit is missing
during the revision walk in the merge-base logic, it is fetched via
`promisor_remote_get_direct()`. And not only the single missing commit
object: Due to the way the "promised" objects are fetched (in
`fetch_objects()` in `promisor-remote.c`, using `fetch
--filter=blob:none`), there is no actual way to fetch a single commit
object, as the remote side will pass that commit OID to `pack-objects
--revs [...]` which in turn passes it to `rev-list` which interprets
this as a commit _range_ instead of a single object. Therefore, in
partial clones (unless they are shallow in addition), all commits
reachable from a commit that is in the local object database are also
present in that local database.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 5ff71d72d51..7112b10eeea 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -53,7 +53,8 @@ static int queue_has_nonstale(struct prio_queue *queue)
 static struct commit_list *paint_down_to_common(struct repository *r,
 						struct commit *one, int n,
 						struct commit **twos,
-						timestamp_t min_generation)
+						timestamp_t min_generation,
+						int ignore_missing_commits)
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *result = NULL;
@@ -108,6 +109,13 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 			if (repo_parse_commit(r, p)) {
 				clear_prio_queue(&queue);
 				free_commit_list(result);
+				/*
+				 * At this stage, we know that the commit is
+				 * missing: `repo_parse_commit()` uses
+				 * `OBJECT_INFO_DIE_IF_CORRUPT` and therefore
+				 * corrupt commits would already have been
+				 * dispatched with a `die()`.
+				 */
 				return NULL;
 			}
 			p->object.flags |= flags;
@@ -143,7 +151,7 @@ static struct commit_list *merge_bases_many(struct repository *r,
 			return NULL;
 	}
 
-	list = paint_down_to_common(r, one, n, twos, 0);
+	list = paint_down_to_common(r, one, n, twos, 0, 0);
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -214,7 +222,7 @@ static int remove_redundant_no_gen(struct repository *r,
 				min_generation = curr_generation;
 		}
 		common = paint_down_to_common(r, array[i], filled,
-					      work, min_generation);
+					      work, min_generation, 0);
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -504,7 +512,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 
 	bases = paint_down_to_common(r, commit,
 				     nr_reference, reference,
-				     generation);
+				     generation, ignore_missing_commits);
 	if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()`
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (3 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 14:56       ` Dirk Gouders
  2024-02-27 13:28     ` [PATCH v3 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
                       ` (6 subsequent siblings)
  11 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

If a commit cannot be parsed, it is currently ignored when looking for
merge bases. That's undesirable as the operation can pretend success in
a corrupt repository, even though the command should fail with an error
message.

Let's start at the bottom of the stack by teaching the
`paint_down_to_common()` function to return an `int`: if negative, it
indicates fatal error, if 0 success.

This requires a couple of callers to be adjusted accordingly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 66 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 21 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 7112b10eeea..9148a7dcbc0 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -50,14 +50,14 @@ static int queue_has_nonstale(struct prio_queue *queue)
 }
 
 /* all input commits in one and twos[] must have been parsed! */
-static struct commit_list *paint_down_to_common(struct repository *r,
-						struct commit *one, int n,
-						struct commit **twos,
-						timestamp_t min_generation,
-						int ignore_missing_commits)
+static int paint_down_to_common(struct repository *r,
+				struct commit *one, int n,
+				struct commit **twos,
+				timestamp_t min_generation,
+				int ignore_missing_commits,
+				struct commit_list **result)
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
-	struct commit_list *result = NULL;
 	int i;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
 
@@ -66,8 +66,8 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 
 	one->object.flags |= PARENT1;
 	if (!n) {
-		commit_list_append(one, &result);
-		return result;
+		commit_list_append(one, result);
+		return 0;
 	}
 	prio_queue_put(&queue, one);
 
@@ -95,7 +95,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 		if (flags == (PARENT1 | PARENT2)) {
 			if (!(commit->object.flags & RESULT)) {
 				commit->object.flags |= RESULT;
-				commit_list_insert_by_date(commit, &result);
+				commit_list_insert_by_date(commit, result);
 			}
 			/* Mark parents of a found merge stale */
 			flags |= STALE;
@@ -108,7 +108,8 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 				continue;
 			if (repo_parse_commit(r, p)) {
 				clear_prio_queue(&queue);
-				free_commit_list(result);
+				free_commit_list(*result);
+				*result = NULL;
 				/*
 				 * At this stage, we know that the commit is
 				 * missing: `repo_parse_commit()` uses
@@ -116,7 +117,10 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 				 * corrupt commits would already have been
 				 * dispatched with a `die()`.
 				 */
-				return NULL;
+				if (ignore_missing_commits)
+					return 0;
+				return error(_("could not parse commit %s"),
+					     oid_to_hex(&p->object.oid));
 			}
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
@@ -124,7 +128,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 	}
 
 	clear_prio_queue(&queue);
-	return result;
+	return 0;
 }
 
 static struct commit_list *merge_bases_many(struct repository *r,
@@ -151,7 +155,10 @@ static struct commit_list *merge_bases_many(struct repository *r,
 			return NULL;
 	}
 
-	list = paint_down_to_common(r, one, n, twos, 0, 0);
+	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
+		free_commit_list(list);
+		return NULL;
+	}
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -205,7 +212,7 @@ static int remove_redundant_no_gen(struct repository *r,
 	for (i = 0; i < cnt; i++)
 		repo_parse_commit(r, array[i]);
 	for (i = 0; i < cnt; i++) {
-		struct commit_list *common;
+		struct commit_list *common = NULL;
 		timestamp_t min_generation = commit_graph_generation(array[i]);
 
 		if (redundant[i])
@@ -221,8 +228,16 @@ static int remove_redundant_no_gen(struct repository *r,
 			if (curr_generation < min_generation)
 				min_generation = curr_generation;
 		}
-		common = paint_down_to_common(r, array[i], filled,
-					      work, min_generation, 0);
+		if (paint_down_to_common(r, array[i], filled,
+					 work, min_generation, 0, &common)) {
+			clear_commit_marks(array[i], all_flags);
+			clear_commit_marks_many(filled, work, all_flags);
+			free_commit_list(common);
+			free(work);
+			free(redundant);
+			free(filled_index);
+			return -1;
+		}
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -422,6 +437,10 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	clear_commit_marks_many(n, twos, all_flags);
 
 	cnt = remove_redundant(r, rslt, cnt);
+	if (cnt < 0) {
+		free(rslt);
+		return NULL;
+	}
 	result = NULL;
 	for (i = 0; i < cnt; i++)
 		commit_list_insert_by_date(rslt[i], &result);
@@ -491,7 +510,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 			     int nr_reference, struct commit **reference,
 			     int ignore_missing_commits)
 {
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	int ret = 0, i;
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
@@ -510,10 +529,11 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 	if (generation > max_generation)
 		return ret;
 
-	bases = paint_down_to_common(r, commit,
-				     nr_reference, reference,
-				     generation, ignore_missing_commits);
-	if (commit->object.flags & PARENT2)
+	if (paint_down_to_common(r, commit,
+				 nr_reference, reference,
+				 generation, ignore_missing_commits, &bases))
+		ret = -1;
+	else if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
 	clear_commit_marks_many(nr_reference, reference, all_flags);
@@ -566,6 +586,10 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 		}
 	}
 	num_head = remove_redundant(the_repository, array, num_head);
+	if (num_head < 0) {
+		free(array);
+		return NULL;
+	}
 	for (i = 0; i < num_head; i++)
 		tail = &commit_list_insert(array[i], tail)->next;
 	free(array);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()`
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (4 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 18:29       ` Junio C Hamano
  2024-02-27 13:28     ` [PATCH v3 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
                       ` (5 subsequent siblings)
  11 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `paint_down_to_common()` function was just taught to indicate
parsing errors, and now the `merge_bases_many()` function is aware of
that, too.

One tricky aspect is that `merge_bases_many()` parses commits of its
own, but wants to gracefully handle the scenario where NULL is passed as
a merge head, returning the empty list of merge bases. The way this was
handled involved calling `repo_parse_commit(NULL)` and relying on it to
return an error. This has to be done differently now so that we can
handle missing commits correctly by producing a fatal error.

Next step: adjust the caller of `merge_bases_many()`:
`get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 9148a7dcbc0..2c74583c8e0 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -131,41 +131,49 @@ static int paint_down_to_common(struct repository *r,
 	return 0;
 }
 
-static struct commit_list *merge_bases_many(struct repository *r,
-					    struct commit *one, int n,
-					    struct commit **twos)
+static int merge_bases_many(struct repository *r,
+			    struct commit *one, int n,
+			    struct commit **twos,
+			    struct commit_list **result)
 {
 	struct commit_list *list = NULL;
-	struct commit_list *result = NULL;
 	int i;
 
 	for (i = 0; i < n; i++) {
-		if (one == twos[i])
+		if (one == twos[i]) {
 			/*
 			 * We do not mark this even with RESULT so we do not
 			 * have to clean it up.
 			 */
-			return commit_list_insert(one, &result);
+			*result = commit_list_insert(one, result);
+			return 0;
+		}
 	}
 
+	if (!one)
+		return 0;
 	if (repo_parse_commit(r, one))
-		return NULL;
+		return error(_("could not parse commit %s"),
+			     oid_to_hex(&one->object.oid));
 	for (i = 0; i < n; i++) {
+		if (!twos[i])
+			return 0;
 		if (repo_parse_commit(r, twos[i]))
-			return NULL;
+			return error(_("could not parse commit %s"),
+				     oid_to_hex(&twos[i]->object.oid));
 	}
 
 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
 		free_commit_list(list);
-		return NULL;
+		return -1;
 	}
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
 		if (!(commit->object.flags & STALE))
-			commit_list_insert_by_date(commit, &result);
+			commit_list_insert_by_date(commit, result);
 	}
-	return result;
+	return 0;
 }
 
 struct commit_list *get_octopus_merge_bases(struct commit_list *in)
@@ -410,10 +418,11 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 {
 	struct commit_list *list;
 	struct commit **rslt;
-	struct commit_list *result;
+	struct commit_list *result = NULL;
 	int cnt, i;
 
-	result = merge_bases_many(r, one, n, twos);
+	if (merge_bases_many(r, one, n, twos, &result) < 0)
+		return NULL;
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
 			return result;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (5 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 08/11] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
                       ` (4 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate
parsing errors, and now the `get_merge_bases_many_0()` function is aware
of that, too.

Next step: adjust the callers of `get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 57 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 2c74583c8e0..5fa0abc7d1e 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -410,37 +410,38 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt
 	return remove_redundant_no_gen(r, array, cnt);
 }
 
-static struct commit_list *get_merge_bases_many_0(struct repository *r,
-						  struct commit *one,
-						  int n,
-						  struct commit **twos,
-						  int cleanup)
+static int get_merge_bases_many_0(struct repository *r,
+				  struct commit *one,
+				  int n,
+				  struct commit **twos,
+				  int cleanup,
+				  struct commit_list **result)
 {
 	struct commit_list *list;
 	struct commit **rslt;
-	struct commit_list *result = NULL;
 	int cnt, i;
 
-	if (merge_bases_many(r, one, n, twos, &result) < 0)
-		return NULL;
+	if (merge_bases_many(r, one, n, twos, result) < 0)
+		return -1;
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
-			return result;
+			return 0;
 	}
-	if (!result || !result->next) {
+	if (!*result || !(*result)->next) {
 		if (cleanup) {
 			clear_commit_marks(one, all_flags);
 			clear_commit_marks_many(n, twos, all_flags);
 		}
-		return result;
+		return 0;
 	}
 
 	/* There are more than one */
-	cnt = commit_list_count(result);
+	cnt = commit_list_count(*result);
 	CALLOC_ARRAY(rslt, cnt);
-	for (list = result, i = 0; list; list = list->next)
+	for (list = *result, i = 0; list; list = list->next)
 		rslt[i++] = list->item;
-	free_commit_list(result);
+	free_commit_list(*result);
+	*result = NULL;
 
 	clear_commit_marks(one, all_flags);
 	clear_commit_marks_many(n, twos, all_flags);
@@ -448,13 +449,12 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	cnt = remove_redundant(r, rslt, cnt);
 	if (cnt < 0) {
 		free(rslt);
-		return NULL;
+		return -1;
 	}
-	result = NULL;
 	for (i = 0; i < cnt; i++)
-		commit_list_insert_by_date(rslt[i], &result);
+		commit_list_insert_by_date(rslt[i], result);
 	free(rslt);
-	return result;
+	return 0;
 }
 
 struct commit_list *repo_get_merge_bases_many(struct repository *r,
@@ -462,7 +462,12 @@ struct commit_list *repo_get_merge_bases_many(struct repository *r,
 					      int n,
 					      struct commit **twos)
 {
-	return get_merge_bases_many_0(r, one, n, twos, 1);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, n, twos, 1, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
@@ -470,14 +475,24 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    int n,
 						    struct commit **twos)
 {
-	return get_merge_bases_many_0(r, one, n, twos, 0);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, n, twos, 0, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 struct commit_list *repo_get_merge_bases(struct repository *r,
 					 struct commit *one,
 					 struct commit *two)
 {
-	return get_merge_bases_many_0(r, one, 1, &two, 1);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, 1, &two, 1, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 /*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 08/11] repo_get_merge_bases(): pass on errors from `merge_bases_many()`
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (6 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 09/11] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
                       ` (3 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `repo_get_merge_bases()` macro) is aware of that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next step: adjust the callers of `get_octopus_merge_bases()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/log.c                    | 10 +++++-----
 builtin/merge-tree.c             |  5 +++--
 builtin/merge.c                  | 20 ++++++++++++--------
 builtin/rebase.c                 |  8 +++++---
 builtin/rev-parse.c              |  5 +++--
 commit-reach.c                   | 23 +++++++++++------------
 commit-reach.h                   |  7 ++++---
 diff-lib.c                       |  5 +++--
 log-tree.c                       |  5 +++--
 merge-ort.c                      |  6 +++++-
 merge-recursive.c                |  4 +++-
 notes-merge.c                    |  3 ++-
 object-name.c                    |  7 +++++--
 revision.c                       | 12 ++++++++----
 sequencer.c                      |  8 ++++++--
 submodule.c                      |  7 ++++++-
 t/t4301-merge-tree-write-tree.sh | 12 ++++++++++++
 17 files changed, 96 insertions(+), 51 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 1705da71aca..befafd6ae04 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1702,11 +1702,11 @@ static struct commit *get_base_commit(const char *base_commit,
 	 */
 	while (rev_nr > 1) {
 		for (i = 0; i < rev_nr / 2; i++) {
-			struct commit_list *merge_base;
-			merge_base = repo_get_merge_bases(the_repository,
-							  rev[2 * i],
-							  rev[2 * i + 1]);
-			if (!merge_base || merge_base->next) {
+			struct commit_list *merge_base = NULL;
+			if (repo_get_merge_bases(the_repository,
+						 rev[2 * i],
+						 rev[2 * i + 1], &merge_base) < 0 ||
+			    !merge_base || merge_base->next) {
 				if (die_on_failure) {
 					die(_("failed to find exact merge base"));
 				} else {
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index a35e0452d66..76200250629 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -463,8 +463,9 @@ static int real_merge(struct merge_tree_options *o,
 		 * Get the merge bases, in reverse order; see comment above
 		 * merge_incore_recursive in merge-ort.h
 		 */
-		merge_bases = repo_get_merge_bases(the_repository, parent1,
-						   parent2);
+		if (repo_get_merge_bases(the_repository, parent1,
+					 parent2, &merge_bases) < 0)
+			exit(128);
 		if (!merge_bases && !o->allow_unrelated_histories)
 			die(_("refusing to merge unrelated histories"));
 		merge_bases = reverse_commit_list(merge_bases);
diff --git a/builtin/merge.c b/builtin/merge.c
index d748d46e135..ac9d58adc29 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1517,10 +1517,13 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 
 	if (!remoteheads)
 		; /* already up-to-date */
-	else if (!remoteheads->next)
-		common = repo_get_merge_bases(the_repository, head_commit,
-					      remoteheads->item);
-	else {
+	else if (!remoteheads->next) {
+		if (repo_get_merge_bases(the_repository, head_commit,
+					 remoteheads->item, &common) < 0) {
+			ret = 2;
+			goto done;
+		}
+	} else {
 		struct commit_list *list = remoteheads;
 		commit_list_insert(head_commit, &list);
 		common = get_octopus_merge_bases(list);
@@ -1631,7 +1634,7 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 		struct commit_list *j;
 
 		for (j = remoteheads; j; j = j->next) {
-			struct commit_list *common_one;
+			struct commit_list *common_one = NULL;
 			struct commit *common_item;
 
 			/*
@@ -1639,9 +1642,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 			 * merge_bases again, otherwise "git merge HEAD^
 			 * HEAD^^" would be missed.
 			 */
-			common_one = repo_get_merge_bases(the_repository,
-							  head_commit,
-							  j->item);
+			if (repo_get_merge_bases(the_repository, head_commit,
+						 j->item, &common_one) < 0)
+				exit(128);
+
 			common_item = common_one->item;
 			free_commit_list(common_one);
 			if (!oideq(&common_item->object.oid, &j->item->object.oid)) {
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 043c65dccd9..06a55fc7325 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -879,7 +879,8 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
 	if (!upstream)
 		goto done;
 
-	merge_bases = repo_get_merge_bases(the_repository, upstream, head);
+	if (repo_get_merge_bases(the_repository, upstream, head, &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases || merge_bases->next)
 		goto done;
 
@@ -898,8 +899,9 @@ static void fill_branch_base(struct rebase_options *options,
 {
 	struct commit_list *merge_bases = NULL;
 
-	merge_bases = repo_get_merge_bases(the_repository, options->onto,
-					   options->orig_head);
+	if (repo_get_merge_bases(the_repository, options->onto,
+				 options->orig_head, &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases || merge_bases->next)
 		oidcpy(branch_base, null_oid());
 	else
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index fde8861ca4e..c97d0f6144c 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -297,7 +297,7 @@ static int try_difference(const char *arg)
 		show_rev(NORMAL, &end_oid, end);
 		show_rev(symmetric ? NORMAL : REVERSED, &start_oid, start);
 		if (symmetric) {
-			struct commit_list *exclude;
+			struct commit_list *exclude = NULL;
 			struct commit *a, *b;
 			a = lookup_commit_reference(the_repository, &start_oid);
 			b = lookup_commit_reference(the_repository, &end_oid);
@@ -305,7 +305,8 @@ static int try_difference(const char *arg)
 				*dotdot = '.';
 				return 0;
 			}
-			exclude = repo_get_merge_bases(the_repository, a, b);
+			if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0)
+				exit(128);
 			while (exclude) {
 				struct commit *commit = pop_commit(&exclude);
 				show_rev(REVERSED, &commit->object.oid, NULL);
diff --git a/commit-reach.c b/commit-reach.c
index 5fa0abc7d1e..10e625ff51b 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -189,9 +189,12 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 		struct commit_list *new_commits = NULL, *end = NULL;
 
 		for (j = ret; j; j = j->next) {
-			struct commit_list *bases;
-			bases = repo_get_merge_bases(the_repository, i->item,
-						     j->item);
+			struct commit_list *bases = NULL;
+			if (repo_get_merge_bases(the_repository, i->item,
+						 j->item, &bases) < 0) {
+				free_commit_list(bases);
+				return NULL;
+			}
 			if (!new_commits)
 				new_commits = bases;
 			else
@@ -483,16 +486,12 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 	return result;
 }
 
-struct commit_list *repo_get_merge_bases(struct repository *r,
-					 struct commit *one,
-					 struct commit *two)
+int repo_get_merge_bases(struct repository *r,
+			 struct commit *one,
+			 struct commit *two,
+			 struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, 1, &two, 1, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, 1, &two, 1, result);
 }
 
 /*
diff --git a/commit-reach.h b/commit-reach.h
index 68f81549a44..2c6fcdd34f6 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -9,9 +9,10 @@ struct ref_filter;
 struct object_id;
 struct object_array;
 
-struct commit_list *repo_get_merge_bases(struct repository *r,
-					 struct commit *rev1,
-					 struct commit *rev2);
+int repo_get_merge_bases(struct repository *r,
+			 struct commit *rev1,
+			 struct commit *rev2,
+			 struct commit_list **result);
 struct commit_list *repo_get_merge_bases_many(struct repository *r,
 					      struct commit *one, int n,
 					      struct commit **twos);
diff --git a/diff-lib.c b/diff-lib.c
index 0e9ec4f68af..498224ccce2 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -565,7 +565,7 @@ void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb)
 {
 	int i;
 	struct commit *mb_child[2] = {0};
-	struct commit_list *merge_bases;
+	struct commit_list *merge_bases = NULL;
 
 	for (i = 0; i < revs->pending.nr; i++) {
 		struct object *obj = revs->pending.objects[i].item;
@@ -592,7 +592,8 @@ void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb)
 		mb_child[1] = lookup_commit_reference(the_repository, &oid);
 	}
 
-	merge_bases = repo_get_merge_bases(the_repository, mb_child[0], mb_child[1]);
+	if (repo_get_merge_bases(the_repository, mb_child[0], mb_child[1], &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases)
 		die(_("no merge base found"));
 	if (merge_bases->next)
diff --git a/log-tree.c b/log-tree.c
index 504da6b519e..4f337766a39 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1010,7 +1010,7 @@ static int do_remerge_diff(struct rev_info *opt,
 			   struct object_id *oid)
 {
 	struct merge_options o;
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	struct merge_result res = {0};
 	struct pretty_print_context ctx = {0};
 	struct commit *parent1 = parents->item;
@@ -1035,7 +1035,8 @@ static int do_remerge_diff(struct rev_info *opt,
 	/* Parse the relevant commits and get the merge bases */
 	parse_commit_or_die(parent1);
 	parse_commit_or_die(parent2);
-	bases = repo_get_merge_bases(the_repository, parent1, parent2);
+	if (repo_get_merge_bases(the_repository, parent1, parent2, &bases) < 0)
+		exit(128);
 
 	/* Re-merge the parents */
 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
diff --git a/merge-ort.c b/merge-ort.c
index 9f3af46333a..90d8495ca1f 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -5066,7 +5066,11 @@ static void merge_ort_internal(struct merge_options *opt,
 	struct strbuf merge_base_abbrev = STRBUF_INIT;
 
 	if (!merge_bases) {
-		merge_bases = repo_get_merge_bases(the_repository, h1, h2);
+		if (repo_get_merge_bases(the_repository, h1, h2,
+					 &merge_bases) < 0) {
+			result->clean = -1;
+			return;
+		}
 		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
 		merge_bases = reverse_commit_list(merge_bases);
 	}
diff --git a/merge-recursive.c b/merge-recursive.c
index 0d931cc14ad..d609373d960 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3638,7 +3638,9 @@ static int merge_recursive_internal(struct merge_options *opt,
 	}
 
 	if (!merge_bases) {
-		merge_bases = repo_get_merge_bases(the_repository, h1, h2);
+		if (repo_get_merge_bases(the_repository, h1, h2,
+					 &merge_bases) < 0)
+			return -1;
 		merge_bases = reverse_commit_list(merge_bases);
 	}
 
diff --git a/notes-merge.c b/notes-merge.c
index 8799b522a55..51282934ae6 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -607,7 +607,8 @@ int notes_merge(struct notes_merge_options *o,
 	assert(local && remote);
 
 	/* Find merge bases */
-	bases = repo_get_merge_bases(the_repository, local, remote);
+	if (repo_get_merge_bases(the_repository, local, remote, &bases) < 0)
+		exit(128);
 	if (!bases) {
 		base_oid = null_oid();
 		base_tree_oid = the_hash_algo->empty_tree;
diff --git a/object-name.c b/object-name.c
index 0bfa29dbbfe..63bec6d9a2b 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1481,7 +1481,7 @@ int repo_get_oid_mb(struct repository *r,
 		    struct object_id *oid)
 {
 	struct commit *one, *two;
-	struct commit_list *mbs;
+	struct commit_list *mbs = NULL;
 	struct object_id oid_tmp;
 	const char *dots;
 	int st;
@@ -1509,7 +1509,10 @@ int repo_get_oid_mb(struct repository *r,
 	two = lookup_commit_reference_gently(r, &oid_tmp, 0);
 	if (!two)
 		return -1;
-	mbs = repo_get_merge_bases(r, one, two);
+	if (repo_get_merge_bases(r, one, two, &mbs) < 0) {
+		free_commit_list(mbs);
+		return -1;
+	}
 	if (!mbs || mbs->next)
 		st = -1;
 	else {
diff --git a/revision.c b/revision.c
index 00d5c29bfce..eb0d550842f 100644
--- a/revision.c
+++ b/revision.c
@@ -1965,7 +1965,7 @@ static void add_pending_commit_list(struct rev_info *revs,
 
 static void prepare_show_merge(struct rev_info *revs)
 {
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	struct commit *head, *other;
 	struct object_id oid;
 	const char **prune = NULL;
@@ -1980,7 +1980,8 @@ static void prepare_show_merge(struct rev_info *revs)
 	other = lookup_commit_or_die(&oid, "MERGE_HEAD");
 	add_pending_object(revs, &head->object, "HEAD");
 	add_pending_object(revs, &other->object, "MERGE_HEAD");
-	bases = repo_get_merge_bases(the_repository, head, other);
+	if (repo_get_merge_bases(the_repository, head, other, &bases) < 0)
+		exit(128);
 	add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING | BOTTOM);
 	add_pending_commit_list(revs, bases, UNINTERESTING | BOTTOM);
 	free_commit_list(bases);
@@ -2068,14 +2069,17 @@ static int handle_dotdot_1(const char *arg, char *dotdot,
 	} else {
 		/* A...B -- find merge bases between the two */
 		struct commit *a, *b;
-		struct commit_list *exclude;
+		struct commit_list *exclude = NULL;
 
 		a = lookup_commit_reference(revs->repo, &a_obj->oid);
 		b = lookup_commit_reference(revs->repo, &b_obj->oid);
 		if (!a || !b)
 			return dotdot_missing(arg, dotdot, revs, symmetric);
 
-		exclude = repo_get_merge_bases(the_repository, a, b);
+		if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0) {
+			free_commit_list(exclude);
+			return -1;
+		}
 		add_rev_cmdline_list(revs, exclude, REV_CMD_MERGE_BASE,
 				     flags_exclude);
 		add_pending_commit_list(revs, exclude, flags_exclude);
diff --git a/sequencer.c b/sequencer.c
index d584cac8ed9..4417f2f1956 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3913,7 +3913,7 @@ static int do_merge(struct repository *r,
 	int run_commit_flags = 0;
 	struct strbuf ref_name = STRBUF_INIT;
 	struct commit *head_commit, *merge_commit, *i;
-	struct commit_list *bases, *j;
+	struct commit_list *bases = NULL, *j;
 	struct commit_list *to_merge = NULL, **tail = &to_merge;
 	const char *strategy = !opts->xopts.nr &&
 		(!opts->strategy ||
@@ -4139,7 +4139,11 @@ static int do_merge(struct repository *r,
 	}
 
 	merge_commit = to_merge->item;
-	bases = repo_get_merge_bases(r, head_commit, merge_commit);
+	if (repo_get_merge_bases(r, head_commit, merge_commit, &bases) < 0) {
+		ret = -1;
+		goto leave_merge;
+	}
+
 	if (bases && oideq(&merge_commit->object.oid,
 			   &bases->item->object.oid)) {
 		ret = 0;
diff --git a/submodule.c b/submodule.c
index e603a19a876..04931a5474b 100644
--- a/submodule.c
+++ b/submodule.c
@@ -595,7 +595,12 @@ static void show_submodule_header(struct diff_options *o,
 	     (!is_null_oid(two) && !*right))
 		message = "(commits not present)";
 
-	*merge_bases = repo_get_merge_bases(sub, *left, *right);
+	*merge_bases = NULL;
+	if (repo_get_merge_bases(sub, *left, *right, merge_bases) < 0) {
+		message = "(corrupt repository)";
+		goto output_header;
+	}
+
 	if (*merge_bases) {
 		if ((*merge_bases)->item == *left)
 			fast_forward = 1;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index b2c8a43fce3..5d1e7aca4c8 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -945,4 +945,16 @@ test_expect_success 'check the input format when --stdin is passed' '
 	test_cmp expect actual
 '
 
+test_expect_success 'error out on missing commits as well' '
+	git init --bare missing-commit.git &&
+	git rev-list --objects side1 side3 >list-including-initial &&
+	grep -v ^$(git rev-parse side1^) <list-including-initial >list &&
+	git pack-objects missing-commit.git/objects/pack/missing-initial <list &&
+	side1=$(git rev-parse side1) &&
+	side3=$(git rev-parse side3) &&
+	test_must_fail git --git-dir=missing-commit.git \
+		merge-tree --allow-unrelated-histories $side1 $side3 >actual &&
+	test_must_be_empty actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 09/11] get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (7 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 08/11] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 10/11] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
                       ` (2 subsequent siblings)
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `get_merge_bases()` macro) is aware of that, too.

Naturally, the callers need to be adjusted now, too.

Next step: adjust `repo_get_merge_bases_many()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/merge-base.c |  8 ++++++--
 builtin/merge.c      |  6 +++++-
 builtin/pull.c       |  5 +++--
 commit-reach.c       | 20 +++++++++++---------
 commit-reach.h       |  2 +-
 5 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 0308fd73289..2edffc5487e 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -77,13 +77,17 @@ static int handle_independent(int count, const char **args)
 static int handle_octopus(int count, const char **args, int show_all)
 {
 	struct commit_list *revs = NULL;
-	struct commit_list *result, *rev;
+	struct commit_list *result = NULL, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = get_octopus_merge_bases(revs);
+	if (get_octopus_merge_bases(revs, &result) < 0) {
+		free_commit_list(revs);
+		free_commit_list(result);
+		return 128;
+	}
 	free_commit_list(revs);
 	reduce_heads_replace(&result);
 
diff --git a/builtin/merge.c b/builtin/merge.c
index ac9d58adc29..94c5b693972 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1526,7 +1526,11 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	} else {
 		struct commit_list *list = remoteheads;
 		commit_list_insert(head_commit, &list);
-		common = get_octopus_merge_bases(list);
+		if (get_octopus_merge_bases(list, &common) < 0) {
+			free(list);
+			ret = 2;
+			goto done;
+		}
 		free(list);
 	}
 
diff --git a/builtin/pull.c b/builtin/pull.c
index e6f2942c0c5..0c5a55f2f4d 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -820,7 +820,7 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 		const struct object_id *merge_head,
 		const struct object_id *fork_point)
 {
-	struct commit_list *revs = NULL, *result;
+	struct commit_list *revs = NULL, *result = NULL;
 
 	commit_list_insert(lookup_commit_reference(the_repository, curr_head),
 			   &revs);
@@ -830,7 +830,8 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 		commit_list_insert(lookup_commit_reference(the_repository, fork_point),
 				   &revs);
 
-	result = get_octopus_merge_bases(revs);
+	if (get_octopus_merge_bases(revs, &result) < 0)
+		exit(128);
 	free_commit_list(revs);
 	reduce_heads_replace(&result);
 
diff --git a/commit-reach.c b/commit-reach.c
index 10e625ff51b..fa21a8f2f6b 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -176,24 +176,26 @@ static int merge_bases_many(struct repository *r,
 	return 0;
 }
 
-struct commit_list *get_octopus_merge_bases(struct commit_list *in)
+int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result)
 {
-	struct commit_list *i, *j, *k, *ret = NULL;
+	struct commit_list *i, *j, *k;
 
 	if (!in)
-		return ret;
+		return 0;
 
-	commit_list_insert(in->item, &ret);
+	commit_list_insert(in->item, result);
 
 	for (i = in->next; i; i = i->next) {
 		struct commit_list *new_commits = NULL, *end = NULL;
 
-		for (j = ret; j; j = j->next) {
+		for (j = *result; j; j = j->next) {
 			struct commit_list *bases = NULL;
 			if (repo_get_merge_bases(the_repository, i->item,
 						 j->item, &bases) < 0) {
 				free_commit_list(bases);
-				return NULL;
+				free_commit_list(*result);
+				*result = NULL;
+				return -1;
 			}
 			if (!new_commits)
 				new_commits = bases;
@@ -202,10 +204,10 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 			for (k = bases; k; k = k->next)
 				end = k;
 		}
-		free_commit_list(ret);
-		ret = new_commits;
+		free_commit_list(*result);
+		*result = new_commits;
 	}
-	return ret;
+	return 0;
 }
 
 static int remove_redundant_no_gen(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 2c6fcdd34f6..4690b6ecd0c 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -21,7 +21,7 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    struct commit *one, int n,
 						    struct commit **twos);
 
-struct commit_list *get_octopus_merge_bases(struct commit_list *in);
+int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
 
 int repo_is_descendant_of(struct repository *r,
 			  struct commit *commit,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 10/11] repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (8 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 09/11] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-27 13:28     ` [PATCH v3 11/11] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many()` function is aware of
that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next stop: `repo_get_merge_bases_dirty()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 bisect.c              |  7 ++++---
 builtin/log.c         | 13 +++++++------
 commit-reach.c        | 16 ++++++----------
 commit-reach.h        |  7 ++++---
 commit.c              |  7 ++++---
 t/helper/test-reach.c |  9 ++++++---
 6 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/bisect.c b/bisect.c
index 1be8e0a2711..2018466d69f 100644
--- a/bisect.c
+++ b/bisect.c
@@ -851,10 +851,11 @@ static void handle_skipped_merge_base(const struct object_id *mb)
 static enum bisect_error check_merge_bases(int rev_nr, struct commit **rev, int no_checkout)
 {
 	enum bisect_error res = BISECT_OK;
-	struct commit_list *result;
+	struct commit_list *result = NULL;
 
-	result = repo_get_merge_bases_many(the_repository, rev[0], rev_nr - 1,
-					   rev + 1);
+	if (repo_get_merge_bases_many(the_repository, rev[0], rev_nr - 1,
+				      rev + 1, &result) < 0)
+		exit(128);
 
 	for (; result; result = result->next) {
 		const struct object_id *mb = &result->item->object.oid;
diff --git a/builtin/log.c b/builtin/log.c
index befafd6ae04..c75790a7cec 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1656,7 +1656,7 @@ static struct commit *get_base_commit(const char *base_commit,
 		struct branch *curr_branch = branch_get(NULL);
 		const char *upstream = branch_get_upstream(curr_branch, NULL);
 		if (upstream) {
-			struct commit_list *base_list;
+			struct commit_list *base_list = NULL;
 			struct commit *commit;
 			struct object_id oid;
 
@@ -1667,11 +1667,12 @@ static struct commit *get_base_commit(const char *base_commit,
 					return NULL;
 			}
 			commit = lookup_commit_or_die(&oid, "upstream base");
-			base_list = repo_get_merge_bases_many(the_repository,
-							      commit, total,
-							      list);
-			/* There should be one and only one merge base. */
-			if (!base_list || base_list->next) {
+			if (repo_get_merge_bases_many(the_repository,
+						      commit, total,
+						      list,
+						      &base_list) < 0 ||
+			    /* There should be one and only one merge base. */
+			    !base_list || base_list->next) {
 				if (die_on_failure) {
 					die(_("could not find exact merge base"));
 				} else {
diff --git a/commit-reach.c b/commit-reach.c
index fa21a8f2f6b..954a05399f1 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -462,17 +462,13 @@ static int get_merge_bases_many_0(struct repository *r,
 	return 0;
 }
 
-struct commit_list *repo_get_merge_bases_many(struct repository *r,
-					      struct commit *one,
-					      int n,
-					      struct commit **twos)
+int repo_get_merge_bases_many(struct repository *r,
+			      struct commit *one,
+			      int n,
+			      struct commit **twos,
+			      struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, n, twos, 1, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, n, twos, 1, result);
 }
 
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 4690b6ecd0c..458043f4d58 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -13,9 +13,10 @@ int repo_get_merge_bases(struct repository *r,
 			 struct commit *rev1,
 			 struct commit *rev2,
 			 struct commit_list **result);
-struct commit_list *repo_get_merge_bases_many(struct repository *r,
-					      struct commit *one, int n,
-					      struct commit **twos);
+int repo_get_merge_bases_many(struct repository *r,
+			      struct commit *one, int n,
+			      struct commit **twos,
+			      struct commit_list **result);
 /* To be used only when object flags after this call no longer matter */
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    struct commit *one, int n,
diff --git a/commit.c b/commit.c
index 8405d7c3fce..00add5d81c6 100644
--- a/commit.c
+++ b/commit.c
@@ -1054,7 +1054,7 @@ struct commit *get_fork_point(const char *refname, struct commit *commit)
 {
 	struct object_id oid;
 	struct rev_collect revs;
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	int i;
 	struct commit *ret = NULL;
 	char *full_refname;
@@ -1079,8 +1079,9 @@ struct commit *get_fork_point(const char *refname, struct commit *commit)
 	for (i = 0; i < revs.nr; i++)
 		revs.commit[i]->object.flags &= ~TMP_MARK;
 
-	bases = repo_get_merge_bases_many(the_repository, commit, revs.nr,
-					  revs.commit);
+	if (repo_get_merge_bases_many(the_repository, commit, revs.nr,
+				      revs.commit, &bases) < 0)
+		exit(128);
 
 	/*
 	 * There should be one and only one merge base, when we found
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index aa816e168ea..84ee9da8681 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -117,9 +117,12 @@ int cmd__reach(int ac, const char **av)
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
-		struct commit_list *list = repo_get_merge_bases_many(the_repository,
-								     A, X_nr,
-								     X_array);
+		struct commit_list *list = NULL;
+		if (repo_get_merge_bases_many(the_repository,
+					      A, X_nr,
+					      X_array,
+					      &list) < 0)
+			exit(128);
 		printf("%s(A,X):\n", av[1]);
 		print_sorted_commit_ids(list);
 	} else if (!strcmp(av[1], "reduce_heads")) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v3 11/11] repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (9 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 10/11] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
@ 2024-02-27 13:28     ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  11 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-27 13:28 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many_dirty()` function is
aware of that, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/merge-base.c |  9 ++++++---
 commit-reach.c       | 16 ++++++----------
 commit-reach.h       |  7 ++++---
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 2edffc5487e..a8a1ca53968 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -13,10 +13,13 @@
 
 static int show_merge_base(struct commit **rev, int rev_nr, int show_all)
 {
-	struct commit_list *result, *r;
+	struct commit_list *result = NULL, *r;
 
-	result = repo_get_merge_bases_many_dirty(the_repository, rev[0],
-						 rev_nr - 1, rev + 1);
+	if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
+					    rev_nr - 1, rev + 1, &result) < 0) {
+		free_commit_list(result);
+		return -1;
+	}
 
 	if (!result)
 		return 1;
diff --git a/commit-reach.c b/commit-reach.c
index 954a05399f1..2c69cb83d6f 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -471,17 +471,13 @@ int repo_get_merge_bases_many(struct repository *r,
 	return get_merge_bases_many_0(r, one, n, twos, 1, result);
 }
 
-struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
-						    struct commit *one,
-						    int n,
-						    struct commit **twos)
+int repo_get_merge_bases_many_dirty(struct repository *r,
+				    struct commit *one,
+				    int n,
+				    struct commit **twos,
+				    struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, n, twos, 0, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, n, twos, 0, result);
 }
 
 int repo_get_merge_bases(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 458043f4d58..bf63cc468fd 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -18,9 +18,10 @@ int repo_get_merge_bases_many(struct repository *r,
 			      struct commit **twos,
 			      struct commit_list **result);
 /* To be used only when object flags after this call no longer matter */
-struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
-						    struct commit *one, int n,
-						    struct commit **twos);
+int repo_get_merge_bases_many_dirty(struct repository *r,
+				    struct commit *one, int n,
+				    struct commit **twos,
+				    struct commit_list **result);
 
 int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
 
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits
  2024-02-27 13:28     ` [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
@ 2024-02-27 14:46       ` Dirk Gouders
  2024-02-27 15:00         ` Johannes Schindelin
  0 siblings, 1 reply; 70+ messages in thread
From: Dirk Gouders @ 2024-02-27 14:46 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Patrick Steinhardt, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Currently, that logic pretends that a missing parent commit is
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> equivalent to a missing parent commit, and for the purpose of
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                  
> `--update-shallow` that is exactly what we need it to do.

Chances are that I am wrong, but to me the above sounds very irritating.

> Therefore, let's introduce a flag to indicate when that is precisely the
> logic we want.
>
> We need a flag, and cannot rely on `is_repository_shallow()` to indicate
> that situation, because that function would return 0: There may not
> actually be a `shallow` file, as demonstrated e.g. by t5537.10 ("add new

Again, I'm not a native speaker but I understand the above as
"There may not even be an existing `shallow` file...".

I'm not sure -- the sentence just "feels" uncomfortable to me.

Dirk

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()`
  2024-02-27 13:28     ` [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
@ 2024-02-27 14:56       ` Dirk Gouders
  2024-02-27 15:08         ` Johannes Schindelin
  0 siblings, 1 reply; 70+ messages in thread
From: Dirk Gouders @ 2024-02-27 14:56 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Patrick Steinhardt, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Let's start at the bottom of the stack by teaching the
> `paint_down_to_common()` function to return an `int`: if negative, it
> indicates fatal error, if 0 success.

Kind of pedantic but the above doesn't describe the real change, i.e. a
value != 0 indicates a fatal error:

> -		common = paint_down_to_common(r, array[i], filled,
> -					      work, min_generation, 0);
> +		if (paint_down_to_common(r, array[i], filled,
> +					 work, min_generation, 0, &common)) {
> +			clear_commit_marks(array[i], all_flags);
> +			clear_commit_marks_many(filled, work, all_flags);
> +			free_commit_list(common);
> +			free(work);
> +			free(redundant);
> +			free(filled_index);
> +			return -1;
> +		}

Dirk

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits
  2024-02-27 14:46       ` Dirk Gouders
@ 2024-02-27 15:00         ` Johannes Schindelin
  2024-02-27 18:08           ` Junio C Hamano
  0 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin @ 2024-02-27 15:00 UTC (permalink / raw)
  To: Dirk Gouders
  Cc: Johannes Schindelin via GitGitGadget, git, Patrick Steinhardt

[-- Attachment #1: Type: text/plain, Size: 1448 bytes --]

Hi Dirk,

On Tue, 27 Feb 2024, Dirk Gouders wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Currently, that logic pretends that a missing parent commit is
>                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > equivalent to a missing parent commit, and for the purpose of
>   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > `--update-shallow` that is exactly what we need it to do.
>
> Chances are that I am wrong, but to me the above sounds very irritating.

Not only that, it's also wrong 😜

> > Therefore, let's introduce a flag to indicate when that is precisely the
> > logic we want.
> >
> > We need a flag, and cannot rely on `is_repository_shallow()` to indicate
> > that situation, because that function would return 0: There may not
> > actually be a `shallow` file, as demonstrated e.g. by t5537.10 ("add new
>
> Again, I'm not a native speaker but I understand the above as
> "There may not even be an existing `shallow` file...".

I'm not a native speaker either, but I'll give it a try anyway. How about
this?

	Currently, that logic pretends that a commit whose parent
	commit is missing is a root commit (and likewise merge commits
	with missing parent commits are handled incorrectly, too).

	However, for the purpose of `--update-shallow` that is exactly
	what we need to do (and only then).

	Therefore [...]

Better?

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()`
  2024-02-27 14:56       ` Dirk Gouders
@ 2024-02-27 15:08         ` Johannes Schindelin
  2024-02-27 18:24           ` Junio C Hamano
  0 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin @ 2024-02-27 15:08 UTC (permalink / raw)
  To: Dirk Gouders
  Cc: Johannes Schindelin via GitGitGadget, git, Patrick Steinhardt

Hi Dirk,

On Tue, 27 Feb 2024, Dirk Gouders wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Let's start at the bottom of the stack by teaching the
> > `paint_down_to_common()` function to return an `int`: if negative, it
> > indicates fatal error, if 0 success.
>
> Kind of pedantic but the above doesn't describe the real change, i.e. a
> value != 0 indicates a fatal error:
>
> > -		common = paint_down_to_common(r, array[i], filled,
> > -					      work, min_generation, 0);
> > +		if (paint_down_to_common(r, array[i], filled,
> > +					 work, min_generation, 0, &common)) {

The fact that we do not bother to verify that the return value is
negative, but only check for a non-zero one instead, does not change the
fact that in the form this patch leaves the code, `paint_down_to_common()`
returns -1 for fatal errors and 0 for success, as advertised, though.

Is it lazy to omit the `< 0` here? Not actually, the reason why I omitted
it here was to stay under 80 columns per line.

Good eyes, though. If `paint_down_to_common()` _did_ return values other
than -1 and 0, in particular positive ones that would not indicate a fatal
error, the hunk under discussion would have introduced a problematic bug.

Ciao,
Johannes

> > +			clear_commit_marks(array[i], all_flags);
> > +			clear_commit_marks_many(filled, work, all_flags);
> > +			free_commit_list(common);
> > +			free(work);
> > +			free(redundant);
> > +			free(filled_index);
> > +			return -1;
> > +		}
>
> Dirk
>
>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits
  2024-02-27 15:00         ` Johannes Schindelin
@ 2024-02-27 18:08           ` Junio C Hamano
  2024-02-27 18:10             ` Junio C Hamano
  2024-02-27 19:07             ` Dirk Gouders
  0 siblings, 2 replies; 70+ messages in thread
From: Junio C Hamano @ 2024-02-27 18:08 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Dirk Gouders, Johannes Schindelin via GitGitGadget, git,
	Patrick Steinhardt

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> 	Currently, that logic pretends that a commit whose parent
> 	commit is missing is a root commit (and likewise merge commits
> 	with missing parent commits are handled incorrectly, too).
> 	However, for the purpose of `--update-shallow` that is exactly
> 	what we need to do (and only then).

I suspect that what made it harder to follow in the original
construct is that we called the behaviour "incorrect" upfront and
then come back with "that incorrectness is what we want".  I wonder
if it makes it easier to follow by flipping them around.

    For the purpose of `--update-shallow`, when some of the parent
    commits of a commit are missing from the repository, we need to
    treat as if the parents of the commit are only the ones that do
    exist in the repository and these missing commits have no
    ancestry relationship with it.  If all its parent commits are
    missing, the commit needs to be treated as if it were a root
    commit.

    Add a flag to optionally ask for such a behaviour, while
    detecting missing objects as a repository corruption error by
    default ...

or something?

> 	Therefore [...]
>
> Better?
>
> Ciao,
> Johannes

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits
  2024-02-27 18:08           ` Junio C Hamano
@ 2024-02-27 18:10             ` Junio C Hamano
  2024-02-27 19:07             ` Dirk Gouders
  1 sibling, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2024-02-27 18:10 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Dirk Gouders, Johannes Schindelin via GitGitGadget, git,
	Patrick Steinhardt

Junio C Hamano <gitster@pobox.com> writes:

> I suspect that what made it harder to follow in the original
> construct is that we called the behaviour "incorrect" upfront and
> then come back with "that incorrectness is what we want".

Oh, I forgot to say that lack of "<area>: " on the title of the
earlier parts of the series were a bit uncomfortable to read.



^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()`
  2024-02-27 15:08         ` Johannes Schindelin
@ 2024-02-27 18:24           ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2024-02-27 18:24 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Dirk Gouders, Johannes Schindelin via GitGitGadget, git,
	Patrick Steinhardt

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Is it lazy to omit the `< 0` here? Not actually, the reason why I omitted
> it here was to stay under 80 columns per line.
>
> Good eyes, though. If `paint_down_to_common()` _did_ return values other
> than -1 and 0, in particular positive ones that would not indicate a fatal
> error, the hunk under discussion would have introduced a problematic bug.

The same patch does compare the returned value with '< 0' in another
function (that is far from this place), which probably made the hunk
stand out during a review, I suspect.

After having fixed a bug elsewhere about a codepath that mixed the
"non-zero is an error" and "negative is an error" conventions during
the last cycle, I would have to say that being consistent would be
nice.  I think we at the end decided to make the callee be more
strict (returning negative to signal an error) while allowing the
caller to be more lenient (taking non-zero as an error) in that
case.

Thanks.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()`
  2024-02-27 13:28     ` [PATCH v3 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
@ 2024-02-27 18:29       ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2024-02-27 18:29 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Patrick Steinhardt, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The `paint_down_to_common()` function was just taught to indicate
> parsing errors, and now the `merge_bases_many()` function is aware of
> that, too.
>
> One tricky aspect is that `merge_bases_many()` parses commits of its
> own, but wants to gracefully handle the scenario where NULL is passed as
> a merge head, returning the empty list of merge bases. The way this was
> handled involved calling `repo_parse_commit(NULL)` and relying on it to
> return an error. This has to be done differently now so that we can
> handle missing commits correctly by producing a fatal error.
>
> Next step: adjust the caller of `merge_bases_many()`:
> `get_merge_bases_many_0()`.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  commit-reach.c | 35 ++++++++++++++++++++++-------------
>  1 file changed, 22 insertions(+), 13 deletions(-)

This is the true gem of the series.  The next steps may have to be
noisy due to many callers adjusting to the new function signatures,
but split into these logical steps in a bottom-up way makes them
easier to follow.  Nice.

>
> diff --git a/commit-reach.c b/commit-reach.c
> index 9148a7dcbc0..2c74583c8e0 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -131,41 +131,49 @@ static int paint_down_to_common(struct repository *r,
>  	return 0;
>  }
>  
> -static struct commit_list *merge_bases_many(struct repository *r,
> -					    struct commit *one, int n,
> -					    struct commit **twos)
> +static int merge_bases_many(struct repository *r,
> +			    struct commit *one, int n,
> +			    struct commit **twos,
> +			    struct commit_list **result)
>  {
>  	struct commit_list *list = NULL;
> -	struct commit_list *result = NULL;
>  	int i;
>  
>  	for (i = 0; i < n; i++) {
> -		if (one == twos[i])
> +		if (one == twos[i]) {
>  			/*
>  			 * We do not mark this even with RESULT so we do not
>  			 * have to clean it up.
>  			 */
> -			return commit_list_insert(one, &result);
> +			*result = commit_list_insert(one, result);
> +			return 0;
> +		}
>  	}
>  
> +	if (!one)
> +		return 0;
>  	if (repo_parse_commit(r, one))
> -		return NULL;
> +		return error(_("could not parse commit %s"),
> +			     oid_to_hex(&one->object.oid));
>  	for (i = 0; i < n; i++) {
> +		if (!twos[i])
> +			return 0;
>  		if (repo_parse_commit(r, twos[i]))
> -			return NULL;
> +			return error(_("could not parse commit %s"),
> +				     oid_to_hex(&twos[i]->object.oid));
>  	}
>  
>  	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
>  		free_commit_list(list);
> -		return NULL;
> +		return -1;
>  	}
>  
>  	while (list) {
>  		struct commit *commit = pop_commit(&list);
>  		if (!(commit->object.flags & STALE))
> -			commit_list_insert_by_date(commit, &result);
> +			commit_list_insert_by_date(commit, result);
>  	}
> -	return result;
> +	return 0;
>  }
>  
>  struct commit_list *get_octopus_merge_bases(struct commit_list *in)
> @@ -410,10 +418,11 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
>  {
>  	struct commit_list *list;
>  	struct commit **rslt;
> -	struct commit_list *result;
> +	struct commit_list *result = NULL;
>  	int cnt, i;
>  
> -	result = merge_bases_many(r, one, n, twos);
> +	if (merge_bases_many(r, one, n, twos, &result) < 0)
> +		return NULL;
>  	for (i = 0; i < n; i++) {
>  		if (one == twos[i])
>  			return result;

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits
  2024-02-27 18:08           ` Junio C Hamano
  2024-02-27 18:10             ` Junio C Hamano
@ 2024-02-27 19:07             ` Dirk Gouders
  1 sibling, 0 replies; 70+ messages in thread
From: Dirk Gouders @ 2024-02-27 19:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget, git,
	Patrick Steinhardt

Junio C Hamano <gitster@pobox.com> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
>> 	Currently, that logic pretends that a commit whose parent
>> 	commit is missing is a root commit (and likewise merge commits
>> 	with missing parent commits are handled incorrectly, too).
>> 	However, for the purpose of `--update-shallow` that is exactly
>> 	what we need to do (and only then).
>
> I suspect that what made it harder to follow in the original
> construct is that we called the behaviour "incorrect" upfront and
> then come back with "that incorrectness is what we want".  I wonder
> if it makes it easier to follow by flipping them around.
>
>     For the purpose of `--update-shallow`, when some of the parent
>     commits of a commit are missing from the repository, we need to
>     treat as if the parents of the commit are only the ones that do
>     exist in the repository and these missing commits have no
>     ancestry relationship with it.  If all its parent commits are
>     missing, the commit needs to be treated as if it were a root
>     commit.
>
>     Add a flag to optionally ask for such a behaviour, while
>     detecting missing objects as a repository corruption error by
>     default ...
>
> or something?

At least to me this is more understandable.

Dirk

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v4 00/11] The merge-base logic vs missing commit objects
  2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                       ` (10 preceding siblings ...)
  2024-02-27 13:28     ` [PATCH v3 11/11] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44     ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 01/11] commit-reach(paint_down_to_common): plug two memory leaks Johannes Schindelin via GitGitGadget
                         ` (10 more replies)
  11 siblings, 11 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin

This patch series is in the same spirit as what I proposed in
https://lore.kernel.org/git/pull.1651.git.1707212981.gitgitgadget@gmail.com/,
where I tackled missing tree objects. This here patch series tackles missing
commit objects instead: Git's merge-base logic handles these missing commit
objects as if there had not been any commit object at all, i.e. if two
commits' merge base commit is missing, they will be treated as if they had
no common commit history at all, which is a bug. Those commit objects should
not be missing, of course, i.e. this is only a problem in corrupt
repositories.

This patch series is a bit more tricky than the "missing tree objects" one,
though: The function signature of quite a few functions need to be changed
to allow callers to discern between "missing commit object" vs "no
merge-base found".

And of course it gets even more tricky because in shallow and partial clones
we expect commit objects to be missing, and that must not be treated like an
error but the existing logic is actually desirable in those scenarios.

I am deeply sorry both about the length of this patch series as well as
having to lean so heavily on reviews on the Git mailing list.

Changes since v3:

 * Reworded some hard-to-understand paragraphs in the commit message of
   "Prepare paint_down_to_common() for handling shallow commits" (4/11).
 * Changed an inconsistent paint_down_to_common() < 0 to simply test whether
   the return value is non-zero.
 * Changed all commit messages to have proper <area>: prefixes.

Changes since v2:

 * Moved a hunk from 3/11 to 2/11 that lets the repo_in_merge_bases_many()
   function return an error if non-existing commits have been passed to it,
   unless the ignore_missing_commits parameter is non-zero.
 * The end result is tree-same.

Changes since v1:

 * Addressed a lot of memory leaks.
 * Reordered patch 2 and 3 to render the commit message's comment about
   unchanged behavior true.
 * Fixed the incorrectly converted condition in the merge_submodule()
   function.
 * The last patch ("paint_down_to_common(): special-case shallow/partial
   clones") was dropped because it is not actually necessary, and the
   explanation for that was added to the commit message of "Prepare
   paint_down_to_common() for handling shallow commits".
 * An inconsistently-named variable i was renamed to be consistent with the
   other variables called ret.

Johannes Schindelin (11):
  commit-reach(paint_down_to_common): plug two memory leaks
  commit-reach(repo_in_merge_bases_many): optionally expect missing
    commits
  commit-reach(repo_in_merge_bases_many): report missing commits
  commit-reach(paint_down_to_common): prepare for handling shallow
    commits
  commit-reach(paint_down_to_common): start reporting errors
  commit-reach(merge_bases_many): pass on "missing commits" errors
  commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
  commit-reach(repo_get_merge_bases): pass on "missing commits" errors
  commit-reach(get_octopus_merge_bases): pass on "missing commits"
    errors
  commit-reach(repo_get_merge_bases_many): pass on "missing commits"
    errors
  commit-reach(repo_get_merge_bases_many_dirty): pass on errors

 bisect.c                         |   7 +-
 builtin/branch.c                 |  12 +-
 builtin/fast-import.c            |   6 +-
 builtin/fetch.c                  |   2 +
 builtin/log.c                    |  30 +++--
 builtin/merge-base.c             |  23 +++-
 builtin/merge-tree.c             |   5 +-
 builtin/merge.c                  |  26 ++--
 builtin/pull.c                   |   9 +-
 builtin/rebase.c                 |   8 +-
 builtin/receive-pack.c           |   6 +-
 builtin/rev-parse.c              |   5 +-
 commit-reach.c                   | 209 ++++++++++++++++++++-----------
 commit-reach.h                   |  26 ++--
 commit.c                         |   7 +-
 diff-lib.c                       |   5 +-
 http-push.c                      |   5 +-
 log-tree.c                       |   5 +-
 merge-ort.c                      |  87 +++++++++++--
 merge-recursive.c                |  58 +++++++--
 notes-merge.c                    |   3 +-
 object-name.c                    |   7 +-
 remote.c                         |   2 +-
 revision.c                       |  12 +-
 sequencer.c                      |   8 +-
 shallow.c                        |  21 ++--
 submodule.c                      |   7 +-
 t/helper/test-reach.c            |  11 +-
 t/t4301-merge-tree-write-tree.sh |  12 ++
 29 files changed, 441 insertions(+), 183 deletions(-)


base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1657%2Fdscho%2Fmerge-base-and-missing-objects-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1657/dscho/merge-base-and-missing-objects-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1657

Range-diff vs v3:

  1:  6e4e409cd43 !  1:  647fa2ed5c5 paint_down_to_common: plug two memory leaks
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    paint_down_to_common: plug two memory leaks
     +    commit-reach(paint_down_to_common): plug two memory leaks
      
          When a commit is missing, we return early (currently pretending that no
          merge basis could be found in that case). At that stage, it is possible
  2:  f68d77c6123 !  2:  48e69bf7229 Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
     +    commit-reach(repo_in_merge_bases_many): optionally expect missing commits
      
          Currently this function treats unrelated commit histories the same way
          as commit histories with missing commit objects.
  3:  0aaa224b5db !  3:  1938b317a49 Start reporting missing commits in `repo_in_merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    Start reporting missing commits in `repo_in_merge_bases_many()`
     +    commit-reach(repo_in_merge_bases_many): report missing commits
      
          Some functions in Git's source code follow the convention that returning
          a negative value indicates a fatal error, e.g. repository corruption.
  4:  84e7fbc07e0 !  4:  837aa5a89c6 Prepare `paint_down_to_common()` for handling shallow commits
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    Prepare `paint_down_to_common()` for handling shallow commits
     +    commit-reach(paint_down_to_common): prepare for handling shallow commits
      
          When `git fetch --update-shallow` needs to test for commit ancestry, it
          can naturally run into a missing object (e.g. if it is a parent of a
     -    shallow commit). To accommodate, the merge base logic will need to be
     -    able to handle this situation gracefully.
     +    shallow commit). For the purpose of `--update-shallow`, this needs to be
     +    treated as if the child commit did not even have that parent, i.e. the
     +    commit history needs to be clamped.
      
     -    Currently, that logic pretends that a missing parent commit is
     -    equivalent to a missing parent commit, and for the purpose of
     -    `--update-shallow` that is exactly what we need it to do.
     +    For all other scenarios, clamping the commit history is actually a bug,
     +    as it would hide repository corruption (for an analysis regarding
     +    shallow and partial clones, see the analysis further down).
      
     -    Therefore, let's introduce a flag to indicate when that is precisely the
     -    logic we want.
     +    Add a flag to optionally ask the function to ignore missing commits, as
     +    `--update-shallow` needs it to, while detecting missing objects as a
     +    repository corruption error by default.
      
     -    We need a flag, and cannot rely on `is_repository_shallow()` to indicate
     -    that situation, because that function would return 0: There may not
     -    actually be a `shallow` file, as demonstrated e.g. by t5537.10 ("add new
     -    shallow root with receive.updateshallow on") and t5538.4 ("add new
     -    shallow root with receive.updateshallow on").
     +    This flag is needed, and cannot replaced by `is_repository_shallow()` to
     +    indicate that situation, because that function would return 0 in the
     +    `--update-shallow` scenario: There is not actually a `shallow` file in
     +    that scenario, as demonstrated e.g. by t5537.10 ("add new shallow root
     +    with receive.updateshallow on") and t5538.4 ("add new shallow root with
     +    receive.updateshallow on").
      
          Note: shallow commits' parents are set to `NULL` internally already,
          therefore there is no need to special-case shallow repositories here, as
  5:  85332b58c37 !  5:  b978b5d233a commit-reach: start reporting errors in `paint_down_to_common()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    commit-reach: start reporting errors in `paint_down_to_common()`
     +    commit-reach(paint_down_to_common): start reporting errors
      
          If a commit cannot be parsed, it is currently ignored when looking for
          merge bases. That's undesirable as the operation can pretend success in
     @@ commit-reach.c: static struct commit_list *merge_bases_many(struct repository *r
       	}
       
      -	list = paint_down_to_common(r, one, n, twos, 0, 0);
     -+	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
     ++	if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
      +		free_commit_list(list);
      +		return NULL;
      +	}
  6:  2ae6a54dd59 !  6:  0f1ce130ce6 merge_bases_many(): pass on errors from `paint_down_to_common()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    merge_bases_many(): pass on errors from `paint_down_to_common()`
     +    commit-reach(merge_bases_many): pass on "missing commits" errors
      
          The `paint_down_to_common()` function was just taught to indicate
          parsing errors, and now the `merge_bases_many()` function is aware of
     @@ commit-reach.c: static int paint_down_to_common(struct repository *r,
      +				     oid_to_hex(&twos[i]->object.oid));
       	}
       
     - 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
     + 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
       		free_commit_list(list);
      -		return NULL;
      +		return -1;
  7:  4321795102d !  7:  133b69b6a62 get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
     +    commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
      
          The `merge_bases_many()` function was just taught to indicate
          parsing errors, and now the `get_merge_bases_many_0()` function is aware
  8:  35545c4b777 !  8:  bd52f258cd9 repo_get_merge_bases(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    repo_get_merge_bases(): pass on errors from `merge_bases_many()`
     +    commit-reach(repo_get_merge_bases): pass on "missing commits" errors
      
          The `merge_bases_many()` function was just taught to indicate parsing
          errors, and now the `repo_get_merge_bases()` function (which is also
  9:  a963058d2ba !  9:  b7ef90a57f0 get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
     +    commit-reach(get_octopus_merge_bases): pass on "missing commits" errors
      
          The `merge_bases_many()` function was just taught to indicate parsing
          errors, and now the `repo_get_merge_bases()` function (which is also
 10:  c3bed9c8500 ! 10:  32587b3caa7 repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
     +    commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors
      
          The `merge_bases_many()` function was just taught to indicate parsing
          errors, and now the `repo_get_merge_bases_many()` function is aware of
 11:  bdbf47ae505 ! 11:  05de9f24444 repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`
     +    commit-reach(repo_get_merge_bases_many_dirty): pass on errors
     +
     +    (Actually, this commit is only about passing on "missing commits"
     +    errors, but adding that to the commit's title would have made it too
     +    long.)
      
          The `merge_bases_many()` function was just taught to indicate parsing
          errors, and now the `repo_get_merge_bases_many_dirty()` function is

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v4 01/11] commit-reach(paint_down_to_common): plug two memory leaks
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 02/11] commit-reach(repo_in_merge_bases_many): optionally expect missing commits Johannes Schindelin via GitGitGadget
                         ` (9 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When a commit is missing, we return early (currently pretending that no
merge basis could be found in that case). At that stage, it is possible
that a merge base could have been found already, and added to the
`result`, which is now leaked.

The priority queue has a similar issue: There might still be a commit in
that queue.

Let's release both, to address the potential memory leaks.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/commit-reach.c b/commit-reach.c
index a868a575ea1..7ea916f9ebd 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -105,8 +105,11 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 			parents = parents->next;
 			if ((p->object.flags & flags) == flags)
 				continue;
-			if (repo_parse_commit(r, p))
+			if (repo_parse_commit(r, p)) {
+				clear_prio_queue(&queue);
+				free_commit_list(result);
 				return NULL;
+			}
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
 		}
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 02/11] commit-reach(repo_in_merge_bases_many): optionally expect missing commits
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 01/11] commit-reach(paint_down_to_common): plug two memory leaks Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 03/11] commit-reach(repo_in_merge_bases_many): report " Johannes Schindelin via GitGitGadget
                         ` (8 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Currently this function treats unrelated commit histories the same way
as commit histories with missing commit objects.

Typically, missing commit objects constitute a corrupt repository,
though, and should be reported as such. The next commits will make it
so, but there is one exception: In `git fetch --update-shallow` we
_expect_ commit objects to be missing, and we do want to treat the
now-incomplete commit histories as unrelated.

To allow for that, let's introduce an additional parameter that is
passed to `repo_in_merge_bases_many()` to trigger this behavior, and use
it in the two callers in `shallow.c`.

This commit changes behavior slightly: unless called from the
`shallow.c` functions that set the `ignore_missing_commits` bit, any
non-existing tip commit that is passed to `repo_in_merge_bases_many()`
will now result in an error.

Note: When encountering missing commits while traversing the commit
history in search for merge bases, with this commit there won't be a
change in behavior just yet, their children will still be interpreted as
root commits. This bug will get fixed by follow-up commits.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c        | 9 +++++----
 commit-reach.h        | 3 ++-
 remote.c              | 2 +-
 shallow.c             | 5 +++--
 t/helper/test-reach.c | 2 +-
 5 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 7ea916f9ebd..5c1b5256598 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -467,7 +467,7 @@ int repo_is_descendant_of(struct repository *r,
 
 			other = with_commit->item;
 			with_commit = with_commit->next;
-			if (repo_in_merge_bases_many(r, other, 1, &commit))
+			if (repo_in_merge_bases_many(r, other, 1, &commit, 0))
 				return 1;
 		}
 		return 0;
@@ -478,17 +478,18 @@ int repo_is_descendant_of(struct repository *r,
  * Is "commit" an ancestor of one of the "references"?
  */
 int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
-			     int nr_reference, struct commit **reference)
+			     int nr_reference, struct commit **reference,
+			     int ignore_missing_commits)
 {
 	struct commit_list *bases;
 	int ret = 0, i;
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
 	if (repo_parse_commit(r, commit))
-		return ret;
+		return ignore_missing_commits ? 0 : -1;
 	for (i = 0; i < nr_reference; i++) {
 		if (repo_parse_commit(r, reference[i]))
-			return ret;
+			return ignore_missing_commits ? 0 : -1;
 
 		generation = commit_graph_generation(reference[i]);
 		if (generation > max_generation)
diff --git a/commit-reach.h b/commit-reach.h
index 35c4da49481..68f81549a44 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -30,7 +30,8 @@ int repo_in_merge_bases(struct repository *r,
 			struct commit *reference);
 int repo_in_merge_bases_many(struct repository *r,
 			     struct commit *commit,
-			     int nr_reference, struct commit **reference);
+			     int nr_reference, struct commit **reference,
+			     int ignore_missing_commits);
 
 /*
  * Takes a list of commits and returns a new list where those
diff --git a/remote.c b/remote.c
index abb24822beb..763c80f4a7d 100644
--- a/remote.c
+++ b/remote.c
@@ -2675,7 +2675,7 @@ static int is_reachable_in_reflog(const char *local, const struct ref *remote)
 		if (MERGE_BASES_BATCH_SIZE < size)
 			size = MERGE_BASES_BATCH_SIZE;
 
-		if ((ret = repo_in_merge_bases_many(the_repository, commit, size, chunk)))
+		if ((ret = repo_in_merge_bases_many(the_repository, commit, size, chunk, 0)))
 			break;
 	}
 
diff --git a/shallow.c b/shallow.c
index ac728cdd778..dfcc1f86a7f 100644
--- a/shallow.c
+++ b/shallow.c
@@ -797,7 +797,7 @@ static void post_assign_shallow(struct shallow_info *info,
 		for (j = 0; j < bitmap_nr; j++)
 			if (bitmap[0][j] &&
 			    /* Step 7, reachability test at commit level */
-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits)) {
+			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1)) {
 				update_refstatus(ref_status, info->ref->nr, *bitmap);
 				dst++;
 				break;
@@ -828,7 +828,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 		si->reachable[c] = repo_in_merge_bases_many(the_repository,
 							    commit,
 							    si->nr_commits,
-							    si->commits);
+							    si->commits,
+							    1);
 		si->need_reachability_test[c] = 0;
 	}
 	return si->reachable[c];
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index 3e173399a00..aa816e168ea 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -113,7 +113,7 @@ int cmd__reach(int ac, const char **av)
 		       repo_in_merge_bases(the_repository, A, B));
 	else if (!strcmp(av[1], "in_merge_bases_many"))
 		printf("%s(A,X):%d\n", av[1],
-		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array));
+		       repo_in_merge_bases_many(the_repository, A, X_nr, X_array, 0));
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 03/11] commit-reach(repo_in_merge_bases_many): report missing commits
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 01/11] commit-reach(paint_down_to_common): plug two memory leaks Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 02/11] commit-reach(repo_in_merge_bases_many): optionally expect missing commits Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-03-01  6:58         ` Jeff King
  2024-02-28  9:44       ` [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits Johannes Schindelin via GitGitGadget
                         ` (7 subsequent siblings)
  10 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Some functions in Git's source code follow the convention that returning
a negative value indicates a fatal error, e.g. repository corruption.

Let's use this convention in `repo_in_merge_bases()` to report when one
of the specified commits is missing (i.e. when `repo_parse_commit()`
reports an error).

Also adjust the callers of `repo_in_merge_bases()` to handle such
negative return values.

Note: As of this patch, errors are returned only if any of the specified
merge heads is missing. Over the course of the next patches, missing
commits will also be reported by the `paint_down_to_common()` function,
which is called by `repo_in_merge_bases_many()`, and those errors will
be properly propagated back to the caller at that stage.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/branch.c       | 12 +++++--
 builtin/fast-import.c  |  6 +++-
 builtin/fetch.c        |  2 ++
 builtin/log.c          |  7 ++--
 builtin/merge-base.c   |  6 +++-
 builtin/pull.c         |  4 +++
 builtin/receive-pack.c |  6 +++-
 commit-reach.c         |  8 +++--
 http-push.c            |  5 ++-
 merge-ort.c            | 81 ++++++++++++++++++++++++++++++++++++------
 merge-recursive.c      | 54 +++++++++++++++++++++++-----
 shallow.c              | 18 ++++++----
 12 files changed, 173 insertions(+), 36 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index e7ee9bd0f15..7f9e79237f3 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -161,6 +161,8 @@ static int branch_merged(int kind, const char *name,
 
 	merged = reference_rev ? repo_in_merge_bases(the_repository, rev,
 						     reference_rev) : 0;
+	if (merged < 0)
+		exit(128);
 
 	/*
 	 * After the safety valve is fully redefined to "check with
@@ -169,9 +171,13 @@ static int branch_merged(int kind, const char *name,
 	 * any of the following code, but during the transition period,
 	 * a gentle reminder is in order.
 	 */
-	if ((head_rev != reference_rev) &&
-	    (head_rev ? repo_in_merge_bases(the_repository, rev, head_rev) : 0) != merged) {
-		if (merged)
+	if (head_rev != reference_rev) {
+		int expect = head_rev ? repo_in_merge_bases(the_repository, rev, head_rev) : 0;
+		if (expect < 0)
+			exit(128);
+		if (expect == merged)
+			; /* okay */
+		else if (merged)
 			warning(_("deleting branch '%s' that has been merged to\n"
 				"         '%s', but not yet merged to HEAD"),
 				name, reference_name);
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 444f41cf8ca..14c2efa88fc 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1625,6 +1625,7 @@ static int update_branch(struct branch *b)
 		oidclr(&old_oid);
 	if (!force_update && !is_null_oid(&old_oid)) {
 		struct commit *old_cmit, *new_cmit;
+		int ret;
 
 		old_cmit = lookup_commit_reference_gently(the_repository,
 							  &old_oid, 0);
@@ -1633,7 +1634,10 @@ static int update_branch(struct branch *b)
 		if (!old_cmit || !new_cmit)
 			return error("Branch %s is missing commits.", b->name);
 
-		if (!repo_in_merge_bases(the_repository, old_cmit, new_cmit)) {
+		ret = repo_in_merge_bases(the_repository, old_cmit, new_cmit);
+		if (ret < 0)
+			exit(128);
+		if (!ret) {
 			warning("Not updating %s"
 				" (new tip %s does not contain %s)",
 				b->name, oid_to_hex(&b->oid),
diff --git a/builtin/fetch.c b/builtin/fetch.c
index fd134ba74d9..0584a1f8b64 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -978,6 +978,8 @@ static int update_local_ref(struct ref *ref,
 		uint64_t t_before = getnanotime();
 		fast_forward = repo_in_merge_bases(the_repository, current,
 						   updated);
+		if (fast_forward < 0)
+			exit(128);
 		forced_updates_ms += (getnanotime() - t_before) / 1000000;
 	} else {
 		fast_forward = 1;
diff --git a/builtin/log.c b/builtin/log.c
index ba775d7b5cf..1705da71aca 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1623,7 +1623,7 @@ static struct commit *get_base_commit(const char *base_commit,
 {
 	struct commit *base = NULL;
 	struct commit **rev;
-	int i = 0, rev_nr = 0, auto_select, die_on_failure;
+	int i = 0, rev_nr = 0, auto_select, die_on_failure, ret;
 
 	switch (auto_base) {
 	case AUTO_BASE_NEVER:
@@ -1723,7 +1723,10 @@ static struct commit *get_base_commit(const char *base_commit,
 		rev_nr = DIV_ROUND_UP(rev_nr, 2);
 	}
 
-	if (!repo_in_merge_bases(the_repository, base, rev[0])) {
+	ret = repo_in_merge_bases(the_repository, base, rev[0]);
+	if (ret < 0)
+		exit(128);
+	if (!ret) {
 		if (die_on_failure) {
 			die(_("base commit should be the ancestor of revision list"));
 		} else {
diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index e68b7fe45d7..0308fd73289 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -103,12 +103,16 @@ static int handle_octopus(int count, const char **args, int show_all)
 static int handle_is_ancestor(int argc, const char **argv)
 {
 	struct commit *one, *two;
+	int ret;
 
 	if (argc != 2)
 		die("--is-ancestor takes exactly two commits");
 	one = get_commit_reference(argv[0]);
 	two = get_commit_reference(argv[1]);
-	if (repo_in_merge_bases(the_repository, one, two))
+	ret = repo_in_merge_bases(the_repository, one, two);
+	if (ret < 0)
+		exit(128);
+	if (ret)
 		return 0;
 	else
 		return 1;
diff --git a/builtin/pull.c b/builtin/pull.c
index be2b2c9ebc9..e6f2942c0c5 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -931,6 +931,8 @@ static int get_can_ff(struct object_id *orig_head,
 	merge_head = lookup_commit_reference(the_repository, orig_merge_head);
 	ret = repo_is_descendant_of(the_repository, merge_head, list);
 	free_commit_list(list);
+	if (ret < 0)
+		exit(128);
 	return ret;
 }
 
@@ -955,6 +957,8 @@ static int already_up_to_date(struct object_id *orig_head,
 		commit_list_insert(theirs, &list);
 		ok = repo_is_descendant_of(the_repository, ours, list);
 		free_commit_list(list);
+		if (ok < 0)
+			exit(128);
 		if (!ok)
 			return 0;
 	}
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8c4f0cb90a9..956fea6293e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1546,6 +1546,7 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 	    starts_with(name, "refs/heads/")) {
 		struct object *old_object, *new_object;
 		struct commit *old_commit, *new_commit;
+		int ret2;
 
 		old_object = parse_object(the_repository, old_oid);
 		new_object = parse_object(the_repository, new_oid);
@@ -1559,7 +1560,10 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 		}
 		old_commit = (struct commit *)old_object;
 		new_commit = (struct commit *)new_object;
-		if (!repo_in_merge_bases(the_repository, old_commit, new_commit)) {
+		ret2 = repo_in_merge_bases(the_repository, old_commit, new_commit);
+		if (ret2 < 0)
+			exit(128);
+		if (!ret2) {
 			rp_error("denying non-fast-forward %s"
 				 " (you should pull first)", name);
 			ret = "non-fast-forward";
diff --git a/commit-reach.c b/commit-reach.c
index 5c1b5256598..5ff71d72d51 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -464,11 +464,13 @@ int repo_is_descendant_of(struct repository *r,
 	} else {
 		while (with_commit) {
 			struct commit *other;
+			int ret;
 
 			other = with_commit->item;
 			with_commit = with_commit->next;
-			if (repo_in_merge_bases_many(r, other, 1, &commit, 0))
-				return 1;
+			ret = repo_in_merge_bases_many(r, other, 1, &commit, 0);
+			if (ret)
+				return ret;
 		}
 		return 0;
 	}
@@ -598,6 +600,8 @@ int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid)
 	commit_list_insert(old_commit, &old_commit_list);
 	ret = repo_is_descendant_of(the_repository,
 				    new_commit, old_commit_list);
+	if (ret < 0)
+		exit(128);
 	free_commit_list(old_commit_list);
 	return ret;
 }
diff --git a/http-push.c b/http-push.c
index a704f490fdb..24c16a4f5ff 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1576,8 +1576,11 @@ static int verify_merge_base(struct object_id *head_oid, struct ref *remote)
 	struct commit *head = lookup_commit_or_die(head_oid, "HEAD");
 	struct commit *branch = lookup_commit_or_die(&remote->old_oid,
 						     remote->name);
+	int ret = repo_in_merge_bases(the_repository, branch, head);
 
-	return repo_in_merge_bases(the_repository, branch, head);
+	if (ret < 0)
+		exit(128);
+	return ret;
 }
 
 static int delete_remote_branch(const char *pattern, int force)
diff --git a/merge-ort.c b/merge-ort.c
index 6491070d965..9f3af46333a 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -544,6 +544,7 @@ enum conflict_and_info_types {
 	CONFLICT_SUBMODULE_HISTORY_NOT_AVAILABLE,
 	CONFLICT_SUBMODULE_MAY_HAVE_REWINDS,
 	CONFLICT_SUBMODULE_NULL_MERGE_BASE,
+	CONFLICT_SUBMODULE_CORRUPT,
 
 	/* Keep this entry _last_ in the list */
 	NB_CONFLICT_TYPES,
@@ -596,7 +597,9 @@ static const char *type_short_descriptions[] = {
 	[CONFLICT_SUBMODULE_MAY_HAVE_REWINDS] =
 		"CONFLICT (submodule may have rewinds)",
 	[CONFLICT_SUBMODULE_NULL_MERGE_BASE] =
-		"CONFLICT (submodule lacks merge base)"
+		"CONFLICT (submodule lacks merge base)",
+	[CONFLICT_SUBMODULE_CORRUPT] =
+		"CONFLICT (submodule corrupt)"
 };
 
 struct logical_conflict_info {
@@ -1710,7 +1713,14 @@ static int find_first_merges(struct repository *repo,
 		die("revision walk setup failed");
 	while ((commit = get_revision(&revs)) != NULL) {
 		struct object *o = &(commit->object);
-		if (repo_in_merge_bases(repo, b, commit))
+		int ret = repo_in_merge_bases(repo, b, commit);
+
+		if (ret < 0) {
+			object_array_clear(&merges);
+			release_revisions(&revs);
+			return ret;
+		}
+		if (ret > 0)
 			add_object_array(o, NULL, &merges);
 	}
 	reset_revision_walk();
@@ -1725,9 +1735,17 @@ static int find_first_merges(struct repository *repo,
 		contains_another = 0;
 		for (j = 0; j < merges.nr; j++) {
 			struct commit *m2 = (struct commit *) merges.objects[j].item;
-			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
-				contains_another = 1;
-				break;
+			if (i != j) {
+				int ret = repo_in_merge_bases(repo, m2, m1);
+				if (ret < 0) {
+					object_array_clear(&merges);
+					release_revisions(&revs);
+					return ret;
+				}
+				if (ret > 0) {
+					contains_another = 1;
+					break;
+				}
 			}
 		}
 
@@ -1749,7 +1767,7 @@ static int merge_submodule(struct merge_options *opt,
 {
 	struct repository subrepo;
 	struct strbuf sb = STRBUF_INIT;
-	int ret = 0;
+	int ret = 0, ret2;
 	struct commit *commit_o, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
@@ -1796,8 +1814,26 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* check whether both changes are forward */
-	if (!repo_in_merge_bases(&subrepo, commit_o, commit_a) ||
-	    !repo_in_merge_bases(&subrepo, commit_o, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_a);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0)
+		ret2 = repo_in_merge_bases(&subrepo, commit_o, commit_b);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (!ret2) {
 		path_msg(opt, CONFLICT_SUBMODULE_MAY_HAVE_REWINDS, 0,
 			 path, NULL, NULL, NULL,
 			 _("Failed to merge submodule %s "
@@ -1807,7 +1843,16 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* Case #1: a is contained in b or vice versa */
-	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0) {
 		oidcpy(result, b);
 		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
 			 path, NULL, NULL, NULL,
@@ -1816,7 +1861,16 @@ static int merge_submodule(struct merge_options *opt,
 		ret = 1;
 		goto cleanup;
 	}
-	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
+	if (ret2 < 0) {
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		goto cleanup;
+	}
+	if (ret2 > 0) {
 		oidcpy(result, a);
 		path_msg(opt, INFO_SUBMODULE_FAST_FORWARDING, 1,
 			 path, NULL, NULL, NULL,
@@ -1841,6 +1895,13 @@ static int merge_submodule(struct merge_options *opt,
 	parent_count = find_first_merges(&subrepo, path, commit_a, commit_b,
 					 &merges);
 	switch (parent_count) {
+	case -1:
+		path_msg(opt, CONFLICT_SUBMODULE_CORRUPT, 0,
+			 path, NULL, NULL, NULL,
+			 _("Failed to merge submodule %s "
+			   "(repository corrupt)"),
+			 path);
+		break;
 	case 0:
 		path_msg(opt, CONFLICT_SUBMODULE_FAILED_TO_MERGE, 0,
 			 path, NULL, NULL, NULL,
diff --git a/merge-recursive.c b/merge-recursive.c
index e3beb0801b1..0d931cc14ad 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1144,7 +1144,13 @@ static int find_first_merges(struct repository *repo,
 		die("revision walk setup failed");
 	while ((commit = get_revision(&revs)) != NULL) {
 		struct object *o = &(commit->object);
-		if (repo_in_merge_bases(repo, b, commit))
+		int ret = repo_in_merge_bases(repo, b, commit);
+		if (ret < 0) {
+			object_array_clear(&merges);
+			release_revisions(&revs);
+			return ret;
+		}
+		if (ret)
 			add_object_array(o, NULL, &merges);
 	}
 	reset_revision_walk();
@@ -1159,9 +1165,17 @@ static int find_first_merges(struct repository *repo,
 		contains_another = 0;
 		for (j = 0; j < merges.nr; j++) {
 			struct commit *m2 = (struct commit *) merges.objects[j].item;
-			if (i != j && repo_in_merge_bases(repo, m2, m1)) {
-				contains_another = 1;
-				break;
+			if (i != j) {
+				int ret = repo_in_merge_bases(repo, m2, m1);
+				if (ret < 0) {
+					object_array_clear(&merges);
+					release_revisions(&revs);
+					return ret;
+				}
+				if (ret > 0) {
+					contains_another = 1;
+					break;
+				}
 			}
 		}
 
@@ -1197,7 +1211,7 @@ static int merge_submodule(struct merge_options *opt,
 			   const struct object_id *b)
 {
 	struct repository subrepo;
-	int ret = 0;
+	int ret = 0, ret2;
 	struct commit *commit_base, *commit_a, *commit_b;
 	int parent_count;
 	struct object_array merges;
@@ -1234,14 +1248,29 @@ static int merge_submodule(struct merge_options *opt,
 	}
 
 	/* check whether both changes are forward */
-	if (!repo_in_merge_bases(&subrepo, commit_base, commit_a) ||
-	    !repo_in_merge_bases(&subrepo, commit_base, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_a);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2 > 0)
+		ret2 = repo_in_merge_bases(&subrepo, commit_base, commit_b);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (!ret2) {
 		output(opt, 1, _("Failed to merge submodule %s (commits don't follow merge-base)"), path);
 		goto cleanup;
 	}
 
 	/* Case #1: a is contained in b or vice versa */
-	if (repo_in_merge_bases(&subrepo, commit_a, commit_b)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_a, commit_b);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2) {
 		oidcpy(result, b);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
@@ -1254,7 +1283,12 @@ static int merge_submodule(struct merge_options *opt,
 		ret = 1;
 		goto cleanup;
 	}
-	if (repo_in_merge_bases(&subrepo, commit_b, commit_a)) {
+	ret2 = repo_in_merge_bases(&subrepo, commit_b, commit_a);
+	if (ret2 < 0) {
+		output(opt, 1, _("Failed to merge submodule %s (repository corrupt)"), path);
+		goto cleanup;
+	}
+	if (ret2) {
 		oidcpy(result, a);
 		if (show(opt, 3)) {
 			output(opt, 3, _("Fast-forwarding submodule %s to the following commit:"), path);
@@ -1402,6 +1436,8 @@ static int merge_mode_and_contents(struct merge_options *opt,
 							&o->oid,
 							&a->oid,
 							&b->oid);
+			if (result->clean < 0)
+				return -1;
 		} else if (S_ISLNK(a->mode)) {
 			switch (opt->recursive_variant) {
 			case MERGE_VARIANT_NORMAL:
diff --git a/shallow.c b/shallow.c
index dfcc1f86a7f..f71496f35c3 100644
--- a/shallow.c
+++ b/shallow.c
@@ -795,12 +795,16 @@ static void post_assign_shallow(struct shallow_info *info,
 		if (!*bitmap)
 			continue;
 		for (j = 0; j < bitmap_nr; j++)
-			if (bitmap[0][j] &&
-			    /* Step 7, reachability test at commit level */
-			    !repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1)) {
-				update_refstatus(ref_status, info->ref->nr, *bitmap);
-				dst++;
-				break;
+			if (bitmap[0][j]) {
+				/* Step 7, reachability test at commit level */
+				int ret = repo_in_merge_bases_many(the_repository, c, ca.nr, ca.commits, 1);
+				if (ret < 0)
+					exit(128);
+				if (!ret) {
+					update_refstatus(ref_status, info->ref->nr, *bitmap);
+					dst++;
+					break;
+				}
 			}
 	}
 	info->nr_ours = dst;
@@ -830,6 +834,8 @@ int delayed_reachability_test(struct shallow_info *si, int c)
 							    si->nr_commits,
 							    si->commits,
 							    1);
+		if (si->reachable[c] < 0)
+			exit(128);
 		si->need_reachability_test[c] = 0;
 	}
 	return si->reachable[c];
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                         ` (2 preceding siblings ...)
  2024-02-28  9:44       ` [PATCH v4 03/11] commit-reach(repo_in_merge_bases_many): report " Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28 20:24         ` Junio C Hamano
  2024-02-28  9:44       ` [PATCH v4 05/11] commit-reach(paint_down_to_common): start reporting errors Johannes Schindelin via GitGitGadget
                         ` (6 subsequent siblings)
  10 siblings, 1 reply; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

When `git fetch --update-shallow` needs to test for commit ancestry, it
can naturally run into a missing object (e.g. if it is a parent of a
shallow commit). For the purpose of `--update-shallow`, this needs to be
treated as if the child commit did not even have that parent, i.e. the
commit history needs to be clamped.

For all other scenarios, clamping the commit history is actually a bug,
as it would hide repository corruption (for an analysis regarding
shallow and partial clones, see the analysis further down).

Add a flag to optionally ask the function to ignore missing commits, as
`--update-shallow` needs it to, while detecting missing objects as a
repository corruption error by default.

This flag is needed, and cannot replaced by `is_repository_shallow()` to
indicate that situation, because that function would return 0 in the
`--update-shallow` scenario: There is not actually a `shallow` file in
that scenario, as demonstrated e.g. by t5537.10 ("add new shallow root
with receive.updateshallow on") and t5538.4 ("add new shallow root with
receive.updateshallow on").

Note: shallow commits' parents are set to `NULL` internally already,
therefore there is no need to special-case shallow repositories here, as
the merge-base logic will not try to access parent commits of shallow
commits.

Likewise, partial clones aren't an issue either: If a commit is missing
during the revision walk in the merge-base logic, it is fetched via
`promisor_remote_get_direct()`. And not only the single missing commit
object: Due to the way the "promised" objects are fetched (in
`fetch_objects()` in `promisor-remote.c`, using `fetch
--filter=blob:none`), there is no actual way to fetch a single commit
object, as the remote side will pass that commit OID to `pack-objects
--revs [...]` which in turn passes it to `rev-list` which interprets
this as a commit _range_ instead of a single object. Therefore, in
partial clones (unless they are shallow in addition), all commits
reachable from a commit that is in the local object database are also
present in that local database.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 5ff71d72d51..7112b10eeea 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -53,7 +53,8 @@ static int queue_has_nonstale(struct prio_queue *queue)
 static struct commit_list *paint_down_to_common(struct repository *r,
 						struct commit *one, int n,
 						struct commit **twos,
-						timestamp_t min_generation)
+						timestamp_t min_generation,
+						int ignore_missing_commits)
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct commit_list *result = NULL;
@@ -108,6 +109,13 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 			if (repo_parse_commit(r, p)) {
 				clear_prio_queue(&queue);
 				free_commit_list(result);
+				/*
+				 * At this stage, we know that the commit is
+				 * missing: `repo_parse_commit()` uses
+				 * `OBJECT_INFO_DIE_IF_CORRUPT` and therefore
+				 * corrupt commits would already have been
+				 * dispatched with a `die()`.
+				 */
 				return NULL;
 			}
 			p->object.flags |= flags;
@@ -143,7 +151,7 @@ static struct commit_list *merge_bases_many(struct repository *r,
 			return NULL;
 	}
 
-	list = paint_down_to_common(r, one, n, twos, 0);
+	list = paint_down_to_common(r, one, n, twos, 0, 0);
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -214,7 +222,7 @@ static int remove_redundant_no_gen(struct repository *r,
 				min_generation = curr_generation;
 		}
 		common = paint_down_to_common(r, array[i], filled,
-					      work, min_generation);
+					      work, min_generation, 0);
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -504,7 +512,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 
 	bases = paint_down_to_common(r, commit,
 				     nr_reference, reference,
-				     generation);
+				     generation, ignore_missing_commits);
 	if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 05/11] commit-reach(paint_down_to_common): start reporting errors
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                         ` (3 preceding siblings ...)
  2024-02-28  9:44       ` [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 06/11] commit-reach(merge_bases_many): pass on "missing commits" errors Johannes Schindelin via GitGitGadget
                         ` (5 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

If a commit cannot be parsed, it is currently ignored when looking for
merge bases. That's undesirable as the operation can pretend success in
a corrupt repository, even though the command should fail with an error
message.

Let's start at the bottom of the stack by teaching the
`paint_down_to_common()` function to return an `int`: if negative, it
indicates fatal error, if 0 success.

This requires a couple of callers to be adjusted accordingly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 66 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 21 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 7112b10eeea..9ad5f9db4f7 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -50,14 +50,14 @@ static int queue_has_nonstale(struct prio_queue *queue)
 }
 
 /* all input commits in one and twos[] must have been parsed! */
-static struct commit_list *paint_down_to_common(struct repository *r,
-						struct commit *one, int n,
-						struct commit **twos,
-						timestamp_t min_generation,
-						int ignore_missing_commits)
+static int paint_down_to_common(struct repository *r,
+				struct commit *one, int n,
+				struct commit **twos,
+				timestamp_t min_generation,
+				int ignore_missing_commits,
+				struct commit_list **result)
 {
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
-	struct commit_list *result = NULL;
 	int i;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
 
@@ -66,8 +66,8 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 
 	one->object.flags |= PARENT1;
 	if (!n) {
-		commit_list_append(one, &result);
-		return result;
+		commit_list_append(one, result);
+		return 0;
 	}
 	prio_queue_put(&queue, one);
 
@@ -95,7 +95,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 		if (flags == (PARENT1 | PARENT2)) {
 			if (!(commit->object.flags & RESULT)) {
 				commit->object.flags |= RESULT;
-				commit_list_insert_by_date(commit, &result);
+				commit_list_insert_by_date(commit, result);
 			}
 			/* Mark parents of a found merge stale */
 			flags |= STALE;
@@ -108,7 +108,8 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 				continue;
 			if (repo_parse_commit(r, p)) {
 				clear_prio_queue(&queue);
-				free_commit_list(result);
+				free_commit_list(*result);
+				*result = NULL;
 				/*
 				 * At this stage, we know that the commit is
 				 * missing: `repo_parse_commit()` uses
@@ -116,7 +117,10 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 				 * corrupt commits would already have been
 				 * dispatched with a `die()`.
 				 */
-				return NULL;
+				if (ignore_missing_commits)
+					return 0;
+				return error(_("could not parse commit %s"),
+					     oid_to_hex(&p->object.oid));
 			}
 			p->object.flags |= flags;
 			prio_queue_put(&queue, p);
@@ -124,7 +128,7 @@ static struct commit_list *paint_down_to_common(struct repository *r,
 	}
 
 	clear_prio_queue(&queue);
-	return result;
+	return 0;
 }
 
 static struct commit_list *merge_bases_many(struct repository *r,
@@ -151,7 +155,10 @@ static struct commit_list *merge_bases_many(struct repository *r,
 			return NULL;
 	}
 
-	list = paint_down_to_common(r, one, n, twos, 0, 0);
+	if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
+		free_commit_list(list);
+		return NULL;
+	}
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
@@ -205,7 +212,7 @@ static int remove_redundant_no_gen(struct repository *r,
 	for (i = 0; i < cnt; i++)
 		repo_parse_commit(r, array[i]);
 	for (i = 0; i < cnt; i++) {
-		struct commit_list *common;
+		struct commit_list *common = NULL;
 		timestamp_t min_generation = commit_graph_generation(array[i]);
 
 		if (redundant[i])
@@ -221,8 +228,16 @@ static int remove_redundant_no_gen(struct repository *r,
 			if (curr_generation < min_generation)
 				min_generation = curr_generation;
 		}
-		common = paint_down_to_common(r, array[i], filled,
-					      work, min_generation, 0);
+		if (paint_down_to_common(r, array[i], filled,
+					 work, min_generation, 0, &common)) {
+			clear_commit_marks(array[i], all_flags);
+			clear_commit_marks_many(filled, work, all_flags);
+			free_commit_list(common);
+			free(work);
+			free(redundant);
+			free(filled_index);
+			return -1;
+		}
 		if (array[i]->object.flags & PARENT2)
 			redundant[i] = 1;
 		for (j = 0; j < filled; j++)
@@ -422,6 +437,10 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	clear_commit_marks_many(n, twos, all_flags);
 
 	cnt = remove_redundant(r, rslt, cnt);
+	if (cnt < 0) {
+		free(rslt);
+		return NULL;
+	}
 	result = NULL;
 	for (i = 0; i < cnt; i++)
 		commit_list_insert_by_date(rslt[i], &result);
@@ -491,7 +510,7 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 			     int nr_reference, struct commit **reference,
 			     int ignore_missing_commits)
 {
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	int ret = 0, i;
 	timestamp_t generation, max_generation = GENERATION_NUMBER_ZERO;
 
@@ -510,10 +529,11 @@ int repo_in_merge_bases_many(struct repository *r, struct commit *commit,
 	if (generation > max_generation)
 		return ret;
 
-	bases = paint_down_to_common(r, commit,
-				     nr_reference, reference,
-				     generation, ignore_missing_commits);
-	if (commit->object.flags & PARENT2)
+	if (paint_down_to_common(r, commit,
+				 nr_reference, reference,
+				 generation, ignore_missing_commits, &bases))
+		ret = -1;
+	else if (commit->object.flags & PARENT2)
 		ret = 1;
 	clear_commit_marks(commit, all_flags);
 	clear_commit_marks_many(nr_reference, reference, all_flags);
@@ -566,6 +586,10 @@ struct commit_list *reduce_heads(struct commit_list *heads)
 		}
 	}
 	num_head = remove_redundant(the_repository, array, num_head);
+	if (num_head < 0) {
+		free(array);
+		return NULL;
+	}
 	for (i = 0; i < num_head; i++)
 		tail = &commit_list_insert(array[i], tail)->next;
 	free(array);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 06/11] commit-reach(merge_bases_many): pass on "missing commits" errors
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                         ` (4 preceding siblings ...)
  2024-02-28  9:44       ` [PATCH v4 05/11] commit-reach(paint_down_to_common): start reporting errors Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 07/11] commit-reach(get_merge_bases_many_0): " Johannes Schindelin via GitGitGadget
                         ` (4 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `paint_down_to_common()` function was just taught to indicate
parsing errors, and now the `merge_bases_many()` function is aware of
that, too.

One tricky aspect is that `merge_bases_many()` parses commits of its
own, but wants to gracefully handle the scenario where NULL is passed as
a merge head, returning the empty list of merge bases. The way this was
handled involved calling `repo_parse_commit(NULL)` and relying on it to
return an error. This has to be done differently now so that we can
handle missing commits correctly by producing a fatal error.

Next step: adjust the caller of `merge_bases_many()`:
`get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 9ad5f9db4f7..ddfec5289dd 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -131,41 +131,49 @@ static int paint_down_to_common(struct repository *r,
 	return 0;
 }
 
-static struct commit_list *merge_bases_many(struct repository *r,
-					    struct commit *one, int n,
-					    struct commit **twos)
+static int merge_bases_many(struct repository *r,
+			    struct commit *one, int n,
+			    struct commit **twos,
+			    struct commit_list **result)
 {
 	struct commit_list *list = NULL;
-	struct commit_list *result = NULL;
 	int i;
 
 	for (i = 0; i < n; i++) {
-		if (one == twos[i])
+		if (one == twos[i]) {
 			/*
 			 * We do not mark this even with RESULT so we do not
 			 * have to clean it up.
 			 */
-			return commit_list_insert(one, &result);
+			*result = commit_list_insert(one, result);
+			return 0;
+		}
 	}
 
+	if (!one)
+		return 0;
 	if (repo_parse_commit(r, one))
-		return NULL;
+		return error(_("could not parse commit %s"),
+			     oid_to_hex(&one->object.oid));
 	for (i = 0; i < n; i++) {
+		if (!twos[i])
+			return 0;
 		if (repo_parse_commit(r, twos[i]))
-			return NULL;
+			return error(_("could not parse commit %s"),
+				     oid_to_hex(&twos[i]->object.oid));
 	}
 
 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
 		free_commit_list(list);
-		return NULL;
+		return -1;
 	}
 
 	while (list) {
 		struct commit *commit = pop_commit(&list);
 		if (!(commit->object.flags & STALE))
-			commit_list_insert_by_date(commit, &result);
+			commit_list_insert_by_date(commit, result);
 	}
-	return result;
+	return 0;
 }
 
 struct commit_list *get_octopus_merge_bases(struct commit_list *in)
@@ -410,10 +418,11 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 {
 	struct commit_list *list;
 	struct commit **rslt;
-	struct commit_list *result;
+	struct commit_list *result = NULL;
 	int cnt, i;
 
-	result = merge_bases_many(r, one, n, twos);
+	if (merge_bases_many(r, one, n, twos, &result) < 0)
+		return NULL;
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
 			return result;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 07/11] commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                         ` (5 preceding siblings ...)
  2024-02-28  9:44       ` [PATCH v4 06/11] commit-reach(merge_bases_many): pass on "missing commits" errors Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 08/11] commit-reach(repo_get_merge_bases): " Johannes Schindelin via GitGitGadget
                         ` (3 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate
parsing errors, and now the `get_merge_bases_many_0()` function is aware
of that, too.

Next step: adjust the callers of `get_merge_bases_many_0()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 commit-reach.c | 57 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index ddfec5289dd..c0123eb7207 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -410,37 +410,38 @@ static int remove_redundant(struct repository *r, struct commit **array, int cnt
 	return remove_redundant_no_gen(r, array, cnt);
 }
 
-static struct commit_list *get_merge_bases_many_0(struct repository *r,
-						  struct commit *one,
-						  int n,
-						  struct commit **twos,
-						  int cleanup)
+static int get_merge_bases_many_0(struct repository *r,
+				  struct commit *one,
+				  int n,
+				  struct commit **twos,
+				  int cleanup,
+				  struct commit_list **result)
 {
 	struct commit_list *list;
 	struct commit **rslt;
-	struct commit_list *result = NULL;
 	int cnt, i;
 
-	if (merge_bases_many(r, one, n, twos, &result) < 0)
-		return NULL;
+	if (merge_bases_many(r, one, n, twos, result) < 0)
+		return -1;
 	for (i = 0; i < n; i++) {
 		if (one == twos[i])
-			return result;
+			return 0;
 	}
-	if (!result || !result->next) {
+	if (!*result || !(*result)->next) {
 		if (cleanup) {
 			clear_commit_marks(one, all_flags);
 			clear_commit_marks_many(n, twos, all_flags);
 		}
-		return result;
+		return 0;
 	}
 
 	/* There are more than one */
-	cnt = commit_list_count(result);
+	cnt = commit_list_count(*result);
 	CALLOC_ARRAY(rslt, cnt);
-	for (list = result, i = 0; list; list = list->next)
+	for (list = *result, i = 0; list; list = list->next)
 		rslt[i++] = list->item;
-	free_commit_list(result);
+	free_commit_list(*result);
+	*result = NULL;
 
 	clear_commit_marks(one, all_flags);
 	clear_commit_marks_many(n, twos, all_flags);
@@ -448,13 +449,12 @@ static struct commit_list *get_merge_bases_many_0(struct repository *r,
 	cnt = remove_redundant(r, rslt, cnt);
 	if (cnt < 0) {
 		free(rslt);
-		return NULL;
+		return -1;
 	}
-	result = NULL;
 	for (i = 0; i < cnt; i++)
-		commit_list_insert_by_date(rslt[i], &result);
+		commit_list_insert_by_date(rslt[i], result);
 	free(rslt);
-	return result;
+	return 0;
 }
 
 struct commit_list *repo_get_merge_bases_many(struct repository *r,
@@ -462,7 +462,12 @@ struct commit_list *repo_get_merge_bases_many(struct repository *r,
 					      int n,
 					      struct commit **twos)
 {
-	return get_merge_bases_many_0(r, one, n, twos, 1);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, n, twos, 1, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
@@ -470,14 +475,24 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    int n,
 						    struct commit **twos)
 {
-	return get_merge_bases_many_0(r, one, n, twos, 0);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, n, twos, 0, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 struct commit_list *repo_get_merge_bases(struct repository *r,
 					 struct commit *one,
 					 struct commit *two)
 {
-	return get_merge_bases_many_0(r, one, 1, &two, 1);
+	struct commit_list *result = NULL;
+	if (get_merge_bases_many_0(r, one, 1, &two, 1, &result) < 0) {
+		free_commit_list(result);
+		return NULL;
+	}
+	return result;
 }
 
 /*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 08/11] commit-reach(repo_get_merge_bases): pass on "missing commits" errors
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                         ` (6 preceding siblings ...)
  2024-02-28  9:44       ` [PATCH v4 07/11] commit-reach(get_merge_bases_many_0): " Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 09/11] commit-reach(get_octopus_merge_bases): " Johannes Schindelin via GitGitGadget
                         ` (2 subsequent siblings)
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `repo_get_merge_bases()` macro) is aware of that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next step: adjust the callers of `get_octopus_merge_bases()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/log.c                    | 10 +++++-----
 builtin/merge-tree.c             |  5 +++--
 builtin/merge.c                  | 20 ++++++++++++--------
 builtin/rebase.c                 |  8 +++++---
 builtin/rev-parse.c              |  5 +++--
 commit-reach.c                   | 23 +++++++++++------------
 commit-reach.h                   |  7 ++++---
 diff-lib.c                       |  5 +++--
 log-tree.c                       |  5 +++--
 merge-ort.c                      |  6 +++++-
 merge-recursive.c                |  4 +++-
 notes-merge.c                    |  3 ++-
 object-name.c                    |  7 +++++--
 revision.c                       | 12 ++++++++----
 sequencer.c                      |  8 ++++++--
 submodule.c                      |  7 ++++++-
 t/t4301-merge-tree-write-tree.sh | 12 ++++++++++++
 17 files changed, 96 insertions(+), 51 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 1705da71aca..befafd6ae04 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1702,11 +1702,11 @@ static struct commit *get_base_commit(const char *base_commit,
 	 */
 	while (rev_nr > 1) {
 		for (i = 0; i < rev_nr / 2; i++) {
-			struct commit_list *merge_base;
-			merge_base = repo_get_merge_bases(the_repository,
-							  rev[2 * i],
-							  rev[2 * i + 1]);
-			if (!merge_base || merge_base->next) {
+			struct commit_list *merge_base = NULL;
+			if (repo_get_merge_bases(the_repository,
+						 rev[2 * i],
+						 rev[2 * i + 1], &merge_base) < 0 ||
+			    !merge_base || merge_base->next) {
 				if (die_on_failure) {
 					die(_("failed to find exact merge base"));
 				} else {
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index a35e0452d66..76200250629 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -463,8 +463,9 @@ static int real_merge(struct merge_tree_options *o,
 		 * Get the merge bases, in reverse order; see comment above
 		 * merge_incore_recursive in merge-ort.h
 		 */
-		merge_bases = repo_get_merge_bases(the_repository, parent1,
-						   parent2);
+		if (repo_get_merge_bases(the_repository, parent1,
+					 parent2, &merge_bases) < 0)
+			exit(128);
 		if (!merge_bases && !o->allow_unrelated_histories)
 			die(_("refusing to merge unrelated histories"));
 		merge_bases = reverse_commit_list(merge_bases);
diff --git a/builtin/merge.c b/builtin/merge.c
index d748d46e135..ac9d58adc29 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1517,10 +1517,13 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 
 	if (!remoteheads)
 		; /* already up-to-date */
-	else if (!remoteheads->next)
-		common = repo_get_merge_bases(the_repository, head_commit,
-					      remoteheads->item);
-	else {
+	else if (!remoteheads->next) {
+		if (repo_get_merge_bases(the_repository, head_commit,
+					 remoteheads->item, &common) < 0) {
+			ret = 2;
+			goto done;
+		}
+	} else {
 		struct commit_list *list = remoteheads;
 		commit_list_insert(head_commit, &list);
 		common = get_octopus_merge_bases(list);
@@ -1631,7 +1634,7 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 		struct commit_list *j;
 
 		for (j = remoteheads; j; j = j->next) {
-			struct commit_list *common_one;
+			struct commit_list *common_one = NULL;
 			struct commit *common_item;
 
 			/*
@@ -1639,9 +1642,10 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 			 * merge_bases again, otherwise "git merge HEAD^
 			 * HEAD^^" would be missed.
 			 */
-			common_one = repo_get_merge_bases(the_repository,
-							  head_commit,
-							  j->item);
+			if (repo_get_merge_bases(the_repository, head_commit,
+						 j->item, &common_one) < 0)
+				exit(128);
+
 			common_item = common_one->item;
 			free_commit_list(common_one);
 			if (!oideq(&common_item->object.oid, &j->item->object.oid)) {
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 043c65dccd9..06a55fc7325 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -879,7 +879,8 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
 	if (!upstream)
 		goto done;
 
-	merge_bases = repo_get_merge_bases(the_repository, upstream, head);
+	if (repo_get_merge_bases(the_repository, upstream, head, &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases || merge_bases->next)
 		goto done;
 
@@ -898,8 +899,9 @@ static void fill_branch_base(struct rebase_options *options,
 {
 	struct commit_list *merge_bases = NULL;
 
-	merge_bases = repo_get_merge_bases(the_repository, options->onto,
-					   options->orig_head);
+	if (repo_get_merge_bases(the_repository, options->onto,
+				 options->orig_head, &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases || merge_bases->next)
 		oidcpy(branch_base, null_oid());
 	else
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index fde8861ca4e..c97d0f6144c 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -297,7 +297,7 @@ static int try_difference(const char *arg)
 		show_rev(NORMAL, &end_oid, end);
 		show_rev(symmetric ? NORMAL : REVERSED, &start_oid, start);
 		if (symmetric) {
-			struct commit_list *exclude;
+			struct commit_list *exclude = NULL;
 			struct commit *a, *b;
 			a = lookup_commit_reference(the_repository, &start_oid);
 			b = lookup_commit_reference(the_repository, &end_oid);
@@ -305,7 +305,8 @@ static int try_difference(const char *arg)
 				*dotdot = '.';
 				return 0;
 			}
-			exclude = repo_get_merge_bases(the_repository, a, b);
+			if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0)
+				exit(128);
 			while (exclude) {
 				struct commit *commit = pop_commit(&exclude);
 				show_rev(REVERSED, &commit->object.oid, NULL);
diff --git a/commit-reach.c b/commit-reach.c
index c0123eb7207..b2e90687f7c 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -189,9 +189,12 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 		struct commit_list *new_commits = NULL, *end = NULL;
 
 		for (j = ret; j; j = j->next) {
-			struct commit_list *bases;
-			bases = repo_get_merge_bases(the_repository, i->item,
-						     j->item);
+			struct commit_list *bases = NULL;
+			if (repo_get_merge_bases(the_repository, i->item,
+						 j->item, &bases) < 0) {
+				free_commit_list(bases);
+				return NULL;
+			}
 			if (!new_commits)
 				new_commits = bases;
 			else
@@ -483,16 +486,12 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 	return result;
 }
 
-struct commit_list *repo_get_merge_bases(struct repository *r,
-					 struct commit *one,
-					 struct commit *two)
+int repo_get_merge_bases(struct repository *r,
+			 struct commit *one,
+			 struct commit *two,
+			 struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, 1, &two, 1, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, 1, &two, 1, result);
 }
 
 /*
diff --git a/commit-reach.h b/commit-reach.h
index 68f81549a44..2c6fcdd34f6 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -9,9 +9,10 @@ struct ref_filter;
 struct object_id;
 struct object_array;
 
-struct commit_list *repo_get_merge_bases(struct repository *r,
-					 struct commit *rev1,
-					 struct commit *rev2);
+int repo_get_merge_bases(struct repository *r,
+			 struct commit *rev1,
+			 struct commit *rev2,
+			 struct commit_list **result);
 struct commit_list *repo_get_merge_bases_many(struct repository *r,
 					      struct commit *one, int n,
 					      struct commit **twos);
diff --git a/diff-lib.c b/diff-lib.c
index 0e9ec4f68af..498224ccce2 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -565,7 +565,7 @@ void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb)
 {
 	int i;
 	struct commit *mb_child[2] = {0};
-	struct commit_list *merge_bases;
+	struct commit_list *merge_bases = NULL;
 
 	for (i = 0; i < revs->pending.nr; i++) {
 		struct object *obj = revs->pending.objects[i].item;
@@ -592,7 +592,8 @@ void diff_get_merge_base(const struct rev_info *revs, struct object_id *mb)
 		mb_child[1] = lookup_commit_reference(the_repository, &oid);
 	}
 
-	merge_bases = repo_get_merge_bases(the_repository, mb_child[0], mb_child[1]);
+	if (repo_get_merge_bases(the_repository, mb_child[0], mb_child[1], &merge_bases) < 0)
+		exit(128);
 	if (!merge_bases)
 		die(_("no merge base found"));
 	if (merge_bases->next)
diff --git a/log-tree.c b/log-tree.c
index 504da6b519e..4f337766a39 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -1010,7 +1010,7 @@ static int do_remerge_diff(struct rev_info *opt,
 			   struct object_id *oid)
 {
 	struct merge_options o;
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	struct merge_result res = {0};
 	struct pretty_print_context ctx = {0};
 	struct commit *parent1 = parents->item;
@@ -1035,7 +1035,8 @@ static int do_remerge_diff(struct rev_info *opt,
 	/* Parse the relevant commits and get the merge bases */
 	parse_commit_or_die(parent1);
 	parse_commit_or_die(parent2);
-	bases = repo_get_merge_bases(the_repository, parent1, parent2);
+	if (repo_get_merge_bases(the_repository, parent1, parent2, &bases) < 0)
+		exit(128);
 
 	/* Re-merge the parents */
 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
diff --git a/merge-ort.c b/merge-ort.c
index 9f3af46333a..90d8495ca1f 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -5066,7 +5066,11 @@ static void merge_ort_internal(struct merge_options *opt,
 	struct strbuf merge_base_abbrev = STRBUF_INIT;
 
 	if (!merge_bases) {
-		merge_bases = repo_get_merge_bases(the_repository, h1, h2);
+		if (repo_get_merge_bases(the_repository, h1, h2,
+					 &merge_bases) < 0) {
+			result->clean = -1;
+			return;
+		}
 		/* See merge-ort.h:merge_incore_recursive() declaration NOTE */
 		merge_bases = reverse_commit_list(merge_bases);
 	}
diff --git a/merge-recursive.c b/merge-recursive.c
index 0d931cc14ad..d609373d960 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -3638,7 +3638,9 @@ static int merge_recursive_internal(struct merge_options *opt,
 	}
 
 	if (!merge_bases) {
-		merge_bases = repo_get_merge_bases(the_repository, h1, h2);
+		if (repo_get_merge_bases(the_repository, h1, h2,
+					 &merge_bases) < 0)
+			return -1;
 		merge_bases = reverse_commit_list(merge_bases);
 	}
 
diff --git a/notes-merge.c b/notes-merge.c
index 8799b522a55..51282934ae6 100644
--- a/notes-merge.c
+++ b/notes-merge.c
@@ -607,7 +607,8 @@ int notes_merge(struct notes_merge_options *o,
 	assert(local && remote);
 
 	/* Find merge bases */
-	bases = repo_get_merge_bases(the_repository, local, remote);
+	if (repo_get_merge_bases(the_repository, local, remote, &bases) < 0)
+		exit(128);
 	if (!bases) {
 		base_oid = null_oid();
 		base_tree_oid = the_hash_algo->empty_tree;
diff --git a/object-name.c b/object-name.c
index 0bfa29dbbfe..63bec6d9a2b 100644
--- a/object-name.c
+++ b/object-name.c
@@ -1481,7 +1481,7 @@ int repo_get_oid_mb(struct repository *r,
 		    struct object_id *oid)
 {
 	struct commit *one, *two;
-	struct commit_list *mbs;
+	struct commit_list *mbs = NULL;
 	struct object_id oid_tmp;
 	const char *dots;
 	int st;
@@ -1509,7 +1509,10 @@ int repo_get_oid_mb(struct repository *r,
 	two = lookup_commit_reference_gently(r, &oid_tmp, 0);
 	if (!two)
 		return -1;
-	mbs = repo_get_merge_bases(r, one, two);
+	if (repo_get_merge_bases(r, one, two, &mbs) < 0) {
+		free_commit_list(mbs);
+		return -1;
+	}
 	if (!mbs || mbs->next)
 		st = -1;
 	else {
diff --git a/revision.c b/revision.c
index 00d5c29bfce..eb0d550842f 100644
--- a/revision.c
+++ b/revision.c
@@ -1965,7 +1965,7 @@ static void add_pending_commit_list(struct rev_info *revs,
 
 static void prepare_show_merge(struct rev_info *revs)
 {
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	struct commit *head, *other;
 	struct object_id oid;
 	const char **prune = NULL;
@@ -1980,7 +1980,8 @@ static void prepare_show_merge(struct rev_info *revs)
 	other = lookup_commit_or_die(&oid, "MERGE_HEAD");
 	add_pending_object(revs, &head->object, "HEAD");
 	add_pending_object(revs, &other->object, "MERGE_HEAD");
-	bases = repo_get_merge_bases(the_repository, head, other);
+	if (repo_get_merge_bases(the_repository, head, other, &bases) < 0)
+		exit(128);
 	add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING | BOTTOM);
 	add_pending_commit_list(revs, bases, UNINTERESTING | BOTTOM);
 	free_commit_list(bases);
@@ -2068,14 +2069,17 @@ static int handle_dotdot_1(const char *arg, char *dotdot,
 	} else {
 		/* A...B -- find merge bases between the two */
 		struct commit *a, *b;
-		struct commit_list *exclude;
+		struct commit_list *exclude = NULL;
 
 		a = lookup_commit_reference(revs->repo, &a_obj->oid);
 		b = lookup_commit_reference(revs->repo, &b_obj->oid);
 		if (!a || !b)
 			return dotdot_missing(arg, dotdot, revs, symmetric);
 
-		exclude = repo_get_merge_bases(the_repository, a, b);
+		if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0) {
+			free_commit_list(exclude);
+			return -1;
+		}
 		add_rev_cmdline_list(revs, exclude, REV_CMD_MERGE_BASE,
 				     flags_exclude);
 		add_pending_commit_list(revs, exclude, flags_exclude);
diff --git a/sequencer.c b/sequencer.c
index d584cac8ed9..4417f2f1956 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3913,7 +3913,7 @@ static int do_merge(struct repository *r,
 	int run_commit_flags = 0;
 	struct strbuf ref_name = STRBUF_INIT;
 	struct commit *head_commit, *merge_commit, *i;
-	struct commit_list *bases, *j;
+	struct commit_list *bases = NULL, *j;
 	struct commit_list *to_merge = NULL, **tail = &to_merge;
 	const char *strategy = !opts->xopts.nr &&
 		(!opts->strategy ||
@@ -4139,7 +4139,11 @@ static int do_merge(struct repository *r,
 	}
 
 	merge_commit = to_merge->item;
-	bases = repo_get_merge_bases(r, head_commit, merge_commit);
+	if (repo_get_merge_bases(r, head_commit, merge_commit, &bases) < 0) {
+		ret = -1;
+		goto leave_merge;
+	}
+
 	if (bases && oideq(&merge_commit->object.oid,
 			   &bases->item->object.oid)) {
 		ret = 0;
diff --git a/submodule.c b/submodule.c
index e603a19a876..04931a5474b 100644
--- a/submodule.c
+++ b/submodule.c
@@ -595,7 +595,12 @@ static void show_submodule_header(struct diff_options *o,
 	     (!is_null_oid(two) && !*right))
 		message = "(commits not present)";
 
-	*merge_bases = repo_get_merge_bases(sub, *left, *right);
+	*merge_bases = NULL;
+	if (repo_get_merge_bases(sub, *left, *right, merge_bases) < 0) {
+		message = "(corrupt repository)";
+		goto output_header;
+	}
+
 	if (*merge_bases) {
 		if ((*merge_bases)->item == *left)
 			fast_forward = 1;
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index b2c8a43fce3..5d1e7aca4c8 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -945,4 +945,16 @@ test_expect_success 'check the input format when --stdin is passed' '
 	test_cmp expect actual
 '
 
+test_expect_success 'error out on missing commits as well' '
+	git init --bare missing-commit.git &&
+	git rev-list --objects side1 side3 >list-including-initial &&
+	grep -v ^$(git rev-parse side1^) <list-including-initial >list &&
+	git pack-objects missing-commit.git/objects/pack/missing-initial <list &&
+	side1=$(git rev-parse side1) &&
+	side3=$(git rev-parse side3) &&
+	test_must_fail git --git-dir=missing-commit.git \
+		merge-tree --allow-unrelated-histories $side1 $side3 >actual &&
+	test_must_be_empty actual
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 09/11] commit-reach(get_octopus_merge_bases): pass on "missing commits" errors
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                         ` (7 preceding siblings ...)
  2024-02-28  9:44       ` [PATCH v4 08/11] commit-reach(repo_get_merge_bases): " Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 10/11] commit-reach(repo_get_merge_bases_many): " Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 11/11] commit-reach(repo_get_merge_bases_many_dirty): pass on errors Johannes Schindelin via GitGitGadget
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases()` function (which is also
surfaced via the `get_merge_bases()` macro) is aware of that, too.

Naturally, the callers need to be adjusted now, too.

Next step: adjust `repo_get_merge_bases_many()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/merge-base.c |  8 ++++++--
 builtin/merge.c      |  6 +++++-
 builtin/pull.c       |  5 +++--
 commit-reach.c       | 20 +++++++++++---------
 commit-reach.h       |  2 +-
 5 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 0308fd73289..2edffc5487e 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -77,13 +77,17 @@ static int handle_independent(int count, const char **args)
 static int handle_octopus(int count, const char **args, int show_all)
 {
 	struct commit_list *revs = NULL;
-	struct commit_list *result, *rev;
+	struct commit_list *result = NULL, *rev;
 	int i;
 
 	for (i = count - 1; i >= 0; i--)
 		commit_list_insert(get_commit_reference(args[i]), &revs);
 
-	result = get_octopus_merge_bases(revs);
+	if (get_octopus_merge_bases(revs, &result) < 0) {
+		free_commit_list(revs);
+		free_commit_list(result);
+		return 128;
+	}
 	free_commit_list(revs);
 	reduce_heads_replace(&result);
 
diff --git a/builtin/merge.c b/builtin/merge.c
index ac9d58adc29..94c5b693972 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1526,7 +1526,11 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
 	} else {
 		struct commit_list *list = remoteheads;
 		commit_list_insert(head_commit, &list);
-		common = get_octopus_merge_bases(list);
+		if (get_octopus_merge_bases(list, &common) < 0) {
+			free(list);
+			ret = 2;
+			goto done;
+		}
 		free(list);
 	}
 
diff --git a/builtin/pull.c b/builtin/pull.c
index e6f2942c0c5..0c5a55f2f4d 100644
--- a/builtin/pull.c
+++ b/builtin/pull.c
@@ -820,7 +820,7 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 		const struct object_id *merge_head,
 		const struct object_id *fork_point)
 {
-	struct commit_list *revs = NULL, *result;
+	struct commit_list *revs = NULL, *result = NULL;
 
 	commit_list_insert(lookup_commit_reference(the_repository, curr_head),
 			   &revs);
@@ -830,7 +830,8 @@ static int get_octopus_merge_base(struct object_id *merge_base,
 		commit_list_insert(lookup_commit_reference(the_repository, fork_point),
 				   &revs);
 
-	result = get_octopus_merge_bases(revs);
+	if (get_octopus_merge_bases(revs, &result) < 0)
+		exit(128);
 	free_commit_list(revs);
 	reduce_heads_replace(&result);
 
diff --git a/commit-reach.c b/commit-reach.c
index b2e90687f7c..288d3d896a2 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -176,24 +176,26 @@ static int merge_bases_many(struct repository *r,
 	return 0;
 }
 
-struct commit_list *get_octopus_merge_bases(struct commit_list *in)
+int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result)
 {
-	struct commit_list *i, *j, *k, *ret = NULL;
+	struct commit_list *i, *j, *k;
 
 	if (!in)
-		return ret;
+		return 0;
 
-	commit_list_insert(in->item, &ret);
+	commit_list_insert(in->item, result);
 
 	for (i = in->next; i; i = i->next) {
 		struct commit_list *new_commits = NULL, *end = NULL;
 
-		for (j = ret; j; j = j->next) {
+		for (j = *result; j; j = j->next) {
 			struct commit_list *bases = NULL;
 			if (repo_get_merge_bases(the_repository, i->item,
 						 j->item, &bases) < 0) {
 				free_commit_list(bases);
-				return NULL;
+				free_commit_list(*result);
+				*result = NULL;
+				return -1;
 			}
 			if (!new_commits)
 				new_commits = bases;
@@ -202,10 +204,10 @@ struct commit_list *get_octopus_merge_bases(struct commit_list *in)
 			for (k = bases; k; k = k->next)
 				end = k;
 		}
-		free_commit_list(ret);
-		ret = new_commits;
+		free_commit_list(*result);
+		*result = new_commits;
 	}
-	return ret;
+	return 0;
 }
 
 static int remove_redundant_no_gen(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 2c6fcdd34f6..4690b6ecd0c 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -21,7 +21,7 @@ struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    struct commit *one, int n,
 						    struct commit **twos);
 
-struct commit_list *get_octopus_merge_bases(struct commit_list *in);
+int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
 
 int repo_is_descendant_of(struct repository *r,
 			  struct commit *commit,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 10/11] commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                         ` (8 preceding siblings ...)
  2024-02-28  9:44       ` [PATCH v4 09/11] commit-reach(get_octopus_merge_bases): " Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  2024-02-28  9:44       ` [PATCH v4 11/11] commit-reach(repo_get_merge_bases_many_dirty): pass on errors Johannes Schindelin via GitGitGadget
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many()` function is aware of
that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next stop: `repo_get_merge_bases_dirty()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 bisect.c              |  7 ++++---
 builtin/log.c         | 13 +++++++------
 commit-reach.c        | 16 ++++++----------
 commit-reach.h        |  7 ++++---
 commit.c              |  7 ++++---
 t/helper/test-reach.c |  9 ++++++---
 6 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/bisect.c b/bisect.c
index 1be8e0a2711..2018466d69f 100644
--- a/bisect.c
+++ b/bisect.c
@@ -851,10 +851,11 @@ static void handle_skipped_merge_base(const struct object_id *mb)
 static enum bisect_error check_merge_bases(int rev_nr, struct commit **rev, int no_checkout)
 {
 	enum bisect_error res = BISECT_OK;
-	struct commit_list *result;
+	struct commit_list *result = NULL;
 
-	result = repo_get_merge_bases_many(the_repository, rev[0], rev_nr - 1,
-					   rev + 1);
+	if (repo_get_merge_bases_many(the_repository, rev[0], rev_nr - 1,
+				      rev + 1, &result) < 0)
+		exit(128);
 
 	for (; result; result = result->next) {
 		const struct object_id *mb = &result->item->object.oid;
diff --git a/builtin/log.c b/builtin/log.c
index befafd6ae04..c75790a7cec 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1656,7 +1656,7 @@ static struct commit *get_base_commit(const char *base_commit,
 		struct branch *curr_branch = branch_get(NULL);
 		const char *upstream = branch_get_upstream(curr_branch, NULL);
 		if (upstream) {
-			struct commit_list *base_list;
+			struct commit_list *base_list = NULL;
 			struct commit *commit;
 			struct object_id oid;
 
@@ -1667,11 +1667,12 @@ static struct commit *get_base_commit(const char *base_commit,
 					return NULL;
 			}
 			commit = lookup_commit_or_die(&oid, "upstream base");
-			base_list = repo_get_merge_bases_many(the_repository,
-							      commit, total,
-							      list);
-			/* There should be one and only one merge base. */
-			if (!base_list || base_list->next) {
+			if (repo_get_merge_bases_many(the_repository,
+						      commit, total,
+						      list,
+						      &base_list) < 0 ||
+			    /* There should be one and only one merge base. */
+			    !base_list || base_list->next) {
 				if (die_on_failure) {
 					die(_("could not find exact merge base"));
 				} else {
diff --git a/commit-reach.c b/commit-reach.c
index 288d3d896a2..0aeaef25343 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -462,17 +462,13 @@ static int get_merge_bases_many_0(struct repository *r,
 	return 0;
 }
 
-struct commit_list *repo_get_merge_bases_many(struct repository *r,
-					      struct commit *one,
-					      int n,
-					      struct commit **twos)
+int repo_get_merge_bases_many(struct repository *r,
+			      struct commit *one,
+			      int n,
+			      struct commit **twos,
+			      struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, n, twos, 1, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, n, twos, 1, result);
 }
 
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 4690b6ecd0c..458043f4d58 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -13,9 +13,10 @@ int repo_get_merge_bases(struct repository *r,
 			 struct commit *rev1,
 			 struct commit *rev2,
 			 struct commit_list **result);
-struct commit_list *repo_get_merge_bases_many(struct repository *r,
-					      struct commit *one, int n,
-					      struct commit **twos);
+int repo_get_merge_bases_many(struct repository *r,
+			      struct commit *one, int n,
+			      struct commit **twos,
+			      struct commit_list **result);
 /* To be used only when object flags after this call no longer matter */
 struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
 						    struct commit *one, int n,
diff --git a/commit.c b/commit.c
index 8405d7c3fce..00add5d81c6 100644
--- a/commit.c
+++ b/commit.c
@@ -1054,7 +1054,7 @@ struct commit *get_fork_point(const char *refname, struct commit *commit)
 {
 	struct object_id oid;
 	struct rev_collect revs;
-	struct commit_list *bases;
+	struct commit_list *bases = NULL;
 	int i;
 	struct commit *ret = NULL;
 	char *full_refname;
@@ -1079,8 +1079,9 @@ struct commit *get_fork_point(const char *refname, struct commit *commit)
 	for (i = 0; i < revs.nr; i++)
 		revs.commit[i]->object.flags &= ~TMP_MARK;
 
-	bases = repo_get_merge_bases_many(the_repository, commit, revs.nr,
-					  revs.commit);
+	if (repo_get_merge_bases_many(the_repository, commit, revs.nr,
+				      revs.commit, &bases) < 0)
+		exit(128);
 
 	/*
 	 * There should be one and only one merge base, when we found
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index aa816e168ea..84ee9da8681 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -117,9 +117,12 @@ int cmd__reach(int ac, const char **av)
 	else if (!strcmp(av[1], "is_descendant_of"))
 		printf("%s(A,X):%d\n", av[1], repo_is_descendant_of(r, A, X));
 	else if (!strcmp(av[1], "get_merge_bases_many")) {
-		struct commit_list *list = repo_get_merge_bases_many(the_repository,
-								     A, X_nr,
-								     X_array);
+		struct commit_list *list = NULL;
+		if (repo_get_merge_bases_many(the_repository,
+					      A, X_nr,
+					      X_array,
+					      &list) < 0)
+			exit(128);
 		printf("%s(A,X):\n", av[1]);
 		print_sorted_commit_ids(list);
 	} else if (!strcmp(av[1], "reduce_heads")) {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v4 11/11] commit-reach(repo_get_merge_bases_many_dirty): pass on errors
  2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
                         ` (9 preceding siblings ...)
  2024-02-28  9:44       ` [PATCH v4 10/11] commit-reach(repo_get_merge_bases_many): " Johannes Schindelin via GitGitGadget
@ 2024-02-28  9:44       ` Johannes Schindelin via GitGitGadget
  10 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2024-02-28  9:44 UTC (permalink / raw)
  To: git
  Cc: Patrick Steinhardt, Dirk Gouders, Johannes Schindelin,
	Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

(Actually, this commit is only about passing on "missing commits"
errors, but adding that to the commit's title would have made it too
long.)

The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many_dirty()` function is
aware of that, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/merge-base.c |  9 ++++++---
 commit-reach.c       | 16 ++++++----------
 commit-reach.h       |  7 ++++---
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/builtin/merge-base.c b/builtin/merge-base.c
index 2edffc5487e..a8a1ca53968 100644
--- a/builtin/merge-base.c
+++ b/builtin/merge-base.c
@@ -13,10 +13,13 @@
 
 static int show_merge_base(struct commit **rev, int rev_nr, int show_all)
 {
-	struct commit_list *result, *r;
+	struct commit_list *result = NULL, *r;
 
-	result = repo_get_merge_bases_many_dirty(the_repository, rev[0],
-						 rev_nr - 1, rev + 1);
+	if (repo_get_merge_bases_many_dirty(the_repository, rev[0],
+					    rev_nr - 1, rev + 1, &result) < 0) {
+		free_commit_list(result);
+		return -1;
+	}
 
 	if (!result)
 		return 1;
diff --git a/commit-reach.c b/commit-reach.c
index 0aeaef25343..202776b29e9 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -471,17 +471,13 @@ int repo_get_merge_bases_many(struct repository *r,
 	return get_merge_bases_many_0(r, one, n, twos, 1, result);
 }
 
-struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
-						    struct commit *one,
-						    int n,
-						    struct commit **twos)
+int repo_get_merge_bases_many_dirty(struct repository *r,
+				    struct commit *one,
+				    int n,
+				    struct commit **twos,
+				    struct commit_list **result)
 {
-	struct commit_list *result = NULL;
-	if (get_merge_bases_many_0(r, one, n, twos, 0, &result) < 0) {
-		free_commit_list(result);
-		return NULL;
-	}
-	return result;
+	return get_merge_bases_many_0(r, one, n, twos, 0, result);
 }
 
 int repo_get_merge_bases(struct repository *r,
diff --git a/commit-reach.h b/commit-reach.h
index 458043f4d58..bf63cc468fd 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -18,9 +18,10 @@ int repo_get_merge_bases_many(struct repository *r,
 			      struct commit **twos,
 			      struct commit_list **result);
 /* To be used only when object flags after this call no longer matter */
-struct commit_list *repo_get_merge_bases_many_dirty(struct repository *r,
-						    struct commit *one, int n,
-						    struct commit **twos);
+int repo_get_merge_bases_many_dirty(struct repository *r,
+				    struct commit *one, int n,
+				    struct commit **twos,
+				    struct commit_list **result);
 
 int get_octopus_merge_bases(struct commit_list *in, struct commit_list **result);
 
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits
  2024-02-28  9:44       ` [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits Johannes Schindelin via GitGitGadget
@ 2024-02-28 20:24         ` Junio C Hamano
  2024-02-28 20:59           ` Dirk Gouders
  2024-02-29  9:53           ` Johannes Schindelin
  0 siblings, 2 replies; 70+ messages in thread
From: Junio C Hamano @ 2024-02-28 20:24 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Patrick Steinhardt, Dirk Gouders, Johannes Schindelin

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> When `git fetch --update-shallow` needs to test for commit ancestry, it
> can naturally run into a missing object (e.g. if it is a parent of a
> shallow commit). For the purpose of `--update-shallow`, this needs to be
> treated as if the child commit did not even have that parent, i.e. the
> commit history needs to be clamped.
>
> For all other scenarios, clamping the commit history is actually a bug,
> as it would hide repository corruption (for an analysis regarding
> shallow and partial clones, see the analysis further down).
>
> Add a flag to optionally ask the function to ignore missing commits, as
> `--update-shallow` needs it to, while detecting missing objects as a
> repository corruption error by default.
>
> This flag is needed, and cannot replaced by `is_repository_shallow()` to
> indicate that situation, because that function would return 0 in the
> `--update-shallow` scenario: There is not actually a `shallow` file in
> that scenario, as demonstrated e.g. by t5537.10 ("add new shallow root
> with receive.updateshallow on") and t5538.4 ("add new shallow root with
> receive.updateshallow on").

Nicely written.

The description above that has been totally revamped reads much much
clearer, at least to me, compared to the previous round.  

Should we declare the topic done and mark it for 'next'?

Thanks.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits
  2024-02-28 20:24         ` Junio C Hamano
@ 2024-02-28 20:59           ` Dirk Gouders
  2024-02-29  9:54             ` Johannes Schindelin
  2024-02-29  9:53           ` Johannes Schindelin
  1 sibling, 1 reply; 70+ messages in thread
From: Dirk Gouders @ 2024-02-28 20:59 UTC (permalink / raw)
  To: Junio C Hamano, Johannes Schindelin
  Cc: Johannes Schindelin via GitGitGadget, git, Patrick Steinhardt

Junio C Hamano <gitster@pobox.com> writes:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>>
>> When `git fetch --update-shallow` needs to test for commit ancestry, it
>> can naturally run into a missing object (e.g. if it is a parent of a
>> shallow commit). For the purpose of `--update-shallow`, this needs to be
>> treated as if the child commit did not even have that parent, i.e. the
>> commit history needs to be clamped.
>>
>> For all other scenarios, clamping the commit history is actually a bug,
>> as it would hide repository corruption (for an analysis regarding
>> shallow and partial clones, see the analysis further down).
>>
>> Add a flag to optionally ask the function to ignore missing commits, as
>> `--update-shallow` needs it to, while detecting missing objects as a
>> repository corruption error by default.
>>
>> This flag is needed, and cannot replaced by `is_repository_shallow()` to
>> indicate that situation, because that function would return 0 in the
>> `--update-shallow` scenario: There is not actually a `shallow` file in
>> that scenario, as demonstrated e.g. by t5537.10 ("add new shallow root
>> with receive.updateshallow on") and t5538.4 ("add new shallow root with
>> receive.updateshallow on").
>
> Nicely written.
>
> The description above that has been totally revamped reads much much
> clearer, at least to me, compared to the previous round.  
>
> Should we declare the topic done and mark it for 'next'?
>
> Thanks.

I agree that this text reads much clearer -- even to me with close to
zero experience, here.

Thank you for taking the time to rewrite the text, Johannes.

Dirk

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits
  2024-02-28 20:24         ` Junio C Hamano
  2024-02-28 20:59           ` Dirk Gouders
@ 2024-02-29  9:53           ` Johannes Schindelin
  1 sibling, 0 replies; 70+ messages in thread
From: Johannes Schindelin @ 2024-02-29  9:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Schindelin via GitGitGadget, git, Patrick Steinhardt,
	Dirk Gouders

Hi Junio,

On Wed, 28 Feb 2024, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > When `git fetch --update-shallow` needs to test for commit ancestry, it
> > can naturally run into a missing object (e.g. if it is a parent of a
> > shallow commit). For the purpose of `--update-shallow`, this needs to be
> > treated as if the child commit did not even have that parent, i.e. the
> > commit history needs to be clamped.
> >
> > For all other scenarios, clamping the commit history is actually a bug,
> > as it would hide repository corruption (for an analysis regarding
> > shallow and partial clones, see the analysis further down).
> >
> > Add a flag to optionally ask the function to ignore missing commits, as
> > `--update-shallow` needs it to, while detecting missing objects as a
> > repository corruption error by default.
> >
> > This flag is needed, and cannot replaced by `is_repository_shallow()` to

Hrmpf. I just spotted the missing "be" between "cannot" and "replaced".

Junio, would you kindly amend the commit message accordingly?

> > indicate that situation, because that function would return 0 in the
> > `--update-shallow` scenario: There is not actually a `shallow` file in
> > that scenario, as demonstrated e.g. by t5537.10 ("add new shallow root
> > with receive.updateshallow on") and t5538.4 ("add new shallow root with
> > receive.updateshallow on").
>
> Nicely written.
>
> The description above that has been totally revamped reads much much
> clearer, at least to me, compared to the previous round.

Thank you!

> Should we declare the topic done and mark it for 'next'?

After you looked over the correctness of the patches, I would be
comfortable with that.

Thanks!
Johannes

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits
  2024-02-28 20:59           ` Dirk Gouders
@ 2024-02-29  9:54             ` Johannes Schindelin
  0 siblings, 0 replies; 70+ messages in thread
From: Johannes Schindelin @ 2024-02-29  9:54 UTC (permalink / raw)
  To: Dirk Gouders
  Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git,
	Patrick Steinhardt

Hi Dirk,

On Wed, 28 Feb 2024, Dirk Gouders wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
> > "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> > writes:
> >
> >> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >>
> >> When `git fetch --update-shallow` needs to test for commit ancestry, it
> >> can naturally run into a missing object (e.g. if it is a parent of a
> >> shallow commit). For the purpose of `--update-shallow`, this needs to be
> >> treated as if the child commit did not even have that parent, i.e. the
> >> commit history needs to be clamped.
> >>
> >> For all other scenarios, clamping the commit history is actually a bug,
> >> as it would hide repository corruption (for an analysis regarding
> >> shallow and partial clones, see the analysis further down).
> >>
> >> Add a flag to optionally ask the function to ignore missing commits, as
> >> `--update-shallow` needs it to, while detecting missing objects as a
> >> repository corruption error by default.
> >>
> >> This flag is needed, and cannot replaced by `is_repository_shallow()` to
> >> indicate that situation, because that function would return 0 in the
> >> `--update-shallow` scenario: There is not actually a `shallow` file in
> >> that scenario, as demonstrated e.g. by t5537.10 ("add new shallow root
> >> with receive.updateshallow on") and t5538.4 ("add new shallow root with
> >> receive.updateshallow on").
> >
> > Nicely written.
> >
> > The description above that has been totally revamped reads much much
> > clearer, at least to me, compared to the previous round.
> >
> > Should we declare the topic done and mark it for 'next'?
> >
> > Thanks.
>
> I agree that this text reads much clearer -- even to me with close to
> zero experience, here.
>
> Thank you for taking the time to rewrite the text, Johannes.

Thank _you_ for taking the time to review the patches and help me with
improving them!

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()`
  2024-02-22 13:21   ` [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
  2024-02-24  0:33     ` Junio C Hamano
@ 2024-03-01  6:56     ` Jeff King
  1 sibling, 0 replies; 70+ messages in thread
From: Jeff King @ 2024-03-01  6:56 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Patrick Steinhardt, Johannes Schindelin

On Thu, Feb 22, 2024 at 01:21:42PM +0000, Johannes Schindelin via GitGitGadget wrote:

> @@ -1402,6 +1436,8 @@ static int merge_mode_and_contents(struct merge_options *opt,
>  							&o->oid,
>  							&a->oid,
>  							&b->oid);
> +			if (result->clean < 0)
> +				return -1;

Coverity flagged this code as NO_EFFECT. The issue is that result->clean
is an unsigned single-bit boolean, so it can never be negative. If we
expand the context a bit, it's

                        result->clean = merge_submodule(opt, &result->blob.oid,
                                                        o->path,
                                                        &o->oid,
                                                        &a->oid,
                                                        &b->oid);
                        if (result->clean < 0)
                                return -1;

So it's possible there's also a problem in the existing assignment, as a
negative return from merge_submodule() would be truncated. But from my
read of it, it will always return either 0 or 1 (if it sees errors from
repo_in_merge_bases(), for example, it will output an error message and
return 0).

So I think we'd want either:

  1. Drop this hunk, and let result->clean stay as a pure boolean.

  2. Propagate errors from merge_submodule() using -1, in which case the
     code needs to be more like:

       int ret = merge_submodule(...);
       if (ret < 0)
               return -1;
       result->clean = !!ret;

I didn't follow the series closely enough to know which would be better.
I guess alternatively it could be a tri-state that carries the error
around (it looks like it is in ort's "struct merge_result"), but that
probably means auditing all of the spots that look at "clean" to see
what they would do with a negative value.

-Peff

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 03/11] commit-reach(repo_in_merge_bases_many): report missing commits
  2024-02-28  9:44       ` [PATCH v4 03/11] commit-reach(repo_in_merge_bases_many): report " Johannes Schindelin via GitGitGadget
@ 2024-03-01  6:58         ` Jeff King
  2024-03-01 16:18           ` Junio C Hamano
  0 siblings, 1 reply; 70+ messages in thread
From: Jeff King @ 2024-03-01  6:58 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Patrick Steinhardt, Dirk Gouders, Johannes Schindelin

On Wed, Feb 28, 2024 at 09:44:09AM +0000, Johannes Schindelin via GitGitGadget wrote:

> @@ -1402,6 +1436,8 @@ static int merge_mode_and_contents(struct merge_options *opt,
>  							&o->oid,
>  							&a->oid,
>  							&b->oid);
> +			if (result->clean < 0)
> +				return -1;

Sorry, I accidentally commented on v2 of your series a moment ago,
rather than the most recent version. But this hunk was untouched between
the two, so the comment still applies:

  https://lore.kernel.org/git/20240301065647.GA2680308@coredump.intra.peff.net/

-Peff

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v4 03/11] commit-reach(repo_in_merge_bases_many): report missing commits
  2024-03-01  6:58         ` Jeff King
@ 2024-03-01 16:18           ` Junio C Hamano
  0 siblings, 0 replies; 70+ messages in thread
From: Junio C Hamano @ 2024-03-01 16:18 UTC (permalink / raw)
  To: Jeff King
  Cc: Johannes Schindelin via GitGitGadget, git, Patrick Steinhardt,
	Dirk Gouders, Johannes Schindelin

Jeff King <peff@peff.net> writes:

> On Wed, Feb 28, 2024 at 09:44:09AM +0000, Johannes Schindelin via GitGitGadget wrote:
>
>> @@ -1402,6 +1436,8 @@ static int merge_mode_and_contents(struct merge_options *opt,
>>  							&o->oid,
>>  							&a->oid,
>>  							&b->oid);
>> +			if (result->clean < 0)
>> +				return -1;
>
> Sorry, I accidentally commented on v2 of your series a moment ago,
> rather than the most recent version. But this hunk was untouched between
> the two, so the comment still applies:
>
>   https://lore.kernel.org/git/20240301065647.GA2680308@coredump.intra.peff.net/
>
> -Peff

Thanks for spotting.

The topic now is in 'next' so let's fix it incrementally while I'll
hold it there.  If all of us thought it has seen enough eyeballs and
is good enough for 'next', yet we later find there was something we
all missed, that is worth a separate explanation, e.g., "The primary
motivation behind the series is still good, but for such and such
reasons we missed this case we are fixing."


^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2024-03-01 16:18 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 01/12] paint_down_to_common: plug a memory leak Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 02/12] Let `repo_in_merge_bases()` report missing commits Johannes Schindelin via GitGitGadget
2024-02-15  9:33   ` Patrick Steinhardt
2024-02-13  8:41 ` [PATCH 03/12] Prepare `repo_in_merge_bases_many()` to optionally expect " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 04/12] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 05/12] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 06/12] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 07/12] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 08/12] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 09/12] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 10/12] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 11/12] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 12/12] paint_down_to_common(): special-case shallow/partial clones Johannes Schindelin via GitGitGadget
2024-02-13 18:37 ` [PATCH 00/12] The merge-base logic vs missing commit objects Junio C Hamano
2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-24  0:33     ` Junio C Hamano
2024-02-26  9:34       ` Johannes Schindelin
2024-02-26 20:01         ` Junio C Hamano
2024-03-01  6:56     ` Jeff King
2024-02-22 13:21   ` [PATCH v2 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 08/11] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 09/11] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 10/11] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 11/11] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
2024-02-27 14:46       ` Dirk Gouders
2024-02-27 15:00         ` Johannes Schindelin
2024-02-27 18:08           ` Junio C Hamano
2024-02-27 18:10             ` Junio C Hamano
2024-02-27 19:07             ` Dirk Gouders
2024-02-27 13:28     ` [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-27 14:56       ` Dirk Gouders
2024-02-27 15:08         ` Johannes Schindelin
2024-02-27 18:24           ` Junio C Hamano
2024-02-27 13:28     ` [PATCH v3 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-27 18:29       ` Junio C Hamano
2024-02-27 13:28     ` [PATCH v3 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 08/11] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 09/11] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 10/11] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 11/11] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44     ` [PATCH v4 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 01/11] commit-reach(paint_down_to_common): plug two memory leaks Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 02/11] commit-reach(repo_in_merge_bases_many): optionally expect missing commits Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 03/11] commit-reach(repo_in_merge_bases_many): report " Johannes Schindelin via GitGitGadget
2024-03-01  6:58         ` Jeff King
2024-03-01 16:18           ` Junio C Hamano
2024-02-28  9:44       ` [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits Johannes Schindelin via GitGitGadget
2024-02-28 20:24         ` Junio C Hamano
2024-02-28 20:59           ` Dirk Gouders
2024-02-29  9:54             ` Johannes Schindelin
2024-02-29  9:53           ` Johannes Schindelin
2024-02-28  9:44       ` [PATCH v4 05/11] commit-reach(paint_down_to_common): start reporting errors Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 06/11] commit-reach(merge_bases_many): pass on "missing commits" errors Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 07/11] commit-reach(get_merge_bases_many_0): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 08/11] commit-reach(repo_get_merge_bases): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 09/11] commit-reach(get_octopus_merge_bases): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 10/11] commit-reach(repo_get_merge_bases_many): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 11/11] commit-reach(repo_get_merge_bases_many_dirty): pass on errors Johannes Schindelin via GitGitGadget

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).