git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>, Dirk Gouders <dirk@gouders.net>,
	Johannes Schindelin <johannes.schindelin@gmx.de>
Subject: [PATCH v4 00/11] The merge-base logic vs missing commit objects
Date: Wed, 28 Feb 2024 09:44:06 +0000	[thread overview]
Message-ID: <pull.1657.v4.git.1709113457.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1657.v3.git.1709040497.gitgitgadget@gmail.com>

This patch series is in the same spirit as what I proposed in
https://lore.kernel.org/git/pull.1651.git.1707212981.gitgitgadget@gmail.com/,
where I tackled missing tree objects. This here patch series tackles missing
commit objects instead: Git's merge-base logic handles these missing commit
objects as if there had not been any commit object at all, i.e. if two
commits' merge base commit is missing, they will be treated as if they had
no common commit history at all, which is a bug. Those commit objects should
not be missing, of course, i.e. this is only a problem in corrupt
repositories.

This patch series is a bit more tricky than the "missing tree objects" one,
though: The function signature of quite a few functions need to be changed
to allow callers to discern between "missing commit object" vs "no
merge-base found".

And of course it gets even more tricky because in shallow and partial clones
we expect commit objects to be missing, and that must not be treated like an
error but the existing logic is actually desirable in those scenarios.

I am deeply sorry both about the length of this patch series as well as
having to lean so heavily on reviews on the Git mailing list.

Changes since v3:

 * Reworded some hard-to-understand paragraphs in the commit message of
   "Prepare paint_down_to_common() for handling shallow commits" (4/11).
 * Changed an inconsistent paint_down_to_common() < 0 to simply test whether
   the return value is non-zero.
 * Changed all commit messages to have proper <area>: prefixes.

Changes since v2:

 * Moved a hunk from 3/11 to 2/11 that lets the repo_in_merge_bases_many()
   function return an error if non-existing commits have been passed to it,
   unless the ignore_missing_commits parameter is non-zero.
 * The end result is tree-same.

Changes since v1:

 * Addressed a lot of memory leaks.
 * Reordered patch 2 and 3 to render the commit message's comment about
   unchanged behavior true.
 * Fixed the incorrectly converted condition in the merge_submodule()
   function.
 * The last patch ("paint_down_to_common(): special-case shallow/partial
   clones") was dropped because it is not actually necessary, and the
   explanation for that was added to the commit message of "Prepare
   paint_down_to_common() for handling shallow commits".
 * An inconsistently-named variable i was renamed to be consistent with the
   other variables called ret.

Johannes Schindelin (11):
  commit-reach(paint_down_to_common): plug two memory leaks
  commit-reach(repo_in_merge_bases_many): optionally expect missing
    commits
  commit-reach(repo_in_merge_bases_many): report missing commits
  commit-reach(paint_down_to_common): prepare for handling shallow
    commits
  commit-reach(paint_down_to_common): start reporting errors
  commit-reach(merge_bases_many): pass on "missing commits" errors
  commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
  commit-reach(repo_get_merge_bases): pass on "missing commits" errors
  commit-reach(get_octopus_merge_bases): pass on "missing commits"
    errors
  commit-reach(repo_get_merge_bases_many): pass on "missing commits"
    errors
  commit-reach(repo_get_merge_bases_many_dirty): pass on errors

 bisect.c                         |   7 +-
 builtin/branch.c                 |  12 +-
 builtin/fast-import.c            |   6 +-
 builtin/fetch.c                  |   2 +
 builtin/log.c                    |  30 +++--
 builtin/merge-base.c             |  23 +++-
 builtin/merge-tree.c             |   5 +-
 builtin/merge.c                  |  26 ++--
 builtin/pull.c                   |   9 +-
 builtin/rebase.c                 |   8 +-
 builtin/receive-pack.c           |   6 +-
 builtin/rev-parse.c              |   5 +-
 commit-reach.c                   | 209 ++++++++++++++++++++-----------
 commit-reach.h                   |  26 ++--
 commit.c                         |   7 +-
 diff-lib.c                       |   5 +-
 http-push.c                      |   5 +-
 log-tree.c                       |   5 +-
 merge-ort.c                      |  87 +++++++++++--
 merge-recursive.c                |  58 +++++++--
 notes-merge.c                    |   3 +-
 object-name.c                    |   7 +-
 remote.c                         |   2 +-
 revision.c                       |  12 +-
 sequencer.c                      |   8 +-
 shallow.c                        |  21 ++--
 submodule.c                      |   7 +-
 t/helper/test-reach.c            |  11 +-
 t/t4301-merge-tree-write-tree.sh |  12 ++
 29 files changed, 441 insertions(+), 183 deletions(-)


base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1657%2Fdscho%2Fmerge-base-and-missing-objects-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1657/dscho/merge-base-and-missing-objects-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1657

Range-diff vs v3:

  1:  6e4e409cd43 !  1:  647fa2ed5c5 paint_down_to_common: plug two memory leaks
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    paint_down_to_common: plug two memory leaks
     +    commit-reach(paint_down_to_common): plug two memory leaks
      
          When a commit is missing, we return early (currently pretending that no
          merge basis could be found in that case). At that stage, it is possible
  2:  f68d77c6123 !  2:  48e69bf7229 Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    Prepare `repo_in_merge_bases_many()` to optionally expect missing commits
     +    commit-reach(repo_in_merge_bases_many): optionally expect missing commits
      
          Currently this function treats unrelated commit histories the same way
          as commit histories with missing commit objects.
  3:  0aaa224b5db !  3:  1938b317a49 Start reporting missing commits in `repo_in_merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    Start reporting missing commits in `repo_in_merge_bases_many()`
     +    commit-reach(repo_in_merge_bases_many): report missing commits
      
          Some functions in Git's source code follow the convention that returning
          a negative value indicates a fatal error, e.g. repository corruption.
  4:  84e7fbc07e0 !  4:  837aa5a89c6 Prepare `paint_down_to_common()` for handling shallow commits
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    Prepare `paint_down_to_common()` for handling shallow commits
     +    commit-reach(paint_down_to_common): prepare for handling shallow commits
      
          When `git fetch --update-shallow` needs to test for commit ancestry, it
          can naturally run into a missing object (e.g. if it is a parent of a
     -    shallow commit). To accommodate, the merge base logic will need to be
     -    able to handle this situation gracefully.
     +    shallow commit). For the purpose of `--update-shallow`, this needs to be
     +    treated as if the child commit did not even have that parent, i.e. the
     +    commit history needs to be clamped.
      
     -    Currently, that logic pretends that a missing parent commit is
     -    equivalent to a missing parent commit, and for the purpose of
     -    `--update-shallow` that is exactly what we need it to do.
     +    For all other scenarios, clamping the commit history is actually a bug,
     +    as it would hide repository corruption (for an analysis regarding
     +    shallow and partial clones, see the analysis further down).
      
     -    Therefore, let's introduce a flag to indicate when that is precisely the
     -    logic we want.
     +    Add a flag to optionally ask the function to ignore missing commits, as
     +    `--update-shallow` needs it to, while detecting missing objects as a
     +    repository corruption error by default.
      
     -    We need a flag, and cannot rely on `is_repository_shallow()` to indicate
     -    that situation, because that function would return 0: There may not
     -    actually be a `shallow` file, as demonstrated e.g. by t5537.10 ("add new
     -    shallow root with receive.updateshallow on") and t5538.4 ("add new
     -    shallow root with receive.updateshallow on").
     +    This flag is needed, and cannot replaced by `is_repository_shallow()` to
     +    indicate that situation, because that function would return 0 in the
     +    `--update-shallow` scenario: There is not actually a `shallow` file in
     +    that scenario, as demonstrated e.g. by t5537.10 ("add new shallow root
     +    with receive.updateshallow on") and t5538.4 ("add new shallow root with
     +    receive.updateshallow on").
      
          Note: shallow commits' parents are set to `NULL` internally already,
          therefore there is no need to special-case shallow repositories here, as
  5:  85332b58c37 !  5:  b978b5d233a commit-reach: start reporting errors in `paint_down_to_common()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    commit-reach: start reporting errors in `paint_down_to_common()`
     +    commit-reach(paint_down_to_common): start reporting errors
      
          If a commit cannot be parsed, it is currently ignored when looking for
          merge bases. That's undesirable as the operation can pretend success in
     @@ commit-reach.c: static struct commit_list *merge_bases_many(struct repository *r
       	}
       
      -	list = paint_down_to_common(r, one, n, twos, 0, 0);
     -+	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
     ++	if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
      +		free_commit_list(list);
      +		return NULL;
      +	}
  6:  2ae6a54dd59 !  6:  0f1ce130ce6 merge_bases_many(): pass on errors from `paint_down_to_common()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    merge_bases_many(): pass on errors from `paint_down_to_common()`
     +    commit-reach(merge_bases_many): pass on "missing commits" errors
      
          The `paint_down_to_common()` function was just taught to indicate
          parsing errors, and now the `merge_bases_many()` function is aware of
     @@ commit-reach.c: static int paint_down_to_common(struct repository *r,
      +				     oid_to_hex(&twos[i]->object.oid));
       	}
       
     - 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list) < 0) {
     + 	if (paint_down_to_common(r, one, n, twos, 0, 0, &list)) {
       		free_commit_list(list);
      -		return NULL;
      +		return -1;
  7:  4321795102d !  7:  133b69b6a62 get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    get_merge_bases_many_0(): pass on errors from `merge_bases_many()`
     +    commit-reach(get_merge_bases_many_0): pass on "missing commits" errors
      
          The `merge_bases_many()` function was just taught to indicate
          parsing errors, and now the `get_merge_bases_many_0()` function is aware
  8:  35545c4b777 !  8:  bd52f258cd9 repo_get_merge_bases(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    repo_get_merge_bases(): pass on errors from `merge_bases_many()`
     +    commit-reach(repo_get_merge_bases): pass on "missing commits" errors
      
          The `merge_bases_many()` function was just taught to indicate parsing
          errors, and now the `repo_get_merge_bases()` function (which is also
  9:  a963058d2ba !  9:  b7ef90a57f0 get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    get_octopus_merge_bases(): pass on errors from `merge_bases_many()`
     +    commit-reach(get_octopus_merge_bases): pass on "missing commits" errors
      
          The `merge_bases_many()` function was just taught to indicate parsing
          errors, and now the `repo_get_merge_bases()` function (which is also
 10:  c3bed9c8500 ! 10:  32587b3caa7 repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    repo_get_merge_bases_many(): pass on errors from `merge_bases_many()`
     +    commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors
      
          The `merge_bases_many()` function was just taught to indicate parsing
          errors, and now the `repo_get_merge_bases_many()` function is aware of
 11:  bdbf47ae505 ! 11:  05de9f24444 repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`
     @@ Metadata
      Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
      
       ## Commit message ##
     -    repo_get_merge_bases_many_dirty(): pass on errors from `merge_bases_many()`
     +    commit-reach(repo_get_merge_bases_many_dirty): pass on errors
     +
     +    (Actually, this commit is only about passing on "missing commits"
     +    errors, but adding that to the commit's title would have made it too
     +    long.)
      
          The `merge_bases_many()` function was just taught to indicate parsing
          errors, and now the `repo_get_merge_bases_many_dirty()` function is

-- 
gitgitgadget

  parent reply	other threads:[~2024-02-28  9:44 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-13  8:41 [PATCH 00/12] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 01/12] paint_down_to_common: plug a memory leak Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 02/12] Let `repo_in_merge_bases()` report missing commits Johannes Schindelin via GitGitGadget
2024-02-15  9:33   ` Patrick Steinhardt
2024-02-13  8:41 ` [PATCH 03/12] Prepare `repo_in_merge_bases_many()` to optionally expect " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 04/12] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 05/12] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 06/12] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 07/12] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 08/12] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 09/12] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 10/12] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 11/12] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
2024-02-13  8:41 ` [PATCH 12/12] paint_down_to_common(): special-case shallow/partial clones Johannes Schindelin via GitGitGadget
2024-02-13 18:37 ` [PATCH 00/12] The merge-base logic vs missing commit objects Junio C Hamano
2024-02-22 13:21 ` [PATCH v2 00/11] " Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-24  0:33     ` Junio C Hamano
2024-02-26  9:34       ` Johannes Schindelin
2024-02-26 20:01         ` Junio C Hamano
2024-03-01  6:56     ` Jeff King
2024-02-22 13:21   ` [PATCH v2 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 08/11] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 09/11] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 10/11] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
2024-02-22 13:21   ` [PATCH v2 11/11] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
2024-02-27 13:28   ` [PATCH v3 00/11] The merge-base logic vs missing commit objects Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 01/11] paint_down_to_common: plug two memory leaks Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 02/11] Prepare `repo_in_merge_bases_many()` to optionally expect missing commits Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 03/11] Start reporting missing commits in `repo_in_merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 04/11] Prepare `paint_down_to_common()` for handling shallow commits Johannes Schindelin via GitGitGadget
2024-02-27 14:46       ` Dirk Gouders
2024-02-27 15:00         ` Johannes Schindelin
2024-02-27 18:08           ` Junio C Hamano
2024-02-27 18:10             ` Junio C Hamano
2024-02-27 19:07             ` Dirk Gouders
2024-02-27 13:28     ` [PATCH v3 05/11] commit-reach: start reporting errors in `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-27 14:56       ` Dirk Gouders
2024-02-27 15:08         ` Johannes Schindelin
2024-02-27 18:24           ` Junio C Hamano
2024-02-27 13:28     ` [PATCH v3 06/11] merge_bases_many(): pass on errors from `paint_down_to_common()` Johannes Schindelin via GitGitGadget
2024-02-27 18:29       ` Junio C Hamano
2024-02-27 13:28     ` [PATCH v3 07/11] get_merge_bases_many_0(): pass on errors from `merge_bases_many()` Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 08/11] repo_get_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 09/11] get_octopus_merge_bases(): " Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 10/11] repo_get_merge_bases_many(): " Johannes Schindelin via GitGitGadget
2024-02-27 13:28     ` [PATCH v3 11/11] repo_get_merge_bases_many_dirty(): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44     ` Johannes Schindelin via GitGitGadget [this message]
2024-02-28  9:44       ` [PATCH v4 01/11] commit-reach(paint_down_to_common): plug two memory leaks Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 02/11] commit-reach(repo_in_merge_bases_many): optionally expect missing commits Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 03/11] commit-reach(repo_in_merge_bases_many): report " Johannes Schindelin via GitGitGadget
2024-03-01  6:58         ` Jeff King
2024-03-01 16:18           ` Junio C Hamano
2024-02-28  9:44       ` [PATCH v4 04/11] commit-reach(paint_down_to_common): prepare for handling shallow commits Johannes Schindelin via GitGitGadget
2024-02-28 20:24         ` Junio C Hamano
2024-02-28 20:59           ` Dirk Gouders
2024-02-29  9:54             ` Johannes Schindelin
2024-02-29  9:53           ` Johannes Schindelin
2024-02-28  9:44       ` [PATCH v4 05/11] commit-reach(paint_down_to_common): start reporting errors Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 06/11] commit-reach(merge_bases_many): pass on "missing commits" errors Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 07/11] commit-reach(get_merge_bases_many_0): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 08/11] commit-reach(repo_get_merge_bases): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 09/11] commit-reach(get_octopus_merge_bases): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 10/11] commit-reach(repo_get_merge_bases_many): " Johannes Schindelin via GitGitGadget
2024-02-28  9:44       ` [PATCH v4 11/11] commit-reach(repo_get_merge_bases_many_dirty): pass on errors Johannes Schindelin via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1657.v4.git.1709113457.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=dirk@gouders.net \
    --cc=git@vger.kernel.org \
    --cc=johannes.schindelin@gmx.de \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).