git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Junio C Hamano <gitster@pobox.com>
Cc: Karthik Nayak <karthik.188@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH 0/3] rev-list: add support for commits in `--missing`
Date: Fri, 13 Oct 2023 10:38:49 +0200	[thread overview]
Message-ID: <ZSkCGS3JPEQ71dOF@tanuki> (raw)
In-Reply-To: <ZSjbYCXfSUtEIkAt@tanuki>

[-- Attachment #1: Type: text/plain, Size: 3996 bytes --]

On Fri, Oct 13, 2023 at 07:53:36AM +0200, Patrick Steinhardt wrote:
> On Thu, Oct 12, 2023 at 09:17:09AM -0700, Junio C Hamano wrote:
> > Patrick Steinhardt <ps@pks.im> writes:
[snip]
> > My assumption also has been that there is no point in running
> > "rev-list --missing" if we know there is no repository corruption,
> > and those who run "rev-list --missing" wants to know if the objects
> > are really available, i.e. even if commit-graph that is out of sync
> > with reality says it exists, if it is not in the object store, they
> > would want to know that.
> > 
> > If you can show me that it is not the case, then I may be pursuaded
> > why producing a result that is out of sync with reality _quickly_,
> > instead of taking time to produce a result that matches reality, is
> > a worthy "optimization" to keep.
> 
> Note that I'm not saying that it's fine to return wrong results -- this
> is of course a bug that needs to be addressed somehow. After all, things
> working correctly should always trump things working fast. But until now
> it felt more like we were going into the direction of disabling commit
> graphs without checking whether there is an alternative solution that
> allows us to get the best of both worlds, correctness and performance.
> 
> So what I'm looking for in this thread is a reason why we _can't_ have
> that, or at least can't have it without unreasonable amounts of work. We
> have helpers like `lookup_commit_in_graph()` that are designed to detect
> stale commit graphs by double-checking whether a commit that has been
> looked up via the commit graph actually exists in the repository. So I'm
> wondering whether this could help us address the issue.
> 
> If there is a good reason why all of that is not possible then I'm happy
> to carve in.

I've had a quick look at this problem so that I can solidify my own
train of thought a bit. The issue is `repo_parse_commit_internal()`,
which calls `parse_commit_in_graph()` without verifying that the object
actually exists in the object database. It's the only callsite of that
function outside of "commit-graph.c", as all other external callers
would call `lookup_commit_in_graph()` which _does_ perform the object
existence check.

So I think that the proper way to address the regression would be a
patch similar to the following:

diff --git a/commit.c b/commit.c
index b3223478bc..109e9217e3 100644
--- a/commit.c
+++ b/commit.c
@@ -572,8 +572,13 @@ int repo_parse_commit_internal(struct repository *r,
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	if (use_commit_graph && parse_commit_in_graph(r, item))
+	if (use_commit_graph && parse_commit_in_graph(r, item)) {
+		if (!has_object(r, &item->object.oid, 0))
+			return quiet_on_missing ? -1 :
+				error(_("commit %s exists in commit-graph but not in the object database"),
+				      oid_to_hex(&item->object.oid));
 		return 0;
+	}
 
 	if (oid_object_info_extended(r, &item->object.oid, &oi, flags) < 0)
 		return quiet_on_missing ? -1 :

I wouldn't be surprised if there are other edge cases where this can
lead to buggy behaviour.

Also, this issue may not necessarily stem from repository corruption. It
could for example happen that commits are getting garbage collected
without the commit graph having been updated, whatever the reason may be
for this. In that case we would happily continue to return these commits
from the commit graph even though the underlying object has since been
deleted. The repository itself is not corrupt though, we merely look at
an out-of-date commit graph. And for what it's worth, I think that we
should always gracefully handle that case or otherwise the commit graph
becomes less useful overall.

I didn't dig much deeper yet. And while the above patch fixes some of
the test failures, it doesn't fix them all. If we agree that this is the
way to go then I'd be happy to turn this into a proper patch.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-10-13  8:38 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-09 10:55 [PATCH 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-09 10:55 ` [PATCH 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-09 10:55 ` [PATCH 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-09 10:55 ` [PATCH 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-09 22:02 ` [PATCH 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-10  6:19 ` Patrick Steinhardt
2023-10-10 17:09   ` Junio C Hamano
2023-10-11 10:37     ` Karthik Nayak
2023-10-11 16:54       ` Junio C Hamano
2023-10-12 10:44         ` Karthik Nayak
2023-10-12 11:04           ` Patrick Steinhardt
2023-10-12 13:23             ` Karthik Nayak
2023-10-12 16:17             ` Junio C Hamano
2023-10-13  5:53               ` Patrick Steinhardt
2023-10-13  8:38                 ` Patrick Steinhardt [this message]
2023-10-13 12:37                   ` [PATCH] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-13 18:21                     ` Junio C Hamano
2023-10-17  6:37                       ` Patrick Steinhardt
2023-10-17 18:34                         ` Junio C Hamano
2023-10-19  6:45                           ` Patrick Steinhardt
2023-10-19  8:25                             ` Patrick Steinhardt
2023-10-19 17:16                               ` Junio C Hamano
2023-10-20 10:00                                 ` Jeff King
2023-10-20 17:35                                   ` Junio C Hamano
2023-10-23 10:15                                   ` Patrick Steinhardt
2023-10-13 17:07                   ` [PATCH 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-12 16:26           ` Junio C Hamano
2023-10-16 10:38 ` [PATCH v2 " Karthik Nayak
2023-10-16 10:38   ` [PATCH v2 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-16 10:38   ` [PATCH v2 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-16 10:38   ` [PATCH v2 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-16 16:24   ` [PATCH v2 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-16 19:01     ` Karthik Nayak
2023-10-16 20:33       ` Junio C Hamano
2023-10-19 12:10   ` [PATCH v3 " Karthik Nayak
2023-10-19 12:10     ` [PATCH v3 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-19 12:10     ` [PATCH v3 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-19 12:10     ` [PATCH v3 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-19 22:05       ` Junio C Hamano
2023-10-19 23:35         ` Junio C Hamano
2023-10-20 11:14           ` Karthik Nayak
2023-10-20 14:47             ` Karthik Nayak
2023-10-20 17:45               ` Junio C Hamano
2023-10-20 16:41           ` Junio C Hamano
2023-10-24 11:34             ` Karthik Nayak
2023-10-24 12:26     ` [PATCH v4 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-24 12:26       ` [PATCH v4 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-24 12:26       ` [PATCH v4 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-24 12:26       ` [PATCH v4 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-24 17:45         ` Junio C Hamano
2023-10-25  0:35           ` Junio C Hamano
2023-10-25  9:34           ` Karthik Nayak
2023-10-25  6:40         ` Patrick Steinhardt
2023-10-26 12:37           ` Junio C Hamano
2023-10-26 10:11       ` [PATCH v5 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-26 10:11         ` [PATCH v5 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-26 10:11         ` [PATCH v5 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-26 10:11         ` [PATCH v5 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-27  6:25           ` Patrick Steinhardt
2023-10-27  7:54             ` Karthik Nayak
2023-10-27  7:59             ` Karthik Nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZSkCGS3JPEQ71dOF@tanuki \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=karthik.188@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).