From: Jonathan Tan <jonathantanmy@google.com>
To: git@vger.kernel.org
Cc: Jonathan Tan <jonathantanmy@google.com>,
steadmon@google.com, me@ttaylorr.com, hanxin.hx@bytedance.com
Subject: [PATCH v2 0/2] When fetching, die if in commit graph but not obj db
Date: Thu, 31 Oct 2024 14:18:59 -0700 [thread overview]
Message-ID: <cover.1730409376.git.jonathantanmy@google.com> (raw)
In-Reply-To: <cover.1730235646.git.jonathantanmy@google.com>
Thanks everyone for your review comments. I've updated the patch 1
commit message as Josh requested. I'll reply individually to comments on
patch 2.
Jonathan Tan (2):
Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
fetch-pack: warn if in commit graph but not obj db
fetch-pack.c | 42 +++++++++++-----------
t/t5330-no-lazy-fetch-with-commit-graph.sh | 2 +-
2 files changed, 23 insertions(+), 21 deletions(-)
Range-diff against v1:
1: 4dea8933cf ! 1: 34e87b8388 Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"
@@ Commit message
This reverts commit a6e65fb39caf18259c660c1c7910d5bf80bc15cb.
+ This revert simplifies the next patch in this patch set.
+
The commit message of that commit mentions that the new function "will
be used for the bundle-uri client in a subsequent commit", but it seems
that eventually it wasn't used.
2: 1027ff2cb7 ! 2: 631b9a8677 fetch-pack: warn if in commit graph but not obj db
@@ Commit message
action, because the object is present in the commit graph file but not in
the object DB.
- Therefore, detect when this occurs and print a warning. (Note that
- we cannot proceed to include this object in the list of objects to
- be fetched without changing at least the fetch negotiation code:
- what would happen is that the client will send "want X" and "have X"
- and when I tested at $DAYJOB with a work server that uses JGit, the
- server reasonably returned an empty packfile. And changing the fetch
- negotiation code to only use the object DB when deciding what to report
- as "have" would be an unnecessary slowdown, I think.)
+ Therefore, make it a fatal error when this occurs. (Note that we cannot
+ proceed to include this object in the list of objects to be fetched
+ without changing at least the fetch negotiation code: what would happen
+ is that the client will send "want X" and "have X" and when I tested
+ at $DAYJOB with a work server that uses JGit, the server reasonably
+ returned an empty packfile. And changing the fetch negotiation code to
+ only use the object DB when deciding what to report as "have" would be
+ an unnecessary slowdown, I think.)
This was discovered when a lazy fetch of a missing commit completed with
nothing actually fetched, and the writing of the commit graph file after
every fetch then attempted to read said missing commit, triggering a
lazy fetch of said missing commit, resulting in an infinite loop with no
user-visible indication (until they check the list of processes running
- on their computer). With this fix, at least a warning message will be
- printed. Note that although the repo corruption we discovered was caused
- by a bug in GC in a partial clone, the behavior that this patch teaches
- Git to warn about applies to any repo with commit graph enabled and with
- a missing commit, whether it is a partial clone or not.
+ on their computer). With this fix, there is no infinite loop. Note that
+ although the repo corruption we discovered was caused by a bug in GC in
+ a partial clone, the behavior that this patch teaches Git to warn about
+ applies to any repo with commit graph enabled and with a missing commit,
+ whether it is a partial clone or not.
+
+ t5330, introduced in 3a1ea94a49 (commit-graph.c: no lazy fetch in
+ lookup_commit_in_graph(), 2022-07-01), tests that an interaction between
+ fetch and the commit graph does not cause an infinite loop. This patch
+ changes the exit code in that situation, so that test had to be changed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
## fetch-pack.c ##
-@@ fetch-pack.c: static struct string_list uri_protocols = STRING_LIST_INIT_DUP;
- #define ALTERNATE (1U << 1)
- #define COMMON (1U << 6)
- #define REACH_SCRATCH (1U << 7)
-+#define COMPLETE_FROM_COMMIT_GRAPH (1U << 8)
-
- /*
- * After sending this many "have"s if we do not get any new ACK , we
@@ fetch-pack.c: static void for_each_cached_alternate(struct fetch_negotiator *negotiator,
+ cb(negotiator, cache.items[i]);
}
++static void die_in_commit_graph_only(const struct object_id *oid)
++{
++ die(_("You are attempting to fetch %s, which is in the commit graph file but not in the object database.\n"
++ "This is probably due to repo corruption.\n"
++ "If you are attempting to repair this repo corruption by refetching the missing object, use 'git fetch --refetch' with the missing object."),
++ oid_to_hex(oid));
++}
++
static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
- int mark_tags_complete)
-+ int mark_additional_complete_information)
++ int mark_tags_complete_and_check_obj_db)
{
enum object_type type;
struct object_info info = { .typep = &type };
@@ fetch-pack.c: static void for_each_cached_alternate(struct fetch_negotiator *neg
commit = lookup_commit_in_graph(the_repository, oid);
- if (commit)
+ if (commit) {
-+ if (mark_additional_complete_information)
-+ commit->object.flags |= COMPLETE_FROM_COMMIT_GRAPH;
++ if (mark_tags_complete_and_check_obj_db) {
++ if (!has_object(the_repository, oid, 0))
++ die_in_commit_graph_only(oid);
++ }
return commit;
+ }
@@ fetch-pack.c: static struct commit *deref_without_lazy_fetch(const struct object
if (!tag->tagged)
return NULL;
- if (mark_tags_complete)
-+ if (mark_additional_complete_information)
++ if (mark_tags_complete_and_check_obj_db)
tag->object.flags |= COMPLETE;
oid = &tag->tagged->oid;
} else {
-@@ fetch-pack.c: static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
- save_commit_buffer = old_save_commit_buffer;
- }
-
-+static void warn_in_commit_graph_only(const struct object_id *oid)
-+{
-+ warning(_("You are attempting to fetch %s, which is in the commit graph file but not in the object database."),
-+ oid_to_hex(oid));
-+ warning(_("This is probably due to repo corruption."));
-+ warning(_("If you are attempting to repair this repo corruption by refetching the missing object, use 'git fetch --refetch' with the missing object."));
-+}
-+
- /*
- * Returns 1 if every object pointed to by the given remote refs is available
- * locally and reachable from a local ref, and 0 otherwise.
-@@ fetch-pack.c: static int everything_local(struct fetch_pack_args *args,
- ref->name);
- continue;
- }
-+ if (o->flags & COMPLETE_FROM_COMMIT_GRAPH) {
-+ if (!has_object(the_repository, remote, 0))
-+ warn_in_commit_graph_only(remote);
-+ }
- print_verbose(args, _("already have %s (%s)"), oid_to_hex(remote),
- ref->name);
- }
- ## object.h ##
-@@ object.h: void object_array_init(struct object_array *array);
- /*
- * object flag allocation:
- * revision.h: 0---------10 15 23------27
-- * fetch-pack.c: 01 67
-+ * fetch-pack.c: 01 6-8
- * negotiator/default.c: 2--5
- * walker.c: 0-2
- * upload-pack.c: 4 11-----14 16-----19
+ ## t/t5330-no-lazy-fetch-with-commit-graph.sh ##
+@@ t/t5330-no-lazy-fetch-with-commit-graph.sh: test_expect_success 'fetch any commit from promisor with the usage of the commit
+ test_commit -C with-commit any-commit &&
+ anycommit=$(git -C with-commit rev-parse HEAD) &&
+ GIT_TRACE="$(pwd)/trace.txt" \
+- git -C with-commit-graph fetch origin $anycommit 2>err &&
++ test_must_fail git -C with-commit-graph fetch origin $anycommit 2>err &&
+ ! grep "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
+ grep "git fetch origin" trace.txt >actual &&
+ test_line_count = 1 actual
--
2.47.0.163.g1226f6d8fa-goog
next prev parent reply other threads:[~2024-10-31 21:19 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-03 22:35 [RFC PATCH] promisor-remote: always JIT fetch with --refetch Emily Shaffer
2024-10-06 22:43 ` Junio C Hamano
2024-10-07 0:21 ` Robert Coup
2024-10-07 0:37 ` Junio C Hamano
2024-10-11 16:40 ` Emily Shaffer
2024-10-11 17:54 ` Junio C Hamano
2024-10-23 0:28 ` [PATCH v2] fetch-pack: don't mark COMPLETE unless we have the full object Emily Shaffer
2024-10-23 18:53 ` Emily Shaffer
2024-10-23 20:11 ` Taylor Blau
2024-10-28 22:55 ` Jonathan Tan
2024-10-29 21:11 ` [PATCH 0/2] When fetching, warn if in commit graph but not obj db Jonathan Tan
2024-10-29 21:11 ` [PATCH 1/2] Revert "fetch-pack: add a deref_without_lazy_fetch_extended()" Jonathan Tan
2024-10-30 21:22 ` Josh Steadmon
2024-10-29 21:11 ` [PATCH 2/2] fetch-pack: warn if in commit graph but not obj db Jonathan Tan
2024-10-30 21:22 ` Josh Steadmon
2024-10-31 21:23 ` Jonathan Tan
2024-10-31 20:59 ` Taylor Blau
2024-10-31 21:43 ` Jonathan Tan
2024-11-01 14:33 ` Taylor Blau
2024-11-01 17:33 ` Jonathan Tan
2024-10-30 21:22 ` [PATCH 0/2] When fetching, " Josh Steadmon
2024-10-31 21:18 ` Jonathan Tan [this message]
2024-10-31 21:19 ` [PATCH v2 1/2] Revert "fetch-pack: add a deref_without_lazy_fetch_extended()" Jonathan Tan
2024-10-31 21:19 ` [PATCH v2 2/2] fetch-pack: warn if in commit graph but not obj db Jonathan Tan
2024-11-01 2:22 ` Junio C Hamano
2024-11-01 4:25 ` Junio C Hamano
2024-11-01 8:59 ` [External] " Han Xin
2024-11-01 17:46 ` Jonathan Tan
2024-11-01 17:40 ` Jonathan Tan
2024-11-02 2:08 ` Junio C Hamano
2024-11-01 17:36 ` Jonathan Tan
2024-11-01 15:18 ` Taylor Blau
2024-11-01 17:49 ` Jonathan Tan
2024-10-31 22:33 ` [PATCH v2 0/2] When fetching, die " Josh Steadmon
2024-11-05 19:24 ` [PATCH v3 " Jonathan Tan
2024-11-05 19:24 ` [PATCH v3 1/2] Revert "fetch-pack: add a deref_without_lazy_fetch_extended()" Jonathan Tan
2024-11-05 19:24 ` [PATCH v3 2/2] fetch-pack: die if in commit graph but not obj db Jonathan Tan
2024-11-06 3:12 ` [PATCH v3 0/2] When fetching, " Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1730409376.git.jonathantanmy@google.com \
--to=jonathantanmy@google.com \
--cc=git@vger.kernel.org \
--cc=hanxin.hx@bytedance.com \
--cc=me@ttaylorr.com \
--cc=steadmon@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).