From: Patrick Steinhardt <ps@pks.im>
To: Karthik Nayak <karthik.188@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH v4 3/3] rev-list: add commit object support in `--missing` option
Date: Wed, 25 Oct 2023 08:40:37 +0200 [thread overview]
Message-ID: <ZTi4Zd1by53q5gtM@tanuki> (raw)
In-Reply-To: <20231024122631.158415-4-karthik.188@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 8550 bytes --]
On Tue, Oct 24, 2023 at 02:26:31PM +0200, Karthik Nayak wrote:
> The `--missing` object option in rev-list currently works only with
> missing blobs/trees. For missing commits the revision walker fails with
> a fatal error.
>
> Let's extend the functionality of `--missing` option to also support
> commit objects. This is done by adding a `missing_objects` field to
> `rev_info`. This field is an `oidset` to which we'll add the missing
> commits as we encounter them. The revision walker will now continue the
> traversal and call `show_commit()` even for missing commits. In rev-list
> we can then check if the commit is a missing commit and call the
> existing code for parsing `--missing` objects.
>
> A scenario where this option would be used is to find the boundary
> objects between different object directories. Consider a repository with
> a main object directory (GIT_OBJECT_DIRECTORY) and one or more alternate
> object directories (GIT_ALTERNATE_OBJECT_DIRECTORIES). In such a
> repository, using the `--missing=print` option while disabling the
> alternate object directory allows us to find the boundary objects
> between the main and alternate object directory.
>
> Helped-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
> builtin/rev-list.c | 6 +++
> list-objects.c | 3 ++
> revision.c | 16 +++++++-
> revision.h | 4 ++
> t/t6022-rev-list-missing.sh | 74 +++++++++++++++++++++++++++++++++++++
> 5 files changed, 101 insertions(+), 2 deletions(-)
> create mode 100755 t/t6022-rev-list-missing.sh
>
> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> index 98542e8b3c..37b52520b5 100644
> --- a/builtin/rev-list.c
> +++ b/builtin/rev-list.c
> @@ -149,6 +149,12 @@ static void show_commit(struct commit *commit, void *data)
>
> display_progress(progress, ++progress_counter);
>
> + if (revs->do_not_die_on_missing_objects &&
> + oidset_contains(&revs->missing_objects, &commit->object.oid)) {
> + finish_object__ma(&commit->object);
> + return;
> + }
> +
> if (show_disk_usage)
> total_disk_usage += get_object_disk_usage(&commit->object);
>
> diff --git a/list-objects.c b/list-objects.c
> index 47296dff2f..260089388c 100644
> --- a/list-objects.c
> +++ b/list-objects.c
> @@ -389,6 +389,9 @@ static void do_traverse(struct traversal_context *ctx)
> */
> if (!ctx->revs->tree_objects)
> ; /* do not bother loading tree */
> + else if (ctx->revs->do_not_die_on_missing_objects &&
> + oidset_contains(&ctx->revs->missing_objects, &commit->object.oid))
> + ;
> else if (repo_get_commit_tree(the_repository, commit)) {
> struct tree *tree = repo_get_commit_tree(the_repository,
> commit);
> diff --git a/revision.c b/revision.c
> index 219dc76716..e60646c1a7 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -6,6 +6,7 @@
> #include "object-name.h"
> #include "object-file.h"
> #include "object-store-ll.h"
> +#include "oidset.h"
> #include "tag.h"
> #include "blob.h"
> #include "tree.h"
> @@ -1112,6 +1113,9 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
>
> if (commit->object.flags & ADDED)
> return 0;
> + if (revs->do_not_die_on_missing_objects &&
> + oidset_contains(&revs->missing_objects, &commit->object.oid))
Nit: indentation is off here.
> + return 0;
> commit->object.flags |= ADDED;
>
> if (revs->include_check &&
> @@ -1168,7 +1172,8 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
> for (parent = commit->parents; parent; parent = parent->next) {
> struct commit *p = parent->item;
> int gently = revs->ignore_missing_links ||
> - revs->exclude_promisor_objects;
> + revs->exclude_promisor_objects ||
> + revs->do_not_die_on_missing_objects;
> if (repo_parse_commit_gently(revs->repo, p, gently) < 0) {
> if (revs->exclude_promisor_objects &&
> is_promisor_object(&p->object.oid)) {
> @@ -1176,7 +1181,11 @@ static int process_parents(struct rev_info *revs, struct commit *commit,
> break;
> continue;
> }
> - return -1;
> +
> + if (!revs->do_not_die_on_missing_objects)
> + return -1;
> + else
> + oidset_insert(&revs->missing_objects, &p->object.oid);
> }
> if (revs->sources) {
> char **slot = revision_sources_at(revs->sources, p);
> @@ -3800,6 +3809,9 @@ int prepare_revision_walk(struct rev_info *revs)
> FOR_EACH_OBJECT_PROMISOR_ONLY);
> }
>
> + if (revs->do_not_die_on_missing_objects)
> + oidset_init(&revs->missing_objects, 0);
> +
While we're initializing the new oidset, we never clear it. We should
probably call `oidset_clear()` in `release_revisions()`. And if we
unconditionally initialized the oidset here then we can also call
`oiadset_clear()` unconditionally there, which should be preferable
given that `oidset_init()` does not allocate memory when no initial size
was given.
> if (!revs->reflog_info)
> prepare_to_use_bloom_filter(revs);
> if (!revs->unsorted_input)
> diff --git a/revision.h b/revision.h
> index c73c92ef40..f6bf422f0e 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -4,6 +4,7 @@
> #include "commit.h"
> #include "grep.h"
> #include "notes.h"
> +#include "oidset.h"
> #include "pretty.h"
> #include "diff.h"
> #include "commit-slab-decl.h"
> @@ -373,6 +374,9 @@ struct rev_info {
>
> /* Location where temporary objects for remerge-diff are written. */
> struct tmp_objdir *remerge_objdir;
> +
> + /* Missing objects to be tracked without failing traversal. */
> + struct oidset missing_objects;
As far as I can see we only use this set to track missing commits, but
none of the other objects. The name thus feels a bit misleading to me,
as a reader might rightfully assume that it contains _all_ missing
objects after the revwalk. Should we rename it to `missing_commits` to
clarify?
Patrick
> };
>
> /**
> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
> new file mode 100755
> index 0000000000..40265a4f66
> --- /dev/null
> +++ b/t/t6022-rev-list-missing.sh
> @@ -0,0 +1,74 @@
> +#!/bin/sh
> +
> +test_description='handling of missing objects in rev-list'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
> +. ./test-lib.sh
> +
> +# We setup the repository with two commits, this way HEAD is always
> +# available and we can hide commit 1.
> +test_expect_success 'create repository and alternate directory' '
> + test_commit 1 &&
> + test_commit 2 &&
> + test_commit 3
> +'
> +
> +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> +do
> + test_expect_success "rev-list --missing=error fails with missing object $obj" '
> + oid="$(git rev-parse $obj)" &&
> + path=".git/objects/$(test_oid_to_path $oid)" &&
> +
> + mv "$path" "$path.hidden" &&
> + test_when_finished "mv $path.hidden $path" &&
> +
> + test_must_fail git rev-list --missing=error --objects \
> + --no-object-names HEAD
> + '
> +done
> +
> +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> +do
> + for action in "allow-any" "print"
> + do
> + test_expect_success "rev-list --missing=$action with missing $obj" '
> + oid="$(git rev-parse $obj)" &&
> + path=".git/objects/$(test_oid_to_path $oid)" &&
> +
> + # Before the object is made missing, we use rev-list to
> + # get the expected oids.
> + git rev-list --objects --no-object-names \
> + HEAD ^$obj >expect.raw &&
> +
> + # Blobs are shared by all commits, so evethough a commit/tree
> + # might be skipped, its blob must be accounted for.
> + if [ $obj != "HEAD:1.t" ]; then
> + echo $(git rev-parse HEAD:1.t) >>expect.raw &&
> + echo $(git rev-parse HEAD:2.t) >>expect.raw
> + fi &&
> +
> + mv "$path" "$path.hidden" &&
> + test_when_finished "mv $path.hidden $path" &&
> +
> + git rev-list --missing=$action --objects --no-object-names \
> + HEAD >actual.raw &&
> +
> + # When the action is to print, we should also add the missing
> + # oid to the expect list.
> + case $action in
> + allow-any)
> + ;;
> + print)
> + grep ?$oid actual.raw &&
> + echo ?$oid >>expect.raw
> + ;;
> + esac &&
> +
> + sort actual.raw >actual &&
> + sort expect.raw >expect &&
> + test_cmp expect actual
> + '
> + done
> +done
> +
> +test_done
> --
> 2.42.0
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-10-25 6:40 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-09 10:55 [PATCH 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-09 10:55 ` [PATCH 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-09 10:55 ` [PATCH 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-09 10:55 ` [PATCH 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-09 22:02 ` [PATCH 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-10 6:19 ` Patrick Steinhardt
2023-10-10 17:09 ` Junio C Hamano
2023-10-11 10:37 ` Karthik Nayak
2023-10-11 16:54 ` Junio C Hamano
2023-10-12 10:44 ` Karthik Nayak
2023-10-12 11:04 ` Patrick Steinhardt
2023-10-12 13:23 ` Karthik Nayak
2023-10-12 16:17 ` Junio C Hamano
2023-10-13 5:53 ` Patrick Steinhardt
2023-10-13 8:38 ` Patrick Steinhardt
2023-10-13 12:37 ` [PATCH] commit: detect commits that exist in commit-graph but not in the ODB Patrick Steinhardt
2023-10-13 18:21 ` Junio C Hamano
2023-10-17 6:37 ` Patrick Steinhardt
2023-10-17 18:34 ` Junio C Hamano
2023-10-19 6:45 ` Patrick Steinhardt
2023-10-19 8:25 ` Patrick Steinhardt
2023-10-19 17:16 ` Junio C Hamano
2023-10-20 10:00 ` Jeff King
2023-10-20 17:35 ` Junio C Hamano
2023-10-23 10:15 ` Patrick Steinhardt
2023-10-13 17:07 ` [PATCH 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-12 16:26 ` Junio C Hamano
2023-10-16 10:38 ` [PATCH v2 " Karthik Nayak
2023-10-16 10:38 ` [PATCH v2 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-16 10:38 ` [PATCH v2 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-16 10:38 ` [PATCH v2 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-16 16:24 ` [PATCH v2 0/3] rev-list: add support for commits in `--missing` Junio C Hamano
2023-10-16 19:01 ` Karthik Nayak
2023-10-16 20:33 ` Junio C Hamano
2023-10-19 12:10 ` [PATCH v3 " Karthik Nayak
2023-10-19 12:10 ` [PATCH v3 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-19 12:10 ` [PATCH v3 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-19 12:10 ` [PATCH v3 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-19 22:05 ` Junio C Hamano
2023-10-19 23:35 ` Junio C Hamano
2023-10-20 11:14 ` Karthik Nayak
2023-10-20 14:47 ` Karthik Nayak
2023-10-20 17:45 ` Junio C Hamano
2023-10-20 16:41 ` Junio C Hamano
2023-10-24 11:34 ` Karthik Nayak
2023-10-24 12:26 ` [PATCH v4 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-24 12:26 ` [PATCH v4 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-24 12:26 ` [PATCH v4 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-24 12:26 ` [PATCH v4 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-24 17:45 ` Junio C Hamano
2023-10-25 0:35 ` Junio C Hamano
2023-10-25 9:34 ` Karthik Nayak
2023-10-25 6:40 ` Patrick Steinhardt [this message]
2023-10-26 12:37 ` Junio C Hamano
2023-10-26 10:11 ` [PATCH v5 0/3] rev-list: add support for commits in `--missing` Karthik Nayak
2023-10-26 10:11 ` [PATCH v5 1/3] revision: rename bit to `do_not_die_on_missing_objects` Karthik Nayak
2023-10-26 10:11 ` [PATCH v5 2/3] rev-list: move `show_commit()` to the bottom Karthik Nayak
2023-10-26 10:11 ` [PATCH v5 3/3] rev-list: add commit object support in `--missing` option Karthik Nayak
2023-10-27 6:25 ` Patrick Steinhardt
2023-10-27 7:54 ` Karthik Nayak
2023-10-27 7:59 ` Karthik Nayak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZTi4Zd1by53q5gtM@tanuki \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=karthik.188@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).