All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, "Jeff King" <peff@peff.net>,
	"Felipe Contreras" <felipe.contreras@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Chris Torek" <chris.torek@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Taylor Blau" <me@ttaylorr.com>
Subject: Re: [PATCH v3 4/4] revision: avoid hitting packfiles when commits are in commit-graph
Date: Mon, 02 Aug 2021 13:01:03 -0700	[thread overview]
Message-ID: <xmqqwnp3vcow.fsf@gitster.g> (raw)
In-Reply-To: <f6fc2a5e6d94befa915fb59b6296ce3153820c13.1627896460.git.ps@pks.im> (Patrick Steinhardt's message of "Mon, 2 Aug 2021 11:38:15 +0200")

Patrick Steinhardt <ps@pks.im> writes:

> diff --git a/commit-graph.c b/commit-graph.c
> index 3860a0d847..a81d5cebc0 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -864,6 +864,48 @@ static int fill_commit_in_graph(struct repository *r,
>  	return 1;
>  }


Describe return-value here.  0 for not-found, !0 for found?

> +static int find_object_id_in_graph(const struct object_id *id, struct commit_graph *g, uint32_t *pos)
> +{
> +	struct commit_graph *cur_g = g;
> +	uint32_t lex_index;
> +
> +	while (cur_g && !bsearch_graph(cur_g, (struct object_id *)id, &lex_index))
> +		cur_g = cur_g->base_graph;
> +
> +	if (cur_g) {
> +		*pos = lex_index + cur_g->num_commits_in_base;
> +		return 1;
> +	}
> +
> +	return 0;
> +}

Likewise, or as this is public, perhaps in commit-graph.h next to
its declaration.

> +int find_object_in_graph(struct repository *repo, struct object *object)
> +{
> +	struct commit *commit;
> +	uint32_t pos;
> +
> +	if (object->parsed) {
> +		if (object->type != OBJ_COMMIT)
> +			return -1;
> +		return 0;

This is puzzling---at least it is not consistent with what the
function name says ("please say if you find _this_ object in the
commit-graph file"---if that is not what this function does, it
needs a comment before the implementation).

The caller had object and we has already been parsed.  If the
function were "with help from commit-graph, please tell me if you
can positively say this is a commit", the above is understandable.
If we know positively that it is not commit, we say "no, it is not a
commit" (which may be suboptimal---if the caller falls back to
another codepath, the object will still not be a commit) and if we
know it is a commit, we can say "yes, it definitely is a commit" and
the caller can stop there.

I guess my only problem with this function is that its name and what
it does does not align.  If the caller uses it for the real purpose
of the function I guessed, then the logic itself may be OK.

> +	}
> +
> +	if (!repo->objects->commit_graph)
> +		return -1;

There is no commit-graph, then we decline to make a decision, which
makes sense.

> +	if (!find_object_id_in_graph(&object->oid, repo->objects->commit_graph, &pos))
> +		return -1;

If it does not exist in the graph, we cannot tell, either.

> +	commit = object_as_type(object, OBJ_COMMIT, 1);
> +	if (!commit)
> +		return -1;
> +	if (!fill_commit_in_graph(repo, commit, repo->objects->commit_graph, pos))
> +		return -1;
> +
> +	return 0;
> +}
> +
>  static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uint32_t *pos)
>  {
>  	uint32_t graph_pos = commit_graph_position(item);
> @@ -871,18 +913,7 @@ static int find_commit_in_graph(struct commit *item, struct commit_graph *g, uin
>  		*pos = graph_pos;
>  		return 1;
>  	} else {
> -		struct commit_graph *cur_g = g;
> -		uint32_t lex_index;
> -
> -		while (cur_g && !bsearch_graph(cur_g, &(item->object.oid), &lex_index))
> -			cur_g = cur_g->base_graph;
> -
> -		if (cur_g) {
> -			*pos = lex_index + cur_g->num_commits_in_base;
> -			return 1;
> -		}
> -
> -		return 0;
> +		return find_object_id_in_graph(&item->object.oid, g, pos);

And I think this one is a op-op refactoring that does not change the
behaviour of find_commit_in_graph()?  It might be easier if done in
a separate preparatory step, but it is small enough.

> diff --git a/commit-graph.h b/commit-graph.h
> index 96c24fb577..f373fab4c0 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -139,6 +139,8 @@ int write_commit_graph(struct object_directory *odb,
>  		       enum commit_graph_write_flags flags,
>  		       const struct commit_graph_opts *opts);
>  
> +int find_object_in_graph(struct repository *repo, struct object *object);
> +
>  #define COMMIT_GRAPH_VERIFY_SHALLOW	(1 << 0)
>  
>  int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags);
> diff --git a/revision.c b/revision.c
> index 671b6d6513..c3f9cf2998 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -362,10 +362,12 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
>  	struct object *object = lookup_unknown_object(revs->repo, oid);
>  
>  	if (object->type == OBJ_NONE) {
> -		int type = oid_object_info(revs->repo, oid, NULL);
> -		if (type < 0 || !object_as_type(object, type, 1)) {
> -			object = NULL;
> -			goto out;
> +		if (find_object_in_graph(revs->repo, object) < 0) {
> +			int type = oid_object_info(revs->repo, oid, NULL);
> +			if (type < 0 || !object_as_type(object, type, 1)) {
> +				object = NULL;
> +				goto out;
> +			}
>  		}
>  	}

  reply	other threads:[~2021-08-02 20:01 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-28  5:33 [PATCH v2 0/3] Speed up connectivity checks via bitmaps Patrick Steinhardt
2021-06-28  5:33 ` [PATCH v2 1/3] p5400: add perf tests for git-receive-pack(1) Patrick Steinhardt
2021-06-28  7:49   ` Ævar Arnfjörð Bjarmason
2021-06-29  6:18     ` Patrick Steinhardt
2021-06-29 12:09       ` Ævar Arnfjörð Bjarmason
2021-06-28  5:33 ` [PATCH v2 2/3] receive-pack: skip connectivity checks on delete-only commands Patrick Steinhardt
2021-06-28  8:00   ` Ævar Arnfjörð Bjarmason
2021-06-28  8:06     ` Ævar Arnfjörð Bjarmason
2021-06-29  6:26     ` Patrick Steinhardt
2021-06-30  1:31   ` Jeff King
2021-06-30  1:35     ` Jeff King
2021-06-30 13:52     ` Patrick Steinhardt
2021-06-28  5:33 ` [PATCH v2 3/3] connected: implement connectivity check using bitmaps Patrick Steinhardt
2021-06-28 20:23   ` Taylor Blau
2021-06-29 22:44     ` Taylor Blau
2021-06-30  2:04       ` Jeff King
2021-06-30  3:07         ` Taylor Blau
2021-06-30  5:45           ` Jeff King
2021-07-02 17:44             ` Taylor Blau
2021-07-02 21:21               ` Jeff King
2021-06-30  1:51   ` Jeff King
2021-07-20 14:26     ` Patrick Steinhardt
2021-08-02  9:37 ` [PATCH v3 0/4] Speed up connectivity checks Patrick Steinhardt
2021-08-02  9:38   ` [PATCH v3 1/4] connected: do not sort input revisions Patrick Steinhardt
2021-08-02 12:49     ` Ævar Arnfjörð Bjarmason
2021-08-03  8:50       ` Patrick Steinhardt
2021-08-04 11:01         ` Ævar Arnfjörð Bjarmason
2021-08-02 19:00     ` Junio C Hamano
2021-08-03  8:55       ` Patrick Steinhardt
2021-08-03 21:47         ` Junio C Hamano
2021-08-02  9:38   ` [PATCH v3 2/4] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-02 12:53     ` Ævar Arnfjörð Bjarmason
2021-08-02  9:38   ` [PATCH v3 3/4] revision: avoid loading object headers multiple times Patrick Steinhardt
2021-08-02 12:55     ` Ævar Arnfjörð Bjarmason
2021-08-05 10:12       ` Patrick Steinhardt
2021-08-02 19:40     ` Junio C Hamano
2021-08-03  9:07       ` Patrick Steinhardt
2021-08-06 14:17         ` Patrick Steinhardt
2021-08-02  9:38   ` [PATCH v3 4/4] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt
2021-08-02 20:01     ` Junio C Hamano [this message]
2021-08-03  9:16       ` Patrick Steinhardt
2021-08-03 21:56         ` Junio C Hamano
2021-08-05 11:01           ` Patrick Steinhardt
2021-08-05 16:16             ` Junio C Hamano
2021-08-04 10:51         ` Ævar Arnfjörð Bjarmason
2021-08-05 11:25   ` [PATCH v4 0/6] Speed up connectivity checks Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 1/6] revision: separate walk and unsorted flags Patrick Steinhardt
2021-08-05 18:47       ` Junio C Hamano
2021-08-05 11:25     ` [PATCH v4 2/6] connected: do not sort input revisions Patrick Steinhardt
2021-08-05 18:44       ` Junio C Hamano
2021-08-06  6:00         ` Patrick Steinhardt
2021-08-06 16:50           ` Junio C Hamano
2021-08-05 11:25     ` [PATCH v4 3/6] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 4/6] revision: avoid loading object headers multiple times Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 5/6] commit-graph: split out function to search commit position Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 6/6] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt
2021-08-09  8:00 ` [PATCH v5 0/5] Speed up connectivity checks Patrick Steinhardt
2021-08-09  8:02   ` Patrick Steinhardt
2021-08-09  8:11 ` Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 1/5] revision: separate walk and unsorted flags Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 2/5] connected: do not sort input revisions Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 3/5] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 4/5] commit-graph: split out function to search commit position Patrick Steinhardt
2021-08-09  8:12   ` [PATCH v5 5/5] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqwnp3vcow.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=avarab@gmail.com \
    --cc=chris.torek@gmail.com \
    --cc=felipe.contreras@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.