From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Jonathan Tan <jonathantanmy@google.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Junio C Hamano <gitster@pobox.com>,
Derrick Stolee <stolee@gmail.com>,
Stefan Beller <sbeller@google.com>
Subject: Regression in: [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference()
Date: Fri, 25 Jan 2019 16:33:48 +0100 [thread overview]
Message-ID: <20190125153348.GF6702@szeder.dev> (raw)
In-Reply-To: <20181204224238.50966-1-jonathantanmy@google.com>
On Tue, Dec 04, 2018 at 02:42:38PM -0800, Jonathan Tan wrote:
> When fetching into a repository, a connectivity check is first made by
> check_exist_and_connected() in builtin/fetch.c that runs:
>
> git rev-list --objects --stdin --not --all --quiet <(list of objects)
>
> If the client repository has many refs, this command can be slow,
> regardless of the nature of the server repository or what is being
> fetched. A profiler reveals that most of the time is spent in
> setup_revisions() (approx. 60/63), and of the time spent in
> setup_revisions(), most of it is spent in parse_object() (approx.
> 49/60). This is because setup_revisions() parses the target of every ref
> (from "--all"), and parse_object() reads the buffer of the object.
>
> Reading the buffer is unnecessary if the repository has a commit graph
> and if the ref points to a commit (which is typically the case). This
> patch uses the commit graph wherever possible; on my computer, when I
> run the above command with a list of 1 object on a many-ref repository,
> I get a speedup from 1.8s to 1.0s.
>
> Another way to accomplish this effect would be to modify parse_object()
> to use the commit graph if possible; however, I did not want to change
> parse_object()'s current behavior of always checking the object
> signature of the returned object.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> This is on sb/more-repo-in-api because I'm using the repo_parse_commit()
> function.
>
> A colleague noticed this issue when handling a mirror clone.
>
> Looking at the bigger picture, the speed of the connectivity check
> during a fetch might be further improved by passing only the negotiation
> tips (obtained through --negotiation-tip) instead of "--all". This patch
> just handles the low-hanging fruit first.
> ---
I stumbled upon a regression that bisects down to this commit
ec0c5798ee (revision: use commit graph in get_reference(),
2018-12-04):
$ ~/src/git/bin-wrappers/git version
git version 2.19.1.566.gec0c5798ee
$ ~/src/git/bin-wrappers/git commit-graph write --reachable
Computing commit graph generation numbers: 100% (58994/58994), done.
$ ~/src/git/bin-wrappers/git status
HEAD detached at origin/pu
nothing to commit, working tree clean
$ ~/src/git/bin-wrappers/git -c core.commitGraph=false describe --dirty
v2.20.1-833-gcb3b9e7ee3
$ ~/src/git/bin-wrappers/git -c core.commitGraph=true describe --dirty
v2.20.1-833-gcb3b9e7ee3
It's all good with only '--dirty', but watch this with '--all
--dirty':
$ ~/src/git/bin-wrappers/git -c core.commitGraph=false describe --all --dirty
remotes/origin/pu
$ ~/src/git/bin-wrappers/git -c core.commitGraph=true describe --all --dirty
remotes/origin/pu-dirty
IOW if the commit-graph is enabled, then my clean worktree is reported
as dirty.
And to add a cherry on top of my confusion:
$ git checkout v2.20.0
Previous HEAD position was cb3b9e7ee3 Merge branch 'jh/trace2' into pu
HEAD is now at 5d826e9729 Git 2.20
$ ~/src/git/bin-wrappers/git -c core.commitGraph=true describe --all --dirty
tags/v2.20.0
It's clean even with '--all' and commit-graph enabled, but watch this:
$ git branch this-will-screw-it-up
$ ~/src/git/bin-wrappers/git -c core.commitGraph=true describe --all --dirty
tags/v2.20.0-dirty
Have fun! :)
> revision.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/revision.c b/revision.c
> index b5108b75ab..e7da2c57ab 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -212,7 +212,20 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
> {
> struct object *object;
>
> - object = parse_object(revs->repo, oid);
> + /*
> + * If the repository has commit graphs, repo_parse_commit() avoids
> + * reading the object buffer, so use it whenever possible.
> + */
> + if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
> + struct commit *c = lookup_commit(revs->repo, oid);
> + if (!repo_parse_commit(revs->repo, c))
> + object = (struct object *) c;
> + else
> + object = NULL;
> + } else {
> + object = parse_object(revs->repo, oid);
> + }
> +
> if (!object) {
> if (revs->ignore_missing)
> return object;
> --
> 2.19.0.271.gfe8321ec05.dirty
>
next prev parent reply other threads:[~2019-01-25 15:33 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-04 22:42 [PATCH on sb/more-repo-in-api] revision: use commit graph in get_reference() Jonathan Tan
2018-12-04 23:12 ` Stefan Beller
2018-12-06 23:36 ` Jonathan Tan
2018-12-07 13:49 ` Derrick Stolee
2018-12-05 4:54 ` Jeff King
2018-12-06 23:54 ` Jonathan Tan
2018-12-07 8:53 ` Jeff King
2018-12-05 23:15 ` Junio C Hamano
2018-12-07 21:50 ` [PATCH on master v2] " Jonathan Tan
2018-12-09 0:51 ` Junio C Hamano
2018-12-09 1:49 ` Junio C Hamano
2018-12-11 10:54 ` Jeff King
2018-12-12 19:58 ` Jonathan Tan
2018-12-13 1:27 ` Jeff King
2018-12-13 16:20 ` Derrick Stolee
2018-12-13 18:54 ` [PATCH v3] " Jonathan Tan
2018-12-14 3:20 ` Junio C Hamano
2018-12-14 8:45 ` Jeff King
2019-01-25 15:33 ` SZEDER Gábor [this message]
2019-01-25 19:56 ` Regression in: [PATCH on sb/more-repo-in-api] " Stefan Beller
2019-01-25 22:01 ` Jonathan Tan
2019-01-25 22:14 ` SZEDER Gábor
2019-01-25 22:21 ` SZEDER Gábor
2019-01-27 13:08 ` [PATCH] object_as_type: initialize commit-graph-related fields of 'struct commit' SZEDER Gábor
2019-01-27 13:28 ` SZEDER Gábor
2019-01-27 18:40 ` Derrick Stolee
2019-01-28 16:15 ` Jeff King
2019-01-28 16:57 ` Jonathan Tan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190125153348.GF6702@szeder.dev \
--to=szeder.dev@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=peff@peff.net \
--cc=sbeller@google.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.