All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, dstolee@microsoft.com
Subject: Re: [PATCH 1/1] commit-graph.c: avoid unnecessary tag dereference when merging
Date: Sat, 21 Mar 2020 11:50:47 -0700	[thread overview]
Message-ID: <xmqqfte1a6ew.fsf@gitster.c.googlers.com> (raw)
In-Reply-To: <20200321070333.GB1441446@coredump.intra.peff.net> (Jeff King's message of "Sat, 21 Mar 2020 03:03:33 -0400")

Jeff King <peff@peff.net> writes:

> So we weren't actually reading all of the commits even under the old
> code. We were just going into deref_tag(), seeing that the object is
> already parsed, and then quickly returning when we see that we already
> have an OBJ_COMMIT. I suspect most of your timing differences are mostly
> noise.

Noise that contributes largely to ~7% fluctuation?  That sounds big.

> I guess --input=stdin-commits is a good way to simulate that. Try this
> (assuming there's already a split-graph file with all of the commits in
> it):
>
>   git rev-parse HEAD >input
>   time git commit-graph write --input=stdin-commits --split=merge-all <input
> ...
> A more realistic case would probably be feeding a new small pack to
> --input=stdin-packs.

Makes sense.

> At any rate, I think there is a demonstrable speedup there. But
> moreover, I'm pretty sure this existing code is not doing what it
> expects:
>
>   /* only add commits if they still exist in the repo */
>   result = lookup_commit_reference_gently(ctx->r, &oid, 1);
>
> That won't look at the object database at all if the commit is already
> marked as parsed. And that parsing might have come from the commit graph
> itself, as our earlier attempts showed. So switching to a real
> has_object_file() call is an important _correctness_ fix, even leaving
> aside the speed improvements.

True. has_object_file() asks the question without even
lookup-replace flag set, so it would try to see if the object is
truly there (modulo the "pretended" ones are declared to exist via
the cached-object mechanism).

Do we need to worry about INFO_QUICK and SKIP_FETCH_OBJECT in this
codepath, by the way?

  parent reply	other threads:[~2020-03-21 18:50 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-21  3:44 [PATCH 0/1] commit-graph: avoid unnecessary tag deference when merging Taylor Blau
2020-03-21  3:44 ` [PATCH 1/1] commit-graph.c: avoid unnecessary tag dereference " Taylor Blau
2020-03-21  5:00   ` Jeff King
2020-03-21  6:11     ` Taylor Blau
2020-03-21  6:24       ` Taylor Blau
2020-03-21  7:03       ` Jeff King
2020-03-21 17:27         ` Taylor Blau
2020-03-22  5:36           ` Jeff King
2020-03-22 11:04             ` SZEDER Gábor
2020-03-22 18:45               ` looking up object types quickly, was " Jeff King
2020-03-22 19:18                 ` Jeff King
2020-03-23 20:15               ` Taylor Blau
2020-03-22 16:45             ` Taylor Blau
2020-03-24  6:06               ` Jeff King
2020-03-21 18:50         ` Junio C Hamano [this message]
2020-03-22  0:03           ` Derrick Stolee
2020-03-22  0:20             ` Taylor Blau
2020-03-22  0:23               ` Derrick Stolee
2020-03-22  5:49                 ` Jeff King
2020-03-22  6:04                   ` Jeff King
2020-03-22 15:47                     ` Taylor Blau
2020-03-24  6:11                       ` Jeff King
2020-03-24 23:08                         ` Taylor Blau
2020-03-27  8:42                           ` Jeff King
2020-03-27 15:03                             ` Taylor Blau
2020-03-22 15:44                   ` Taylor Blau
2020-03-24  6:14                     ` Jeff King
2020-03-21  5:01   ` Junio C Hamano
2020-03-21  4:56 ` [PATCH 0/1] commit-graph: avoid unnecessary tag deference " Junio C Hamano
2020-03-21  5:04   ` Jeff King
2020-03-21  6:12     ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqfte1a6ew.fsf@gitster.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.