From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 8/9] commit-graph: replace packed_oid_list with oid_array
Date: Fri, 4 Dec 2020 14:14:44 -0500
Message-ID: <X8qKpL+mIFbjngwl@nand.local>
In-Reply-To: <X8qGqHvHbJQL9B22@coredump.intra.peff.net>
On Fri, Dec 04, 2020 at 01:57:44PM -0500, Jeff King wrote:
> Our custom packed_oid_list data structure is really just an oid_array in
> disguise. Let's switch to using the generic structure, which shortens
> and simplifies the code slightly.
>
> There's one slightly awkward part: in the old code we copied a hash
> straight from the mmap'd on-disk data into the final object_id. And now
> we'll copy to a temporary oid, which we'll then pass to
> oid_array_append(). But this is an operation we have to do all over the
> commit-graph code already, since it mostly uses object_id structs
> internally. I also measured "git commit-graph --append", which triggers
> this code path, and it showed no difference.
I noticed that you also dropped the pre-allocation logic. I think
removing it is the right thing to do, but it may be worth mentioning in
the commit message.
> @@ -2199,26 +2177,16 @@ int write_commit_graph(struct object_directory *odb,
>  	}
>
>  	ctx->approx_nr_objects = approximate_object_count();
> -	ctx->oids.alloc = ctx->approx_nr_objects / 32;
>
> -	if (ctx->split && opts && ctx->oids.alloc > opts->max_commits)
> -		ctx->oids.alloc = opts->max_commits;
One compelling reason to drop this logic is that now only the oid-array
internals touch the .alloc field. We no longer modify it ourselves, so
we don't run the risk of it getting out of sync with the actual size of
the allocation.
> -
> -	if (ctx->append) {
> +	if (ctx->append)
>  		prepare_commit_graph_one(ctx->r, ctx->odb);
Good, this still needs to happen here.
> -		if (ctx->r->objects->commit_graph)
> -			ctx->oids.alloc += ctx->r->objects->commit_graph->num_commits;
> -	}
> -
> -	if (ctx->oids.alloc < 1024)
> -		ctx->oids.alloc = 1024;
> -	ALLOC_ARRAY(ctx->oids.list, ctx->oids.alloc);
>
>  	if (ctx->append && ctx->r->objects->commit_graph) {
>  		struct commit_graph *g = ctx->r->objects->commit_graph;
>  		for (i = 0; i < g->num_commits; i++) {
> -			const unsigned char *hash = g->chunk_oid_lookup + g->hash_len * i;
> -			hashcpy(ctx->oids.list[ctx->oids.nr++].hash, hash);
> +			struct object_id oid;
> +			hashcpy(oid.hash, g->chunk_oid_lookup + g->hash_len * i);
> +			oid_array_append(&ctx->oids, &oid);
This must be the spot you mentioned that requires the extra copy. I
think we can certainly live with what you have here.
Thanks,
Taylor