From: Jeff King <peff@peff.net>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org, Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH 4/7] for_each_packed_object: support iterating in pack-order
Date: Thu, 16 Aug 2018 13:36:39 -0400 [thread overview]
Message-ID: <20180816173639.GB882@sigill.intra.peff.net> (raw)
In-Reply-To: <2ac79ea0-dd58-965d-1158-d08a7731e25d@gmail.com>
On Wed, Aug 15, 2018 at 09:28:44AM -0400, Derrick Stolee wrote:
> On 8/10/2018 7:15 PM, Jeff King wrote:
> > diff --git a/commit-graph.c b/commit-graph.c
> > index b0a55ad128..69a0d1c203 100644
> > --- a/commit-graph.c
> > +++ b/commit-graph.c
> > @@ -730,7 +730,7 @@ void write_commit_graph(const char *obj_dir,
> > die("error adding pack %s", packname.buf);
> > if (open_pack_index(p))
> > die("error opening index for %s", packname.buf);
> > - for_each_object_in_pack(p, add_packed_commits, &oids);
> > + for_each_object_in_pack(p, add_packed_commits, &oids, 0);
> > close_pack(p);
> > }
>
> This use in write_commit_graph() is actually a good candidate for
> pack-order, since we are checking each object to see if it is a commit. This
> is only used when running `git commit-graph write --stdin-packs`, which is
> how VFS for Git maintains the commit-graph.
>
> I have a note to run performance tests on this case and follow up with a
> change on top of this series that adds the FOR_EACH_OBJECT_PACK_ORDER flag.
I doubt that it will show the dramatic improvement in CPU that I
mentioned in my commit message, because most of that comes from more
efficient use of the delta cache. But it's very rare for commits to be
deltas (usually it's just almost-twins due to cherry-picks and rebases).
So you may benefit from block cache efficiency on a cold-cache or on a
system under memory pressure, but I wouldn't expect much change at all
for the warm-cache case.
I doubt it will hurt, though; you'll pay for the revindex generation,
but that's probably not a big deal compared to walking all the objects.
One thing you _could_ do is stop walking through the pack when you see a
non-commit, since we stick all of the commits at the front. But that's
just what the code happens to do, and not a strict promise. So I think
it's a bad idea to rely on it (and in fact the delta-islands work under
discussion elsewhere will break that assumption).
-Peff
next prev parent reply other threads:[~2018-08-16 17:36 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-10 23:07 [PATCH 0/7] speeding up cat-file by reordering object access Jeff King
2018-08-10 23:09 ` [PATCH 1/7] for_each_*_object: store flag definitions in a single location Jeff King
2018-08-10 23:27 ` Stefan Beller
2018-08-10 23:31 ` Jeff King
2018-08-10 23:33 ` Jeff King
2018-08-10 23:39 ` Stefan Beller
2018-08-11 0:33 ` Jeff King
2018-08-10 23:09 ` [PATCH 2/7] for_each_*_object: take flag arguments as enum Jeff King
2018-08-10 23:11 ` [PATCH 3/7] for_each_*_object: give more comprehensive docstrings Jeff King
2018-08-10 23:15 ` [PATCH 4/7] for_each_packed_object: support iterating in pack-order Jeff King
2018-08-15 13:28 ` Derrick Stolee
2018-08-16 17:36 ` Jeff King [this message]
2018-08-10 23:16 ` [PATCH 5/7] t1006: test cat-file --batch-all-objects with duplicates Jeff King
2018-08-10 23:17 ` [PATCH 6/7] cat-file: rename batch_{loose,packed}_object callbacks Jeff King
2018-08-10 23:24 ` [PATCH 7/7] cat-file: support "unordered" output for --batch-all-objects Jeff King
2018-08-13 18:45 ` [PATCH 0/7] speeding up cat-file by reordering object access Jonathan Tan
2018-08-14 18:13 ` [PATCH 0/4] finishing touches on jk/for-each-object-iteration Jeff King
2018-08-14 18:14 ` [PATCH 1/4] cat-file: use oidset check-and-insert Jeff King
2018-08-14 18:18 ` [PATCH 2/4] cat-file: split batch "buf" into two variables Jeff King
2018-08-14 18:20 ` [PATCH 3/4] cat-file: use a single strbuf for all output Jeff King
2018-08-14 19:30 ` René Scharfe
2018-08-14 19:39 ` Jeff King
2018-08-14 18:21 ` [PATCH 4/4] for_each_*_object: move declarations to object-store.h Jeff King
2018-08-15 14:05 ` [PATCH 0/7] speeding up cat-file by reordering object access Derrick Stolee
2018-08-16 17:39 ` Jeff King
2018-08-16 18:52 ` Junio C Hamano
2018-08-16 19:45 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180816173639.GB882@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=jonathantanmy@google.com \
--cc=stolee@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).