From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 0/3] Using a bit more decoration
Date: Mon, 8 Apr 2013 23:51:26 -0400
Message-ID: <20130409035126.GA17319@sigill.intra.peff.net>
In-Reply-To: <1365462114-8630-1-git-send-email-gitster@pobox.com>
On Mon, Apr 08, 2013 at 04:01:51PM -0700, Junio C Hamano wrote:
> This comes on top of the two "decorate" patches that introduced the
> clear_decoration() function.
> [...]
> Junio C Hamano (3):
> commit: shrink "indegree" field
> commit: rename parse_commit_date()
> commit: add get_commit_encoding()
Neat. Reading through, I didn't notice anything obviously wrong with any
of the patches (though there is one gcc warning, which I'll respond to
separately).
It does make me a little nervous to have code that almost never gets
exercised (i.e., when indegree is really high, or there are a large number
of encodings). It seems like a bug waiting to happen when somebody does hit
that condition. But the decoration lookups do look fairly simple and
easy to get right.
> I am not sure if pre-parsing and sharing the encoding values in-core
> would affect performance very much, but now we have 2 bytes (or 6
> bytes, if you are on 64-bit archs) free to use in "struct commit"
> for later use (e.g. extra object flags that are relevant only for
> commit objects, without having to widen "struct object").
I wasn't able to measure any speedup, but I'm not surprised; something
like "git log" already spends way more effort extracting and
pretty-printing the commit objects.
I was actually thinking of something related to this recently. It would
be nice to be able to allocate a slab of very efficient fixed-size
per-commit or per-object storage. That would eliminate the need to pay
the cost for "indegree" all the time; the topo-sort could allocate a
slab of indegree integers, then throw it away when the sort was done.
Similarly, a traversal would not have to worry about pollution of the
object flags, nor about using an arbitrarily large number of flags[1],
if it could allocate a separate flag structure.
Using a separate hash would be too slow. I'm thinking more like giving
each commit an integer "id" as it is parsed, and then using that id to
index into a temporary slab (you'd grow the slab as you see more commits,
but that's OK, since we're always indexing it, not using a pointer). The
slab management would get amortized, so the main cost would be the extra
indirect memory access (i.e., hitting "flags[commit->id]" instead of
"commit->flags"), and the extra memory used by the id field.
I dunno. Maybe those costs would make it too slow. I haven't actually
coded or measured anything yet.
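
To make the slab idea above a bit more concrete, here is a rough sketch of
what an id-indexed slab of per-commit integers might look like. None of this
exists in git as of this message: "struct commit" below is a hypothetical
stand-in that carries only the assumed "id" field, and the names "int_slab",
"int_slab_at" and "int_slab_clear" are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's struct commit; only the assumed id matters. */
struct commit {
	unsigned int id; /* assumed to be handed out sequentially at parse time */
};

/* A growable slab of per-commit ints, indexed by commit->id. */
struct int_slab {
	int *data;
	size_t alloc; /* number of slots currently allocated */
};

/*
 * Return a pointer to the slot for commit c, growing the slab if needed.
 * New slots are zero-filled. The pointer is only valid until the next
 * call that grows the slab, so callers should re-index each time
 * rather than hold on to it.
 */
static int *int_slab_at(struct int_slab *s, const struct commit *c)
{
	if (c->id >= s->alloc) {
		size_t new_alloc = s->alloc ? s->alloc : 64;
		int *p;

		/* Doubling keeps the cost of growth amortized. */
		while (new_alloc <= c->id)
			new_alloc *= 2;
		p = realloc(s->data, new_alloc * sizeof(*p));
		if (!p)
			abort();
		memset(p + s->alloc, 0, (new_alloc - s->alloc) * sizeof(*p));
		s->data = p;
		s->alloc = new_alloc;
	}
	return &s->data[c->id];
}

/* Throw the whole slab away once the traversal is done. */
static void int_slab_clear(struct int_slab *s)
{
	free(s->data);
	s->data = NULL;
	s->alloc = 0;
}
```

A topo-sort could then bump "*int_slab_at(&indegree, commit)" while it walks
and call int_slab_clear() at the end, so nobody else pays for the field. The
extra indirection through s->data is exactly the cost in question, and this
sketch says nothing about whether it is cheap enough.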
-Peff
[1] The thing that made me think about this was making a version of
paint_down_to_common that could keep a separate color for each
source, rather than only PARENT1 and PARENT2. That would let us do
the "--contains" traversal in a breadth-first way, but calculate the
answer simultaneously for all sources (i.e., avoid the depth-first
"go to roots" problem that the current "tag --contains" has if you
do not use timestamp cutoffs).
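
    As a very rough illustration of the "one color per source" idea in
    [1], the sketch below paints a toy graph with a bitmask per commit,
    one bit per source. It is not paint_down_to_common: it uses a plain
    FIFO worklist rather than a date-ordered queue, has no cutoffs, and
    every name in it (toy_graph, paint_sources, MAX_COMMITS,
    MAX_PARENTS) is invented for the example.

```c
#include <stdint.h>
#include <string.h>

#define MAX_COMMITS 8
#define MAX_PARENTS 2

/* Toy commit graph: parents[i] holds the ids of commit i's parents, -1 for none. */
struct toy_graph {
	int nr;
	int parents[MAX_COMMITS][MAX_PARENTS];
};

/*
 * Paint every commit with the set of sources that can reach it.
 * After the call, bit k of colors[i] is set iff sources[k] is commit i
 * or one of its descendants. Supports at most 32 sources.
 */
static void paint_sources(const struct toy_graph *g,
			  const int *sources, int nr_sources,
			  uint32_t *colors)
{
	int queue[MAX_COMMITS]; /* circular FIFO of commit ids */
	unsigned char queued[MAX_COMMITS] = { 0 };
	int head = 0, count = 0, i;

	memset(colors, 0, g->nr * sizeof(*colors));
	for (i = 0; i < nr_sources; i++) {
		int c = sources[i];

		colors[c] |= (uint32_t)1 << i;
		if (!queued[c]) {
			queue[(head + count++) % MAX_COMMITS] = c;
			queued[c] = 1;
		}
	}

	while (count) {
		int c = queue[head];

		head = (head + 1) % MAX_COMMITS;
		count--;
		queued[c] = 0;

		for (i = 0; i < MAX_PARENTS; i++) {
			int p = g->parents[c][i];

			if (p < 0)
				continue;
			if ((colors[p] | colors[c]) == colors[p])
				continue; /* parent already has all of our colors */
			colors[p] |= colors[c];
			if (!queued[p]) {
				queue[(head + count++) % MAX_COMMITS] = p;
				queued[p] = 1;
			}
		}
	}
}
```

    With the tags as sources, "which tags contain commit X" is then just
    the set of bits in colors[X], computed for all tags in one pass. A
    commit may be revisited when new colors arrive, but colors only ever
    gain bits, so the loop terminates.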