From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Nicolas Pitre <nico@cam.org>,
Joshua Jensen <jjensen@workspacewhiz.com>,
git@vger.kernel.org
Subject: Re: Reverting an uncommitted revert
Date: Wed, 20 May 2009 10:50:39 -0700
Message-ID: <7vfxezzms0.fsf@alter.siamese.dyndns.org>
In-Reply-To: <20090520032139.GB10212@coredump.intra.peff.net> (Jeff King's message of "Tue, 19 May 2009 23:21:39 -0400")
Jeff King <peff@peff.net> writes:
> Related to this, I have wondered if it might be useful to have an "index
> reflog". If I do something like this:
>
> $ git add foo
> $ hack hack hack
> $ git add foo
>
> Then the first added state of "foo" is available in the object database,
> but it is not connected to the name "foo" in any way, which makes it
> much harder to find. If we had a reflog pointing to trees representing
> the index state after each change, then it would be simple (you could
> look at "INDEX@{1}:foo" or similar).
>
> I don't know if the performance is an issue. We are writing an extra
> tree every time we touch the index, but in many cases you are already
> writing a blob.
It is not just "an extra tree every time". For example, in the kernel
repository, one of the deepest paths [*1*] (i.e. one whose modification
affects the largest number of trees) is:
    arch/cris/include/arch-v32/mach-a3/mach/hwregs/iop/asm/iop_reg_space_asm.h
If you modify this file and then "git add", and if you write-tree the
index at that point, you need to write a tree object for ".", arch/,
arch/cris, ..., arch/cris/include/arch-v32/mach-a3/mach/hwregs/iop/asm, 10
trees in total (if I am counting them right ;-).
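A quick way to double-check that count is sketched below (the variable
names are only illustrative): every "/" in the path is one directory
level, and the top-level tree adds one more.

```shell
path=arch/cris/include/arch-v32/mach-a3/mach/hwregs/iop/asm/iop_reg_space_asm.h

# Keep only the slashes and count them: one per directory level.
slashes=$(printf '%s' "$path" | tr -cd '/' | wc -c)

# Add one for the top-level tree (".").
trees=$((slashes + 1))
echo "$trees"
```

For the path above this counts 9 slashes, hence 10 trees.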
If your cache-tree is fresh (and if you "git write-tree" every time you
"git add", that will make it stay fresh), you do not have to recompute
the object names of the other 1728 tree objects (they are unchanged)
[*2*], which should help somewhat, but the majority of the time is spent
in the I/O (and perhaps slow fsync on ext3 ;-) of writing these 10 tree
objects [*3*].
People like Shawn who work with Java projects, where the tree hierarchy
tends to be (unnecessarily) deep with prefixes like org/spearce/jgit due
to namespace issues, will have a bigger overhead than a relatively
shallow project like git.git itself.
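To get a feel for the cost on your own project, the "index reflog" idea
quoted at the top can be emulated by hand. This is only a sketch under
assumptions: the ref name refs/index-log, the helper snapshot_index and
the scratch repository are made up for illustration, git has no built-in
INDEX@{n} syntax, and it needs a git recent enough to have
"update-ref --create-reflog".

```shell
# Scratch repository so nothing real is touched (hypothetical setup).
repo=$(mktemp -d) && cd "$repo" && git init -q

# Identity recorded in the reflog entries (illustrative values).
export GIT_COMMITTER_NAME="A U Thor" GIT_COMMITTER_EMAIL="author@example.com"

# Hypothetical helper: write the index out as a tree and record that
# tree in the reflog of a ref reserved for this purpose.
snapshot_index () {
	git update-ref --create-reflog -m "index snapshot" \
		refs/index-log "$(git write-tree)"
}

echo one >foo && git add foo && snapshot_index
echo two >foo && git add foo && snapshot_index

# The first added state of "foo" can be looked up by name again.
first=$(git cat-file blob "refs/index-log@{1}:foo")
echo "$first"
```

Running the helper under /usr/bin/time after each "git add" in a deep
hierarchy would show how much the extra write-tree costs there.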
[Footnotes]
*1* You can find it out yourself with...
    git ls-files "$(
        git ls-files |
        sed -e 's|[^/]||g' |
        sort -u |
        tail -n 1 |
        sed -e 's|/|*/|g' -e 's/$/*/'
    )" | head -n 1
*2* The total number of tree objects in a commit is...
    echo $(git ls-tree -r -d HEAD | wc -l) 1 + p | dc
*3* write-tree with or without help from cache-tree in the kernel
repository with a hot cache (we are talking about running "git write-tree"
every time you do "git add" so the cold cache case does not matter) looks
like this:
    $ l=arch/cris/include/arch-v32/mach-a3/mach/hwregs/iop/asm/iop_reg_space_asm.h
    $ echo >>$l && git add $l
    $ /usr/bin/time git write-tree
    04bc92c40a5d0f0d44e162e140cb00964a52046b
    0.02user 0.01system 0:00.03elapsed 102%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6387minor)pagefaults 0swaps
    $ git reset --hard
    $ echo >>$l && git add $l
    $ /usr/bin/time git write-tree --ignore-cache-tree
    04bc92c40a5d0f0d44e162e140cb00964a52046b
    0.13user 0.04system 0:00.17elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+5336outputs (0major+17141minor)pagefaults 0swaps
(The numbers are from my Athlon(tm) 64 X2 3800+ with slow IDE disks).