From: Jeff King <peff@peff.net>
To: David Turner <dturner@twopensource.com>
Cc: git mailing list <git@vger.kernel.org>
Subject: Re: git reset for index restoration?
Date: Thu, 22 May 2014 14:23:03 -0400
Message-ID: <20140522182303.GA1167@sigill.intra.peff.net>
In-Reply-To: <1400782096.18134.1.camel@stross>

On Thu, May 22, 2014 at 02:08:16PM -0400, David Turner wrote:
> On Thu, 2014-05-22 at 12:46 -0400, Jeff King wrote:
> > On Thu, May 22, 2014 at 12:22:43PM -0400, David Turner wrote:
> >
> > > If I have a git repository with a clean working tree, and I delete the
> > > index, then I can use git reset (with no arguments) to recreate it.
> > > However, when I do recreate it, it doesn't come back the same. I have
> > > not analyzed this in detail, but the effect is that commands like git
> > > status take much longer because they must read objects out of a pack
> > > file. In other words, the index seems to not realize that the index (or
> > > at least most of it) represents the same state as HEAD. If I do git
> > > reset --hard, the index is restored to the original state (it's
> > > byte-for-byte identical), and the pack file is no longer read.
> >
> > Are you sure it's reading a packfile?
>
> Well, it's calling inflate(), and strace says it is reading
> e.g. .git/objects/pack/pack-....{idx,pack}.
>
> So, I would say so.

It seems odd that we would be spending extra time there. We do
inflate() the trees in order to diff the index against HEAD, but we
shouldn't need to inflate any blobs.

Here it is for me (on linux.git):

  [before, warm cache]
  $ time perf record -q git status >/dev/null
  real    0m0.192s
  user    0m0.080s
  sys     0m0.108s

  $ perf report | grep -v '#' | head -5
     7.46%  git  [kernel.kallsyms]  [k] __d_lookup_rcu
     4.55%  git  libz.so.1.2.8      [.] inflate
     3.53%  git  libc-2.18.so       [.] __memcmp_sse4_1
     3.46%  git  [kernel.kallsyms]  [k] security_inode_getattr
     3.29%  git  git                [.] memihash

  $ time git reset
  real    0m0.080s
  user    0m0.036s
  sys     0m0.040s

So status is pretty quick, and the time is going to lstat in the kernel,
and some tree inflation. Reset is fast, because it has nothing much to
do. Now let's kill off the index's stat cache:

  $ rm .git/index
  $ time perf record -q git reset
  real    0m0.967s
  user    0m0.780s
  sys     0m0.180s

That took a while. What was it doing?

  $ perf report | grep -v '#' | head -5
     3.23%  git  [kernel.kallsyms]   [k] copy_user_enhanced_fast_string
     1.74%  git  libcrypto.so.1.0.0  [.] 0x000000000007e010
     1.60%  git  [kernel.kallsyms]   [k] __d_lookup_rcu
     1.51%  git  [kernel.kallsyms]   [k] page_fault
     1.44%  git  libc-2.18.so        [.] __memcmp_sse4_1

Reading files and sha1. We hash the working-tree files here (reset
doesn't technically need to refresh the index from the working tree to
copy entries from HEAD into the index, but it does it so it can do fancy
things like tell you about which files are now out-of-date).

Now how does status fare after this?

  $ time perf record -q git status >/dev/null
  real    0m0.189s
  user    0m0.088s
  sys     0m0.096s

Looks about the same as before to me.

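For anyone without a linux.git checkout at hand, here is a minimal
sketch of the same sequence on a throwaway repository. The temporary
directory, file name, and committer identity are placeholders for
illustration only, and a one-file repository will not show meaningful
timings. It only shows that a plain "git reset" brings the index back
with its cached stat fields filled in:

```shell
set -e

# Throwaway repository; the path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo hello >file.txt
git add file.txt
git commit -q -m initial

# Kill off the index, then let a plain "git reset" rebuild it from HEAD.
rm .git/index
git reset >/dev/null

# "git ls-files --debug" prints the cached stat fields of each entry.
# After the reset they should be populated rather than zeroed.
reset_debug=$(git ls-files --debug)
printf '%s\n' "$reset_debug"

# With a refreshed index and a clean working tree, status reports nothing.
reset_status=$(git status --porcelain)
```

On a large repository, prefixing the reset and status commands with
"time" should reproduce the general shape of the numbers above.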
Note that if you use "read-tree" instead of "reset", it _just_ loads the
index, and doesn't touch the working tree. If you then run "git status",
then _that_ command has to refresh the index, and it will pay the
hashing cost. Like:

  $ rm .git/index
  $ time git read-tree HEAD
  real    0m0.084s
  user    0m0.064s
  sys     0m0.016s

  $ time git status >/dev/null
  real    0m0.833s
  user    0m0.712s
  sys     0m0.112s

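The difference is visible in the index itself. As a rough sketch
(again on a placeholder throwaway repository, with a made-up file name
and identity), "git ls-files --debug" should show zeroed stat fields
right after read-tree and populated ones once status has refreshed and
rewritten the index:

```shell
set -e

# Throwaway repository; the path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo hello >file.txt
git add file.txt
git commit -q -m initial

# read-tree only copies HEAD's entries into a fresh index; it does not
# look at the working tree, so the cached stat fields stay zeroed.
rm .git/index
git read-tree HEAD
readtree_debug=$(git ls-files --debug)

# The next status has to refresh those entries (hashing the files), and
# it writes the refreshed index back when it can take the index lock.
git status >/dev/null
status_debug=$(git ls-files --debug)

printf 'after read-tree:\n%s\n\nafter status:\n%s\n' \
    "$readtree_debug" "$status_debug"
```

This assumes status is allowed to update the index; with optional
locks disabled (for example GIT_OPTIONAL_LOCKS=0) the refreshed stat
data would not be written back.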
All of this is behaving as I would expect. Can you show us a set of
commands that deviate from this?
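One possible way to capture such a comparison is sketched below. It
assumes a throwaway repository is enough to show the pattern; the
setup lines, file name, and identity are placeholders, and on a real
repository only the snapshot and comparison steps would be needed:

```shell
set -e

# Throwaway repository; the path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
echo hello >file.txt
git add file.txt
git commit -q -m initial

# Snapshot the index before removing it: the staged entries, and the
# cached stat data that "git ls-files --debug" exposes.
before_entries=$(git ls-files -s)
before_debug=$(git ls-files --debug)

rm .git/index
git reset >/dev/null

after_entries=$(git ls-files -s)
after_debug=$(git ls-files --debug)

# Mode, object name, stage, and path of each entry come from HEAD.
if [ "$before_entries" = "$after_entries" ]; then
    echo "index entries: identical"
else
    echo "index entries: differ"
fi

# Any difference in the cached stat data would show up here.
if [ "$before_debug" = "$after_debug" ]; then
    echo "stat cache: identical"
else
    echo "stat cache: differs"
fi
```

Posting the two "--debug" listings along with the timings would make
it easier to see where a restored index deviates from the original.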
-Peff