From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
Thomas Rast <trast@inf.ethz.ch>,
Piotr Krukowiecki <piotr.krukowiecki@gmail.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: git status: small difference between stating whole repository and small subdirectory
Date: Mon, 20 Feb 2012 15:17:49 -0500 [thread overview]
Message-ID: <20120220201749.GA5405@sigill.intra.peff.net> (raw)
In-Reply-To: <7vfwe5l13w.fsf@alter.siamese.dyndns.org>
On Mon, Feb 20, 2012 at 12:08:03PM -0800, Junio C Hamano wrote:
> > 4. At the end of unpack_trees, we forget about src_index, and copy
> > o->result into *o->dst_index byte for byte. I.e., we overwrite
> > the_index.cache_tree, which has been properly updated the whole
> > time,
>
> I strongly suspect that "properly updated" part needs to be thoroughly
> audited. I wouldn't be surprised that this behaviour is what we did when
> we split src_index vs dst_index when he rewrote unpack_trees() in order to
> emulate the original "unpack-trees is beyond salvation because it does not
> maintain cache tree correctly, just nuke it" behaviour.
Yep, I am also concerned about that.
> > But it does not actually insert the _destination_ tree into the cache
> > tree. Which we can do in certain situations, but only if there were no
> > paths in the tree that were left unchanged (e.g., you modify "foo", then
> > "git checkout HEAD^", which updates "bar". Your tree does not match
> > HEAD^, and must be invalidated). While it would be cool to be able to
> > handle those complex cases,...
>
> It may look cool but it may not be a good change. You are spending extra
> cycles to optimize for the next write-tree that may not happen before the
> index is further updated.
I don't think it would be too many cycles; you would have to mark each
tree you enter as having items from the left-hand tree or the right-hand
tree. If only one, you can reuse the cache-tree entry (or tree sha1, if
coming from a tree). Otherwise, you must invalidate. And it doesn't
just help the next write-tree, but any intermediate index diffs.
Of course any such change would need timings to justify it, though.
That being said, I think just invalidating really covers 99% of the
cases. What we really care about is that when I modify kernel/foo.c, the
~2300 other directories (besides "" and "kernel") don't need rebuilt,
and that is relatively simple to do. Even if doing it the other way
produced a tiny speedup, I would be concerned with the increase in code
complexity.
> > I think this implementation matches the intent of the original calls to
> > cache_tree_invalidate_path sprinkled throughout unpack-trees.c.
>
> Yes, and as long as we invalidate all the directories that need to be
> invalidated during the unpack-tree operation, I think it is a correct
> thing to do.
OK. I'll do some reading of the code to convince myself that the
unpack_trees callbacks are doing the right thing. I'm not sure of a good
automatic test that would detect a failure there. Just making test cases
is going to end up too contrived, unless we are missing something really
obvious.
I'm thinking maybe something like replaying the commit history of
linux-2.6 and making sure that each the tree generated by the cache-tree
in each case matches the actual committed tree.
> > But I
> > have to say that it seems a little odd for us to be modifying the
> > o->src_index throughout the whole thing.
>
> Yes, that part is logically *wrong*. I think it is a remnant from the
> days when there was no distinction between src_index and dst_index.
OK. I'll include a fix for that in the series I prepare.
-Peff
next prev parent reply other threads:[~2012-02-20 20:17 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-10 9:42 git status: small difference between stating whole repository and small subdirectory Piotr Krukowiecki
2012-02-10 12:33 ` Nguyen Thai Ngoc Duy
2012-02-10 13:46 ` Piotr Krukowiecki
2012-02-10 14:37 ` Nguyen Thai Ngoc Duy
2012-02-13 16:54 ` Piotr Krukowiecki
2012-02-10 16:18 ` Piotr Krukowiecki
2012-02-14 11:34 ` Thomas Rast
2012-02-15 8:57 ` Piotr Krukowiecki
2012-02-15 11:01 ` Nguyen Thai Ngoc Duy
2012-02-15 15:14 ` Piotr Krukowiecki
2012-02-16 13:22 ` Piotr Krukowiecki
2012-02-15 19:03 ` Jeff King
2012-02-16 13:37 ` Piotr Krukowiecki
2012-02-16 14:05 ` Thomas Rast
2012-02-16 20:15 ` Junio C Hamano
2012-02-17 16:55 ` Piotr Krukowiecki
2012-02-16 19:20 ` Jeff King
2012-02-17 17:19 ` Piotr Krukowiecki
2012-02-17 20:37 ` Jeff King
2012-02-17 22:25 ` Junio C Hamano
2012-02-17 22:29 ` Jeff King
2012-02-20 8:25 ` Piotr Krukowiecki
2012-02-20 14:06 ` Jeff King
2012-02-20 14:09 ` Thomas Rast
2012-02-20 14:36 ` Nguyen Thai Ngoc Duy
2012-02-20 14:39 ` Jeff King
2012-02-20 15:11 ` Jeff King
2012-02-20 18:45 ` Thomas Rast
2012-02-20 20:35 ` Jeff King
2012-02-20 22:04 ` Junio C Hamano
2012-02-20 22:41 ` Jeff King
2012-02-20 23:31 ` Junio C Hamano
2012-02-21 7:21 ` Piotr Krukowiecki
2012-02-20 20:08 ` Junio C Hamano
2012-02-20 20:17 ` Jeff King [this message]
2012-02-21 14:45 ` Nguyen Thai Ngoc Duy
2012-02-21 19:16 ` Junio C Hamano
2012-02-22 2:12 ` Nguyen Thai Ngoc Duy
2012-02-22 2:55 ` Junio C Hamano
2012-02-22 12:54 ` Nguyen Thai Ngoc Duy
2012-02-22 13:17 ` Thomas Rast
2012-02-22 10:34 ` Nguyen Thai Ngoc Duy
2012-02-22 3:32 ` Junio C Hamano
2012-04-10 15:16 ` Piotr Krukowiecki
2012-04-10 16:23 ` Junio C Hamano
2012-04-10 18:00 ` Jeff King
2012-02-20 19:57 ` Junio C Hamano
2012-02-20 19:59 ` Thomas Rast
2012-02-20 14:16 ` Nguyen Thai Ngoc Duy
2012-02-20 14:22 ` Jeff King
2012-02-20 19:56 ` Junio C Hamano
2012-02-20 20:09 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120220201749.GA5405@sigill.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pclouds@gmail.com \
--cc=piotr.krukowiecki@gmail.com \
--cc=trast@inf.ethz.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).