From: Duy Nguyen <pclouds@gmail.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: RFA: untracked cache vs git reset --hard
Date: Wed, 3 May 2017 17:54:49 +0700
Message-ID: <CACsJy8BasKLSuMuoqT1MNWbp93qxuG1Z+auiM6SaN7fBYT8sFw@mail.gmail.com>
In-Reply-To: <alpine.DEB.2.20.1705031202470.3480@virtualbox>
On Wed, May 3, 2017 at 5:27 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi all,
>
> I have a problem and would like to solicit advice how to fix it.
>
> The untracked cache has made a real difference on rather large
> repositories with tons of directories, and it is really, really useful.
>
> But. One innocuous `git reset --hard` will just blow it away.
>
> How? reset_index() calls unpack_trees() which in turn tries to populate a
> new index and then discards the old one:
>
> https://github.com/git/git/blob/v2.12.2/unpack-trees.c#L1293
>
> That discard_index() unfortunately also blows away each and every index
> extension that had been read carefully before.
This is a real problem once we introduce non-optional extensions (i.e.
those whose name is in lower case). Dropping them is not an option because
they may contain vital/original information. We don't have any so far,
but I've been wanting to add one for years (narrow clone). So I'm all
for tackling the problem now :)
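To illustrate the naming convention above, here is a minimal sketch of
the check, assuming the index-format rule that an extension is optional
when the first byte of its 4-byte signature is 'A'..'Z'. The function
name extension_is_optional() is hypothetical and not part of git.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Sketch only, not git code: an extension whose signature starts with
 * an upper-case ASCII letter is treated as optional; anything else is
 * treated as required.
 */
static bool extension_is_optional(const char *sig)
{
	assert(sig != NULL);
	return sig[0] >= 'A' && sig[0] <= 'Z';
}
```

With this rule, a reader that does not know "UNTR" could skip it, while
an unknown lower-case signature would have to be treated as an error.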
> All users of `git reset --hard` (including `git stash`) suffer this.
>
> In fact, it looks as if *any* caller of unpack_trees() would suffer the
> same problem: git-am, git-checkout, git-commit, git-merge, etc
>
> Now, I could imagine that maybe we could just "move"
> o->dst_index.untracked to o->result.untracked, and that the machinery then
> would do the right thing.
These extensions may depend on entries in o->result.cache[] (do we
allow an extension to depend on another?). If invalidation is not
handled correctly, then it's not safe to simply copy the extension
over.
For the untracked cache, I think we do invalidation right, and just moving
it over to dst_index (and resetting it to NULL in o->result so it does not
get accidentally deleted) is fine.
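A rough sketch of that move-and-reset pattern, using cut-down stand-in
structs rather than git's real definitions. The helper name
move_untracked_cache() and the generic dst/src parameters are
assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; git's real structures carry far more state. */
struct untracked_cache { int dir_count; };
struct index_state { struct untracked_cache *untracked; };

/*
 * Hand the untracked cache from 'src' to 'dst'. Clearing src->untracked
 * means a later discard of 'src' cannot free the data 'dst' now owns.
 * Assumes 'dst' holds no untracked cache of its own yet.
 */
static void move_untracked_cache(struct index_state *dst,
				 struct index_state *src)
{
	if (dst == src)
		return;
	assert(dst->untracked == NULL);
	dst->untracked = src->untracked;
	src->untracked = NULL;
}
```

The important part is the second assignment: without it both index
states would point at the same cache and discarding one would leave the
other with a dangling pointer.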
I'd rather we have a common way of dealing with this for any extension
though. Split index needs special treatment too [1]. Maybe we can add

    int migrate_index_extensions(struct index_state *dst, struct index_state *src);

in read-cache.c, which would call migrate_XXX() for each extension. In
some cases (cache-tree) we could even do more, like repairing the
cache-tree there to avoid hitting performance regressions.
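One possible shape for such a function, as a sketch only. The structs
are cut-down stand-ins, and the per-extension helpers
(migrate_untracked_cache(), migrate_cache_tree()) are hypothetical names
filling in for migrate_XXX(); none of this is existing git code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real structures. */
struct untracked_cache { int dir_count; };
struct cache_tree { int entry_count; };

struct index_state {
	struct untracked_cache *untracked;
	struct cache_tree *cache_tree;
};

/* Each helper moves one extension from src to dst; 0 means success. */
static int migrate_untracked_cache(struct index_state *dst,
				   struct index_state *src)
{
	dst->untracked = src->untracked;
	src->untracked = NULL;
	return 0;
}

static int migrate_cache_tree(struct index_state *dst,
			      struct index_state *src)
{
	/* A fuller version might repair the tree here, not just move it. */
	dst->cache_tree = src->cache_tree;
	src->cache_tree = NULL;
	return 0;
}

typedef int (*migrate_fn)(struct index_state *dst, struct index_state *src);

/* Run every per-extension helper in turn, stopping at the first failure. */
int migrate_index_extensions(struct index_state *dst, struct index_state *src)
{
	static const migrate_fn migrators[] = {
		migrate_untracked_cache,
		migrate_cache_tree,
	};
	size_t i;

	assert(dst != NULL && src != NULL && dst != src);
	for (i = 0; i < sizeof(migrators) / sizeof(migrators[0]); i++) {
		int ret = migrators[i](dst, src);
		if (ret)
			return ret;
	}
	return 0;
}
```

Keeping the helpers in one table means a new extension only needs one
more entry, and an extension that cannot be migrated safely can report
that through its return value instead of being silently dropped.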
[1] https://github.com/git/git/blob/v2.12.2/unpack-trees.c#L1165-L1167
--
Duy