git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Josh Triplett <josh@joshtriplett.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Anders Darander <anders.darander@gmail.com>,
	Petr Baudis <pasky@ucw.cz>
Subject: Re: git stash takes excessively long when many untracked files present
Date: Thu, 15 Aug 2013 11:07:36 -0700	[thread overview]
Message-ID: <20130815180736.GA4093@jtriplet-mobl1> (raw)
In-Reply-To: <7vr4durgd4.fsf@alter.siamese.dyndns.org>

On Thu, Aug 15, 2013 at 10:52:39AM -0700, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > In any case, this is a regression introduced in 'master' since the
> > last release, and the attempted fix was for an issue that has long
> > been with us, so I'll revert a7365313 (git stash: avoid data loss
> > when "git stash save" kills a directory, 2013-06-28) soon.  For
> > today's -rc3, I'm already deep into the integration cycle, so it is
> > too late to do the revert it and then redo everything.
> >
> > Then we will plan to re-apply the patch once "ls-files --killed"
> > gets fixed not to waste too much cycles needlessly, after the coming
> > release.
> 
> I've already reverted the problematic patch to "git stash" and it
> will not be part of the upcoming release.

Thanks!

> Here is a quick attempt to see if we can do better in "ls-files -k".
> 
> We have an existing test t3010.3 that tries all the combinations of
> directory turning into a regular file, symlink, etc. and vice versa,
> and it seems to pass.  The test has a directory path6 in the working
> tree without any paths in it in the index, and the added bypass code
> seems to correctly trigger and prevents us from digging into that
> directory, so this patch may be sufficient to improve "ls-files -k".
> 
> By the way, regarding the reverted commit, I do not think it is
> enough to ask "ls-files -k" to see if the state recorded in the
> current index is sufficient.  Imagine your HEAD records "path" as a
> file and then you did this:
> 
>     $ git reset --hard ;# "path" is now a regular file
>     $ mv path path.bak
>     $ mkdir path
>     $ mv path.bak path/file
>     $ git add -A ;# "path/file" in the index and in the working tree
>     $ >path/cruft ;# "path/cruft" in the working tree
> 
> Then call "save_stash" without saving untracked.  The resulting
> stash will save the contents of "path/file" but "path/cruft" is not
> recorded anywhere, and then we would need to bring the state in the
> working tree and the index back to the state recorded in HEAD, hence
> "path" needs to be turned back to a directory.
> 
> But "ls-files -k" is asked to check with the index, which has the
> path as a directory, so this case is missed.

Since git stash resets to the state in HEAD, whatever --killed check it
does needs to check against HEAD, yes.  It still doesn't need to check
any path that doesn't exist in HEAD, though; it makes more sense to
drive this from the list of files in HEAD rather than from the list of
files in the working directory, even with a filter applied to the latter
to prune bits not in HEAD.

> So instead of
> 
> 	test -n "$(git ls-files --killed | head -n 1)"
> 
> in Pasky's patch, which probably is a right thing to do if you are
> running "git stash save --keep-index", you would need something like
> this if you are not running with "--keep-index":
> 
> 	test -n "$(
>         	GIT_INDEX_FILE=tmp_index
>                 export GIT_INDEX_FILE
>                 git read-tree HEAD
>                 git ls-files -k
> 	)"
> 
> in order to make sure that the result of going back to the state in
> the HEAD will not clobber leftover "path/cruft".

Sure, that works.  However, wouldn't it make sense to just directly let
git ls-files output to the screen, then test its return value (after
adding some ls-files option to set the return value)?  Since ls-files
--killed will have no output if git stash can proceed, and since git
stash should show the list of files that'd be killed before it fails,
using the output directly makes sense.

- Josh Triplett

  reply	other threads:[~2013-08-15 18:07 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-10 21:44 git stash takes excessively long when many untracked files present Josh Triplett
2013-08-13 10:11 ` Anders Darander
2013-08-13 17:07   ` Junio C Hamano
2013-08-13 17:36     ` Anders Darander
2013-08-13 17:52       ` Junio C Hamano
2013-08-13 21:47         ` Junio C Hamano
2013-08-15 17:52           ` Junio C Hamano
2013-08-15 18:07             ` Josh Triplett [this message]
2013-08-15 18:58               ` Junio C Hamano
2013-08-15 19:47             ` Junio C Hamano
2013-08-15 21:28               ` [PATCH 0/3] Optimizing "ls-files -k" Junio C Hamano
2013-08-15 21:28                 ` [PATCH 1/3] dir.c: use the cache_* macro to access the current index Junio C Hamano
2013-08-15 21:28                 ` [PATCH 2/3] ls-files -k: a directory only can be killed if the index has a non-directory Junio C Hamano
2013-08-15 21:28                 ` [PATCH 3/3] t3010: update to demonstrate "ls-files -k" optimization pitfalls Junio C Hamano
2013-08-15 23:30                 ` [PATCH 4/3] git stash: avoid data loss when "git stash save" kills a directory Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130815180736.GA4093@jtriplet-mobl1 \
    --to=josh@joshtriplett.org \
    --cc=anders.darander@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pasky@ucw.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).