From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: Anders Darander <anders.darander@gmail.com>,
Petr Baudis <pasky@ucw.cz>, Josh Triplett <josh@joshtriplett.org>
Subject: Re: git stash takes excessively long when many untracked files present
Date: Thu, 15 Aug 2013 10:52:39 -0700 [thread overview]
Message-ID: <7vr4durgd4.fsf@alter.siamese.dyndns.org> (raw)
In-Reply-To: <7v61v9w9dy.fsf@alter.siamese.dyndns.org> (Junio C. Hamano's message of "Tue, 13 Aug 2013 14:47:37 -0700")
Junio C Hamano <gitster@pobox.com> writes:
> In any case, this is a regression introduced in 'master' since the
> last release, and the attempted fix was for an issue that has long
> been with us, so I'll revert a7365313 (git stash: avoid data loss
> when "git stash save" kills a directory, 2013-06-28) soon. For
> today's -rc3, I'm already deep into the integration cycle, so it is
> too late to do the revert it and then redo everything.
>
> Then we will plan to re-apply the patch once "ls-files --killed"
> gets fixed not to waste too much cycles needlessly, after the coming
> release.
I've already reverted the problematic patch to "git stash" and it
will not be part of the upcoming release.
Here is a quick attempt to see if we can do better in "ls-files -k".
We have an existing test t3010.3 that tries all the combinations of
directory turning into a regular file, symlink, etc. and vice versa,
and it seems to pass. The test has a directory path6 in the working
tree without any paths in it in the index, and the added bypass code
seems to correctly trigger and prevents us from digging into that
directory, so this patch may be sufficient to improve "ls-files -k".
By the way, regarding the reverted commit, I do not think it is
enough to ask "ls-files -k" to see if the state recorded in the
current index is sufficient. Imagine your HEAD records "path" as a
file and then you did this:
$ git reset --hard ;# "path" is now a regular file
$ mv path path.bak
$ mkdir path
$ mv path.bak path/file
$ git add -A ;# "path/file" in the index and in the working tree
$ >path/cruft ;# "path/cruft" in the working tree
Then call "save_stash" without saving untracked. The resulting
stash will save the contents of "path/file" but "path/cruft" is not
recorded anywhere, and then we would need to bring the state in the
working tree and the index back to the state recorded in HEAD, hence
"path" needs to be turned back to a directory.
But "ls-files -k" is asked to check with the index, which has the
path as a directory, so this case is missed.
So instead of
test -n "$(git ls-files --killed | head -n 1)"
in Pasky's patch, which probably is a right thing to do if you are
running "git stash save --keep-index", you would need something like
this if you are not running with "--keep-index":
test -n "$(
GIT_INDEX_FILE=tmp_index
export GIT_INDEX_FILE
git read-tree HEAD
git ls-files -k
)"
in order to make sure that the result of going back to the state in
the HEAD will not clobber leftover "path/cruft".
builtin/ls-files.c | 2 ++
dir.c | 9 +++++++++
dir.h | 3 ++-
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 5cf3e31..8500446 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -219,6 +219,8 @@ static void show_files(struct dir_struct *dir)
/* For cached/deleted files we don't need to even do the readdir */
if (show_others || show_killed) {
+ if (!show_others)
+ dir->flags |= DIR_COLLECT_KILLED_ONLY;
fill_directory(dir, pathspec);
if (show_others)
show_other_files(dir);
diff --git a/dir.c b/dir.c
index 910bfcd..02939e2 100644
--- a/dir.c
+++ b/dir.c
@@ -1183,6 +1183,15 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
cache_name_exists(path->buf, path->len, ignore_case))
return path_none;
+ /*
+ * A directory can only contain killed files if the index
+ * has a path that wants it to be a non-directory.
+ */
+ if ((dir->flags & DIR_COLLECT_KILLED_ONLY) &&
+ (dtype == DT_DIR) &&
+ !cache_name_exists(path->buf, path->len, ignore_case))
+ return path_none;
+
exclude = is_excluded(dir, path->buf, &dtype);
/*
diff --git a/dir.h b/dir.h
index 3d6b80c..4677b86 100644
--- a/dir.h
+++ b/dir.h
@@ -80,7 +80,8 @@ struct dir_struct {
DIR_HIDE_EMPTY_DIRECTORIES = 1<<2,
DIR_NO_GITLINKS = 1<<3,
DIR_COLLECT_IGNORED = 1<<4,
- DIR_SHOW_IGNORED_TOO = 1<<5
+ DIR_SHOW_IGNORED_TOO = 1<<5,
+ DIR_COLLECT_KILLED_ONLY = 1<<6
} flags;
struct dir_entry **entries;
struct dir_entry **ignored;
next prev parent reply other threads:[~2013-08-15 17:53 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-10 21:44 git stash takes excessively long when many untracked files present Josh Triplett
2013-08-13 10:11 ` Anders Darander
2013-08-13 17:07 ` Junio C Hamano
2013-08-13 17:36 ` Anders Darander
2013-08-13 17:52 ` Junio C Hamano
2013-08-13 21:47 ` Junio C Hamano
2013-08-15 17:52 ` Junio C Hamano [this message]
2013-08-15 18:07 ` Josh Triplett
2013-08-15 18:58 ` Junio C Hamano
2013-08-15 19:47 ` Junio C Hamano
2013-08-15 21:28 ` [PATCH 0/3] Optimizing "ls-files -k" Junio C Hamano
2013-08-15 21:28 ` [PATCH 1/3] dir.c: use the cache_* macro to access the current index Junio C Hamano
2013-08-15 21:28 ` [PATCH 2/3] ls-files -k: a directory only can be killed if the index has a non-directory Junio C Hamano
2013-08-15 21:28 ` [PATCH 3/3] t3010: update to demonstrate "ls-files -k" optimization pitfalls Junio C Hamano
2013-08-15 23:30 ` [PATCH 4/3] git stash: avoid data loss when "git stash save" kills a directory Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7vr4durgd4.fsf@alter.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=anders.darander@gmail.com \
--cc=git@vger.kernel.org \
--cc=josh@joshtriplett.org \
--cc=pasky@ucw.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).