git.vger.kernel.org archive mirror
* [PATCH] read_directory: avoid invoking exclude machinery on tracked files
@ 2013-02-15 14:17 Nguyễn Thái Ngọc Duy
  2013-02-15 16:52 ` Junio C Hamano
  2013-02-16  7:17 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
  0 siblings, 2 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-02-15 14:17 UTC
  To: git
  Cc: Junio C Hamano, Karsten Blees, kusmabite, Ramkumar Ramachandra,
	Robert Zeh, finnag, Nguyễn Thái Ngọc Duy

read_directory() (and its friendly wrapper fill_directory()) collects
untracked/ignored files by traversing the whole worktree (*), feeding
every entry to treat_one_path(), where each entry is checked against
.gitignore patterns.

Tracked files cannot be excluded, so there is no need to run them
through the exclude machinery. On repos with many .gitignore patterns
and/or a lot of tracked files, this unnecessary processing can become
expensive.

This patch skips that processing in the common case. Directories are
still processed as before. DIR_SHOW_IGNORED and DIR_COLLECT_IGNORED
are normally set only when certain options are given (e.g. "checkout
--overwrite-ignore", "add -f"...), so the penalty is still paid in
those cases, just not as often as before.

git status   | webkit linux-2.6 libreoffice-core gentoo-x86
-------------+----------------------------------------------
before       | 1.159s    0.226s           0.415s     0.597s
after        | 0.778s    0.176s           0.266s     0.556s
nr. patterns |    89       376               19          0
nr. tracked  |   182k       40k              63k       101k

(*) Not completely true. read_directory may skip recursing into a
    directory if it's entirely excluded and DIR_SHOW_OTHER_DIRECTORIES
    is not set.

Tracked-down-by: Karsten Blees <karsten.blees@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 For reference:
 http://thread.gmane.org/gmane.comp.version-control.git/215820/focus=216195
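
 The short-circuit described in the commit message can be sketched in
 isolation, outside of git. This is only an illustration under stated
 assumptions, not the code in dir.c: every sketch_* name and SKETCH_*
 constant below is hypothetical, the "index" is a fixed array standing
 in for cache_name_exists(), and the matcher is a toy standing in for
 is_excluded(). A counter records how often the matcher runs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for git's DIR_* flags; the values are illustrative. */
enum {
	SKETCH_SHOW_IGNORED    = 1 << 0,
	SKETCH_COLLECT_IGNORED = 1 << 1
};

enum sketch_treatment { SKETCH_PATH_IGNORED, SKETCH_PATH_HANDLED };

/* Toy "index": a fixed list of tracked paths (assumption, not git's index). */
static const char *sketch_tracked[] = { "Makefile", "dir.c", "dir.h" };

/* Counts how many times the (expensive) pattern matcher is invoked. */
static int sketch_exclude_calls;

static int sketch_is_tracked(const char *path)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_tracked) / sizeof(sketch_tracked[0]); i++)
		if (!strcmp(sketch_tracked[i], path))
			return 1;
	return 0;
}

/* Toy matcher: treats any path ending in ".o" as excluded. */
static int sketch_is_excluded(const char *path)
{
	size_t len = strlen(path);

	sketch_exclude_calls++;
	return len >= 2 && !strcmp(path + len - 2, ".o");
}

enum sketch_treatment sketch_treat_path(unsigned flags, const char *path,
					int is_dir)
{
	assert(path != NULL);

	/*
	 * Fast path: a tracked non-directory is never reported as
	 * untracked, so skip the matcher unless ignored entries
	 * were explicitly requested.
	 */
	if (!(flags & (SKETCH_SHOW_IGNORED | SKETCH_COLLECT_IGNORED)) &&
	    !is_dir && sketch_is_tracked(path))
		return SKETCH_PATH_IGNORED;

	/* Slow path: consult the exclude patterns as before. */
	if (sketch_is_excluded(path) && !(flags & SKETCH_SHOW_IGNORED))
		return SKETCH_PATH_IGNORED;

	return SKETCH_PATH_HANDLED;
}
```

 With no flags set, sketch_treat_path(0, "dir.c", 0) returns without
 touching the matcher, while an untracked path such as "foo.o" still
 goes through it. Passing SKETCH_SHOW_IGNORED disables the fast path,
 which mirrors the cases where the penalty is still paid.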

 dir.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/dir.c b/dir.c
index 57394e4..bdff256 100644
--- a/dir.c
+++ b/dir.c
@@ -1244,7 +1244,19 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 					  const struct path_simplify *simplify,
 					  int dtype, struct dirent *de)
 {
-	int exclude = is_excluded(dir, path->buf, &dtype);
+	int exclude;
+
+	if (dtype == DT_UNKNOWN)
+		dtype = get_dtype(de, path->buf, path->len);
+
+	if (!(dir->flags & DIR_SHOW_IGNORED) &&
+	    !(dir->flags & DIR_COLLECT_IGNORED) &&
+	    dtype != DT_DIR &&
+	    cache_name_exists(path->buf, path->len, ignore_case))
+		return path_ignored;
+
+	exclude = is_excluded(dir, path->buf, &dtype);
+
 	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
 	    && exclude_matches_pathspec(path->buf, path->len, simplify))
 		dir_add_ignored(dir, path->buf, path->len);
@@ -1256,9 +1268,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	if (exclude && !(dir->flags & DIR_SHOW_IGNORED))
 		return path_ignored;
 
-	if (dtype == DT_UNKNOWN)
-		dtype = get_dtype(de, path->buf, path->len);
-
 	switch (dtype) {
 	default:
 		return path_ignored;
-- 
1.8.1.2.536.gf441e6d

Thread overview: 12+ messages
2013-02-15 14:17 [PATCH] read_directory: avoid invoking exclude machinery on tracked files Nguyễn Thái Ngọc Duy
2013-02-15 16:52 ` Junio C Hamano
2013-02-15 18:30   ` Duy Nguyen
2013-02-15 19:32     ` Junio C Hamano
2013-02-16  3:31       ` Duy Nguyen
2013-02-18 16:42       ` Karsten Blees
2013-02-16  7:17 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2013-02-16 18:11   ` Pete Wyckoff
2013-02-17  4:39     ` Duy Nguyen
2013-02-17 15:49       ` Pete Wyckoff
2013-02-17 23:18   ` Junio C Hamano
2013-02-25 22:01   ` [PATCH/RFC] dir.c: Make git-status --ignored even more consistent Karsten Blees
