From: Karsten Blees <karsten.blees@gmail.com>
To: Git List <git@vger.kernel.org>
Cc: Karsten Blees <karsten.blees@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Erik Faye-Lund <kusmabite@gmail.com>,
	Ramkumar Ramachandra <artagnon@gmail.com>,
	Robert Zeh <robert.allan.zeh@gmail.com>,
	Duy Nguyen <pclouds@gmail.com>,
	Antoine Pelisse <apelisse@gmail.com>,
	Adam Spiers <git@adamspiers.org>
Subject: [PATCH v2 13/14] dir.c: git-status --ignored: don't scan the work tree three times
Date: Mon, 15 Apr 2013 21:14:22 +0200
Message-ID: <516C518E.1000405@gmail.com>
In-Reply-To: <516C4F27.30203@gmail.com>

'git-status --ignored' recursively scans directories up to three times:

 1. To collect untracked files.

 2. To collect ignored files.

 3. When collecting ignored files, to check that an untracked directory
    that potentially contains ignored files doesn't also contain untracked
    files (i.e. isn't already listed as untracked).

Let's get rid of case 3 first.

Currently, read_directory_recursive effectively returns a boolean telling
whether a directory contains the requested files (it actually returns the
number of files, but no caller needs the count), and DIR_SHOW_IGNORED
specifies what we're looking for.

To be able to test for both untracked and ignored files in a single scan,
we need to return a bit more info, and the result must be independent of
the DIR_SHOW_IGNORED flag.

Reuse the path_treatment enum as the return value of
read_directory_recursive. Split path_handled into two separate values,
path_excluded and path_untracked, whose meaning does not depend on the
DIR_SHOW_IGNORED flag. We don't need an extra value
path_untracked_and_excluded, as directories with both untracked and
ignored files should be listed as untracked.
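
As a rough illustration of why no combined value is needed: because the
enum values are ordered by significance, merging two results is a plain
maximum. The enum below matches the one added by this patch;
combine_treatment is a hypothetical helper that does not exist in dir.c.

```c
/* Same ordering as the enum this patch adds to dir.c. */
enum path_treatment {
	path_none = 0,
	path_recurse,
	path_excluded,
	path_untracked
};

/*
 * Hypothetical helper (not part of dir.c): merge two results.
 * The more significant value wins, so a directory that holds both
 * excluded and untracked files ends up as path_untracked.
 */
static enum path_treatment combine_treatment(enum path_treatment a,
					     enum path_treatment b)
{
	return a > b ? a : b;
}
```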

Rename path_ignored to path_none for clarity (i.e. "don't treat that path"
in contrast to "the path is ignored and should be treated according to
DIR_SHOW_IGNORED").

Replace enum directory_treatment with path_treatment. That's just another
enum with the same meaning, no need to translate back and forth.

In treat_directory, get rid of the extra read_directory_recursive call and
all the DIR_SHOW_IGNORED-specific code.

In read_directory_recursive, decide whether to dir_add_name path_excluded
or path_untracked paths based on the DIR_SHOW_IGNORED flag.
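
The listing decision can be sketched in isolation as below. should_list
is a hypothetical helper, and show_ignored stands in for testing
dir->flags & DIR_SHOW_IGNORED; the patch itself does this with a switch
inside the scan loop.

```c
/* Same ordering as the enum this patch adds to dir.c. */
enum path_treatment {
	path_none = 0,
	path_recurse,
	path_excluded,
	path_untracked
};

/*
 * Hypothetical helper (not part of dir.c): should a path with the
 * given treatment be added to the result list?
 * show_ignored is assumed to mirror the DIR_SHOW_IGNORED flag.
 */
static int should_list(enum path_treatment state, int show_ignored)
{
	switch (state) {
	case path_excluded:
		/* ignored paths are listed only when asked for */
		return show_ignored != 0;
	case path_untracked:
		/* untracked paths are listed only in the normal mode */
		return show_ignored == 0;
	default:
		/* path_none and path_recurse are never listed */
		return 0;
	}
}
```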

The return value of read_directory_recursive is the maximum path_treatment
of all files and sub-directories. In the check_only case, abort when we've
reached the most significant value (path_untracked).
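
A minimal sketch of that aggregation, assuming the per-entry results are
already available in an array (the real code obtains them from treat_path
while iterating with readdir). fold_states and its visited out-parameter
are hypothetical and exist only to make the early abort observable.

```c
#include <stddef.h>

/* Same ordering as the enum this patch adds to dir.c. */
enum path_treatment {
	path_none = 0,
	path_recurse,
	path_excluded,
	path_untracked
};

/*
 * Hypothetical helper (not part of dir.c): fold per-entry results
 * into one directory result. *visited receives the number of entries
 * examined before returning.
 */
static enum path_treatment fold_states(const enum path_treatment *states,
				       size_t n, int check_only,
				       size_t *visited)
{
	enum path_treatment dir_state = path_none;
	size_t i;

	for (i = 0; i < n; i++) {
		if (states[i] > dir_state)
			dir_state = states[i];
		/* nothing outranks path_untracked, so stop scanning */
		if (check_only && dir_state == path_untracked) {
			i++;
			break;
		}
	}
	*visited = i;
	return dir_state;
}
```

With check_only set, the scan stops at the first untracked entry; without
it, every entry is visited because each one may need to be listed.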

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c | 146 +++++++++++++++++++++++++++++++++---------------------------------
 1 file changed, 72 insertions(+), 74 deletions(-)

diff --git a/dir.c b/dir.c
index 5ae5722..5770ed4 100644
--- a/dir.c
+++ b/dir.c
@@ -17,7 +17,21 @@ struct path_simplify {
 	const char *path;
 };
 
-static int read_directory_recursive(struct dir_struct *dir, const char *path, int len,
+/*
+ * Tells read_directory_recursive how a file or directory should be treated.
+ * Values are ordered by significance, e.g. if a directory contains both
+ * excluded and untracked files, it is listed as untracked because
+ * path_untracked > path_excluded.
+ */
+enum path_treatment {
+	path_none = 0,
+	path_recurse,
+	path_excluded,
+	path_untracked
+};
+
+static enum path_treatment read_directory_recursive(struct dir_struct *dir,
+	const char *path, int len,
 	int check_only, const struct path_simplify *simplify);
 static int get_dtype(struct dirent *de, const char *path, int len);
 
@@ -958,35 +972,26 @@ static enum exist_status directory_exists_in_index(const char *dirname, int len)
  *
  *  (a) if "show_other_directories" is true, we show it as
  *      just a directory, unless "hide_empty_directories" is
- *      also true and the directory is empty, in which case
- *      we just ignore it entirely.
- *      if we are looking for ignored directories, look if it
- *      contains only ignored files to decide if it must be shown as
- *      ignored or not.
+ *      also true, in which case we need to check if it contains any
+ *      untracked and / or ignored files.
  *  (b) if it looks like a git directory, and we don't have
  *      'no_gitlinks' set we treat it as a gitlink, and show it
  *      as a directory.
  *  (c) otherwise, we recurse into it.
  */
-enum directory_treatment {
-	show_directory,
-	ignore_directory,
-	recurse_into_directory
-};
-
-static enum directory_treatment treat_directory(struct dir_struct *dir,
+static enum path_treatment treat_directory(struct dir_struct *dir,
 	const char *dirname, int len, int exclude,
 	const struct path_simplify *simplify)
 {
 	/* The "len-1" is to strip the final '/' */
 	switch (directory_exists_in_index(dirname, len-1)) {
 	case index_directory:
-		return recurse_into_directory;
+		return path_recurse;
 
 	case index_gitdir:
 		if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES)
-			return ignore_directory;
-		return show_directory;
+			return path_none;
+		return path_untracked;
 
 	case index_nonexistent:
 		if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES)
@@ -994,32 +999,17 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 		if (!(dir->flags & DIR_NO_GITLINKS)) {
 			unsigned char sha1[20];
 			if (resolve_gitlink_ref(dirname, "HEAD", sha1) == 0)
-				return show_directory;
+				return path_untracked;
 		}
-		return recurse_into_directory;
+		return path_recurse;
 	}
 
 	/* This is the "show_other_directories" case */
 
-	/*
-	 * We are looking for ignored files and our directory is not ignored,
-	 * check if it contains untracked files (i.e. is listed as untracked)
-	 */
-	if ((dir->flags & DIR_SHOW_IGNORED) && !exclude) {
-		int ignored;
-		dir->flags &= ~DIR_SHOW_IGNORED;
-		ignored = read_directory_recursive(dir, dirname, len, 1, simplify);
-		dir->flags |= DIR_SHOW_IGNORED;
-
-		if (ignored)
-			return ignore_directory;
-	}
-
 	if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES))
-		return show_directory;
-	if (!read_directory_recursive(dir, dirname, len, 1, simplify))
-		return ignore_directory;
-	return show_directory;
+		return exclude ? path_excluded : path_untracked;
+
+	return read_directory_recursive(dir, dirname, len, 1, simplify);
 }
 
 /*
@@ -1134,12 +1124,6 @@ static int get_dtype(struct dirent *de, const char *path, int len)
 	return dtype;
 }
 
-enum path_treatment {
-	path_ignored,
-	path_handled,
-	path_recurse
-};
-
 static enum path_treatment treat_one_path(struct dir_struct *dir,
 					  struct strbuf *path,
 					  const struct path_simplify *simplify,
@@ -1152,7 +1136,7 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	/* Always exclude indexed files */
 	if (dtype != DT_DIR &&
 	    cache_name_exists(path->buf, path->len, ignore_case))
-		return path_ignored;
+		return path_none;
 
 	exclude = is_excluded(dir, path->buf, &dtype);
 	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
@@ -1164,29 +1148,19 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	 * ignored files, ignore it
 	 */
 	if (exclude && !(dir->flags & DIR_SHOW_IGNORED))
-		return path_ignored;
+		return path_excluded;
 
 	switch (dtype) {
 	default:
-		return path_ignored;
+		return path_none;
 	case DT_DIR:
 		strbuf_addch(path, '/');
-		switch (treat_directory(dir, path->buf, path->len, exclude, simplify)) {
-		case show_directory:
-			break;
-		case recurse_into_directory:
-			return path_recurse;
-		case ignore_directory:
-			return path_ignored;
-		}
-		break;
+		return treat_directory(dir, path->buf, path->len, exclude,
+			simplify);
 	case DT_REG:
 	case DT_LNK:
-		if (exclude == !(dir->flags & DIR_SHOW_IGNORED))
-			return path_ignored;
-		break;
+		return exclude ? path_excluded : path_untracked;
 	}
-	return path_handled;
 }
 
 static enum path_treatment treat_path(struct dir_struct *dir,
@@ -1198,11 +1172,11 @@ static enum path_treatment treat_path(struct dir_struct *dir,
 	int dtype;
 
 	if (is_dot_or_dotdot(de->d_name) || !strcmp(de->d_name, ".git"))
-		return path_ignored;
+		return path_none;
 	strbuf_setlen(path, baselen);
 	strbuf_addstr(path, de->d_name);
 	if (simplify_away(path->buf, path->len, simplify))
-		return path_ignored;
+		return path_none;
 
 	dtype = DTYPE(de);
 	return treat_one_path(dir, path, simplify, dtype, de);
@@ -1216,14 +1190,16 @@ static enum path_treatment treat_path(struct dir_struct *dir,
  *
  * Also, we ignore the name ".git" (even if it is not a directory).
  * That likely will not change.
+ *
+ * Returns the most significant path_treatment value encountered in the scan.
  */
-static int read_directory_recursive(struct dir_struct *dir,
+static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 				    const char *base, int baselen,
 				    int check_only,
 				    const struct path_simplify *simplify)
 {
 	DIR *fdir;
-	int contents = 0;
+	enum path_treatment state, subdir_state, dir_state = path_none;
 	struct dirent *de;
 	struct strbuf path = STRBUF_INIT;
 
@@ -1234,26 +1210,48 @@ static int read_directory_recursive(struct dir_struct *dir,
 		goto out;
 
 	while ((de = readdir(fdir)) != NULL) {
-		switch (treat_path(dir, de, &path, baselen, simplify)) {
-		case path_recurse:
-			contents += read_directory_recursive(dir, path.buf,
+		/* check how the file or directory should be treated */
+		state = treat_path(dir, de, &path, baselen, simplify);
+		if (state > dir_state)
+			dir_state = state;
+
+		/* recurse into subdir if instructed by treat_path */
+		if (state == path_recurse) {
+			subdir_state = read_directory_recursive(dir, path.buf,
 				path.len, check_only, simplify);
+			if (subdir_state > dir_state)
+				dir_state = subdir_state;
+		}
+
+		if (check_only) {
+			/* abort early if maximum state has been reached */
+			if (dir_state == path_untracked)
+				break;
+			/* skip the dir_add_* part */
 			continue;
-		case path_ignored:
-			continue;
-		case path_handled:
-			break;
 		}
-		contents++;
-		if (check_only)
+
+		/* add the path to the appropriate result list */
+		switch (state) {
+		case path_excluded:
+			if (dir->flags & DIR_SHOW_IGNORED)
+				dir_add_name(dir, path.buf, path.len);
+			break;
+
+		case path_untracked:
+			if (!(dir->flags & DIR_SHOW_IGNORED))
+				dir_add_name(dir, path.buf, path.len);
 			break;
-		dir_add_name(dir, path.buf, path.len);
+
+		default:
+			break;
+		}
 	}
 	closedir(fdir);
  out:
 	strbuf_release(&path);
 
-	return contents;
+	return dir_state;
 }
 
 static int cmp_name(const void *p1, const void *p2)
@@ -1324,7 +1322,7 @@ static int treat_leading_path(struct dir_struct *dir,
 		if (simplify_away(sb.buf, sb.len, simplify))
 			break;
 		if (treat_one_path(dir, &sb, simplify,
-				   DT_DIR, NULL) == path_ignored)
+				   DT_DIR, NULL) == path_none)
 			break; /* do not recurse into it */
 		if (len <= baselen) {
 			rc = 1;
-- 
1.8.1.2.8026.g2b66448.dirty
