git.vger.kernel.org archive mirror
From: Karsten Blees <karsten.blees@gmail.com>
To: Git List <git@vger.kernel.org>
Cc: Junio C Hamano <gitster@pobox.com>,
	Erik Faye-Lund <kusmabite@gmail.com>,
	Ramkumar Ramachandra <artagnon@gmail.com>,
	Robert Zeh <robert.allan.zeh@gmail.com>,
	Duy Nguyen <pclouds@gmail.com>,
	Antoine Pelisse <apelisse@gmail.com>,
	Adam Spiers <git@adamspiers.org>
Subject: [PATCH 8/8] dir.c: git-status: avoid is_excluded checks for tracked files
Date: Mon, 18 Mar 2013 21:29:27 +0100
Message-ID: <51477927.9090500@gmail.com>
In-Reply-To: <514775FA.9080304@gmail.com>

Checking if a file is in the index is much faster (hashtable lookup) than
checking if the file is excluded (linear search over exclude patterns).

Skip is_excluded checks for tracked files: move the cache_name_exists check
from treat_file to treat_one_path and return early if the file is tracked.

This can safely be done as all other code paths also return path_ignored
for tracked files, and dir_add_ignored skips tracked files as well.
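
The ordering argument above can be illustrated with a minimal, self-contained
sketch. It is not git's code: every toy_* name is hypothetical, the "index" is
a tiny open-addressing hash set, the exclude list is a plain array matched
with POSIX fnmatch(), and the DIR_SHOW_IGNORED case is left out. The checks
counter only shows that a tracked path never reaches the pattern scan.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not git's real data structures. */
enum toy_treatment { TOY_IGNORED, TOY_HANDLED };

#define TOY_BUCKETS 64

struct toy_index {
	const char *slots[TOY_BUCKETS];	/* open addressing, NULL means empty */
	size_t count;
};

static unsigned toy_hash(const char *s)
{
	unsigned h = 5381;
	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h % TOY_BUCKETS;
}

static void toy_index_add(struct toy_index *idx, const char *path)
{
	unsigned i = toy_hash(path);
	/* Keep at least one slot empty so lookups always terminate. */
	assert(idx->count < TOY_BUCKETS - 1);
	while (idx->slots[i])
		i = (i + 1) % TOY_BUCKETS;
	idx->slots[i] = path;
	idx->count++;
}

/* Hash lookup: cost does not depend on the number of exclude patterns. */
static int toy_index_has(const struct toy_index *idx, const char *path)
{
	unsigned i = toy_hash(path);
	while (idx->slots[i]) {
		if (!strcmp(idx->slots[i], path))
			return 1;
		i = (i + 1) % TOY_BUCKETS;
	}
	return 0;
}

/* Linear scan: one fnmatch() call per pattern until the first match. */
static int toy_is_excluded(const char **patterns, size_t n,
			   const char *path, size_t *checks)
{
	size_t i;
	for (i = 0; i < n; i++) {
		(*checks)++;
		if (fnmatch(patterns[i], path, 0) == 0)
			return 1;
	}
	return 0;
}

static enum toy_treatment toy_treat_path(const struct toy_index *idx,
					 const char **patterns, size_t n,
					 const char *path, size_t *checks)
{
	/* Tracked file: return before any pattern is examined. */
	if (toy_index_has(idx, path))
		return TOY_IGNORED;
	if (toy_is_excluded(patterns, n, path, checks))
		return TOY_IGNORED;
	return TOY_HANDLED;
}
```

In this sketch a tracked path and an excluded path both end up as
TOY_IGNORED, which mirrors why the reordering does not change the result:
only the amount of work done before the answer differs.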

There's just one line left in treat_file, so move this to treat_one_path
as well.
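
For reference, the one remaining line is the comparison
"exclude == !(dir->flags & DIR_SHOW_IGNORED)". A small sketch of what it
decides, assuming exclude is exactly 0 or 1 (toy_skip_file and
TOY_SHOW_IGNORED are hypothetical names used only here):

```c
#include <assert.h>

/* Hypothetical stand-in for DIR_SHOW_IGNORED; the real value may differ. */
#define TOY_SHOW_IGNORED (1u << 0)

/*
 * Returns 1 if the file should be skipped, 0 if it should be listed.
 *
 *   exclude | show ignored | skip
 *   --------+--------------+-----
 *      0    |      no      |  0    (untracked, not ignored: list it)
 *      1    |      no      |  1    (ignored: skip it)
 *      0    |      yes     |  1    (not ignored: skip it)
 *      1    |      yes     |  0    (ignored: list it)
 */
static int toy_skip_file(int exclude, unsigned flags)
{
	/* The comparison only works if exclude is a strict boolean. */
	assert(exclude == 0 || exclude == 1);
	return exclude == !(flags & TOY_SHOW_IGNORED);
}
```

In other words, the file is kept only when "is excluded" and "show ignored"
agree.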

Here's some performance data for git-status from the linux and WebKit
repos (best of 10 runs on Debian Linux on an SSD, core.preloadIndex=true;
gain is before / after):

       |    status      | status --ignored
       | linux | WebKit | linux | WebKit
-------+-------+--------+-------+---------
before | 0.218 |  1.583 | 0.321 |  2.579
after  | 0.156 |  0.988 | 0.202 |  1.279
gain   | 1.397 |  1.602 | 1.589 |  2.016
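
The gain row is consistent with dividing each "before" value by the matching
"after" value, e.g. 0.218 / 0.156 = 1.397. A minimal sketch of that check
(toy_gain and toy_gain_matches are hypothetical helpers, and the 0.0005
tolerance assumes the reported values were rounded to three decimals):

```c
#include <assert.h>

/* Speedup factor: how many times faster "after" is than "before". */
static double toy_gain(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}

/* True if the reported gain equals before/after up to 3-decimal rounding. */
static int toy_gain_matches(double before, double after, double reported)
{
	double diff = toy_gain(before, after) - reported;
	return diff < 0.0005 && diff > -0.0005;
}
```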

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c | 38 +++++++++++---------------------------
 1 file changed, 11 insertions(+), 27 deletions(-)

diff --git a/dir.c b/dir.c
index 086a169..c159000 100644
--- a/dir.c
+++ b/dir.c
@@ -1026,28 +1026,6 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 }
 
 /*
- * Decide what to do when we find a file while traversing the
- * filesystem. Mostly two cases:
- *
- *  1. We are looking for ignored files
- *   (a) File is ignored, include it
- *   (b) File is in ignored path, include it
- *   (c) File is not ignored, exclude it
- *
- *  2. Other scenarios, include the file if not excluded
- *
- * Return 1 for exclude, 0 for include.
- */
-static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude)
-{
-	/* Always exclude indexed files */
-	if (index_name_exists(&the_index, path->buf, path->len, ignore_case))
-		return 1;
-
-	return exclude == !(dir->flags & DIR_SHOW_IGNORED);
-}
-
-/*
  * This is an inexact early pruning of any recursive directory
  * reading - if the path cannot possibly be in the pathspec,
  * return true, and we'll skip it early.
@@ -1170,7 +1148,16 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 					  const struct path_simplify *simplify,
 					  int dtype, struct dirent *de)
 {
-	int exclude = is_excluded(dir, path->buf, &dtype);
+	int exclude;
+	if (dtype == DT_UNKNOWN)
+		dtype = get_dtype(de, path->buf, path->len);
+
+	/* Always exclude indexed files */
+	if (dtype != DT_DIR &&
+	    cache_name_exists(path->buf, path->len, ignore_case))
+		return path_ignored;
+
+	exclude = is_excluded(dir, path->buf, &dtype);
 	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
 	    && exclude_matches_pathspec(path->buf, path->len, simplify))
 		dir_add_ignored(dir, path->buf, path->len);
@@ -1182,9 +1169,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	if (exclude && !(dir->flags & DIR_SHOW_IGNORED))
 		return path_ignored;
 
-	if (dtype == DT_UNKNOWN)
-		dtype = get_dtype(de, path->buf, path->len);
-
 	switch (dtype) {
 	default:
 		return path_ignored;
@@ -1201,7 +1185,7 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 		break;
 	case DT_REG:
 	case DT_LNK:
-		if (treat_file(dir, path, exclude))
+		if (exclude == !(dir->flags & DIR_SHOW_IGNORED))
 			return path_ignored;
 		break;
 	}
-- 
1.8.1.2.8021.g7e51819
