* [PATCH] read_directory: avoid invoking exclude machinery on tracked files @ 2013-02-15 14:17 Nguyễn Thái Ngọc Duy 2013-02-15 16:52 ` Junio C Hamano 2013-02-16 7:17 ` [PATCH v2] " Nguyễn Thái Ngọc Duy 0 siblings, 2 replies; 12+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2013-02-15 14:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag, Nguyễn Thái Ngọc Duy read_directory() (and its friendly wrapper fill_directory) collects untracked/ignored files by traversing through the whole worktree (*), feeding every entry to treat_one_path(), where each entry is checked against .gitignore patterns. One may see that tracked files can't be excluded and we do not need to run them through exclude machinery. On repos where there are many .gitignore patterns and/or a lot of tracked files, this unnecessary processing can become expensive. This patch avoids it mostly for normal cases. Directories are still processed as before. DIR_SHOW_IGNORED and DIR_COLLECT_IGNORED are not normally used unless some options are given (e.g. "checkout --overwrite-ignore", "add -f"...) so people still need to pay penalty in some cases, just not as often as before. git status | webkit linux-2.6 libreoffice-core gentoo-x86 -------------+---------------------------------------------- before | 1.159s 0.226s 0.415s 0.597s after | 0.778s 0.176s 0.266s 0.556s nr. patterns | 89 376 19 0 nr. tracked | 182k 40k 63k 101k (*) Not completely true. read_directory may skip recursing into a directory if it's entirely excluded and DIR_SHOW_OTHER_DIRECTORIES is not set. Tracked-down-by: Karsten Blees <karsten.blees@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- For reference: http://thread.gmane.org/gmane.comp.version-control.git/215820/focus=216195 dir.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/dir.c b/dir.c index 57394e4..bdff256 100644 --- a/dir.c +++ b/dir.c @@ -1244,7 +1244,19 @@ static enum path_treatment treat_one_path(struct dir_struct *dir, const struct path_simplify *simplify, int dtype, struct dirent *de) { - int exclude = is_excluded(dir, path->buf, &dtype); + int exclude; + + if (dtype == DT_UNKNOWN) + dtype = get_dtype(de, path->buf, path->len); + + if (!(dir->flags & DIR_SHOW_IGNORED) && + !(dir->flags & DIR_COLLECT_IGNORED) && + dtype != DT_DIR && + cache_name_exists(path->buf, path->len, ignore_case)) + return path_ignored; + + exclude = is_excluded(dir, path->buf, &dtype); + if (exclude && (dir->flags & DIR_COLLECT_IGNORED) && exclude_matches_pathspec(path->buf, path->len, simplify)) dir_add_ignored(dir, path->buf, path->len); @@ -1256,9 +1268,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir, if (exclude && !(dir->flags & DIR_SHOW_IGNORED)) return path_ignored; - if (dtype == DT_UNKNOWN) - dtype = get_dtype(de, path->buf, path->len); - switch (dtype) { default: return path_ignored; -- 1.8.1.2.536.gf441e6d ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] read_directory: avoid invoking exclude machinery on tracked files 2013-02-15 14:17 [PATCH] read_directory: avoid invoking exclude machinery on tracked files Nguyễn Thái Ngọc Duy @ 2013-02-15 16:52 ` Junio C Hamano 2013-02-15 18:30 ` Duy Nguyen 2013-02-16 7:17 ` [PATCH v2] " Nguyễn Thái Ngọc Duy 1 sibling, 1 reply; 12+ messages in thread From: Junio C Hamano @ 2013-02-15 16:52 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy Cc: git, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > read_directory() (and its friendly wrapper fill_directory) collects > untracked/ignored files by traversing through the whole worktree (*), > feeding every entry to treat_one_path(), where each entry is checked > against .gitignore patterns. > > One may see that tracked files can't be excluded and we do not need to > run them through exclude machinery. On repos where there are many > .gitignore patterns and/or a lot of tracked files, this unnecessary > processing can become expensive. > > This patch avoids it mostly for normal cases. Directories are still > processed as before. DIR_SHOW_IGNORED and DIR_COLLECT_IGNORED are not > normally used unless some options are given (e.g. "checkout > --overwrite-ignore", "add -f"...) so people still need to pay penalty > in some cases, just not as often as before. > > git status | webkit linux-2.6 libreoffice-core gentoo-x86 > -------------+---------------------------------------------- > before | 1.159s 0.226s 0.415s 0.597s > after | 0.778s 0.176s 0.266s 0.556s > nr. patterns | 89 376 19 0 > nr. tracked | 182k 40k 63k 101k > > (*) Not completely true. read_directory may skip recursing into a > directory if it's entirely excluded and DIR_SHOW_OTHER_DIRECTORIES > is not set. > > Tracked-down-by: Karsten Blees <karsten.blees@gmail.com> > Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> > --- > For reference: > http://thread.gmane.org/gmane.comp.version-control.git/215820/focus=216195 > > dir.c | 17 +++++++++++++---- > 1 file changed, 13 insertions(+), 4 deletions(-) > > diff --git a/dir.c b/dir.c > index 57394e4..bdff256 100644 > --- a/dir.c > +++ b/dir.c > @@ -1244,7 +1244,19 @@ static enum path_treatment treat_one_path(struct dir_struct *dir, > const struct path_simplify *simplify, > int dtype, struct dirent *de) > { > - int exclude = is_excluded(dir, path->buf, &dtype); > + int exclude; > + > + if (dtype == DT_UNKNOWN) > + dtype = get_dtype(de, path->buf, path->len); > + > + if (!(dir->flags & DIR_SHOW_IGNORED) && > + !(dir->flags & DIR_COLLECT_IGNORED) && > + dtype != DT_DIR && > + cache_name_exists(path->buf, path->len, ignore_case)) > + return path_ignored; > + > + exclude = is_excluded(dir, path->buf, &dtype); > + > if (exclude && (dir->flags & DIR_COLLECT_IGNORED) > && exclude_matches_pathspec(path->buf, path->len, simplify)) > dir_add_ignored(dir, path->buf, path->len); Interesting. In the current code, we always check if a path is excluded, and when dealing with DT_REG/DT_LNK, we call treat_file(): * When such a path is excluded, treat_file() returns true when we are not showing ignored directories. This causes treat_one_path() to return path_ignored, so for excluded DT_REG/DT_LNK paths when no DIR_*_IGNORED is in effect, this change is a correct optimization. * When such a path is not excluded, on the ther hand, and when we are not showing ignored directories, treat_file() just returns the value of exclude_file, which is initialized to false and is not changed in the function. This causes treat_one_path() to return path_handled. However, the new code returns path_ignored in this case. What guarantees that this change is regression free? I do not seem to be able to find anything that checks if the path is already known to the index in the original code for the case you special cased (i.e. DIR_*_IGNORED is not set and dtype is not DT_DIR). Do all the callers that reach this function in their callgraph, when they get path_ignored for a path in the index, behave as if the difference between path_ignored and path_handled does not matter? > @@ -1256,9 +1268,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir, > if (exclude && !(dir->flags & DIR_SHOW_IGNORED)) > return path_ignored; > > - if (dtype == DT_UNKNOWN) > - dtype = get_dtype(de, path->buf, path->len); > - > switch (dtype) { > default: > return path_ignored; ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] read_directory: avoid invoking exclude machinery on tracked files 2013-02-15 16:52 ` Junio C Hamano @ 2013-02-15 18:30 ` Duy Nguyen 2013-02-15 19:32 ` Junio C Hamano 0 siblings, 1 reply; 12+ messages in thread From: Duy Nguyen @ 2013-02-15 18:30 UTC (permalink / raw) To: Junio C Hamano Cc: git, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag On Fri, Feb 15, 2013 at 11:52 PM, Junio C Hamano <gitster@pobox.com> wrote: > In the current code, we always check if a path is excluded, and when > dealing with DT_REG/DT_LNK, we call treat_file(): > > * When such a path is excluded, treat_file() returns true when we > are not showing ignored directories. This causes treat_one_path() > to return path_ignored, so for excluded DT_REG/DT_LNK paths when > no DIR_*_IGNORED is in effect, this change is a correct > optimization. > > * When such a path is not excluded, on the ther hand, and when we > are not showing ignored directories, treat_file() just returns > the value of exclude_file, which is initialized to false and is > not changed in the function. This causes treat_one_path() to > return path_handled. However, the new code returns path_ignored > in this case. > > What guarantees that this change is regression free? If you consider read_directory_recursive alone, there is a regression. The return value of r_d_r depends on path_handled/path_ignored. With this patch, the return value will be different. The return value is only used by treat_directory() in two cases: - when DIR_SHOW_IGNORED is set, which disables the optimization so no regression - when DIR_HIDE_EMPTY_DIRECTORIES is _not_ set (and neither is DIR_SHOW_IGNORED), the optimization is still on and different r_d_r's return value would lead to different behavior. However I don't think it can happen. treat_directory checks if the given directory can be found in index. In that case the neither r_d_r calls in treat_directory is reachable. If the given directory cannot be found in the index, the second r_d_r is reachable. But then the cache_name_exists() in the patch should always be false (parent not in index, children cannot), so the optimization is off and r_d_r returns correctly. It's a bit tricky. I'm not sure if I miss anything else. > I do not seem > to be able to find anything that checks if the path is already known > to the index in the original code for the case you special cased > (i.e. DIR_*_IGNORED is not set and dtype is not DT_DIR). Do all the > callers that reach this function in their callgraph, when they get > path_ignored for a path in the index, behave as if the difference > between path_ignored and path_handled does not matter? -- Duy ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] read_directory: avoid invoking exclude machinery on tracked files 2013-02-15 18:30 ` Duy Nguyen @ 2013-02-15 19:32 ` Junio C Hamano 2013-02-16 3:31 ` Duy Nguyen 2013-02-18 16:42 ` Karsten Blees 0 siblings, 2 replies; 12+ messages in thread From: Junio C Hamano @ 2013-02-15 19:32 UTC (permalink / raw) To: Duy Nguyen Cc: git, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag Duy Nguyen <pclouds@gmail.com> writes: > On Fri, Feb 15, 2013 at 11:52 PM, Junio C Hamano <gitster@pobox.com> wrote: >> In the current code, we always check if a path is excluded, and when >> dealing with DT_REG/DT_LNK, we call treat_file(): >> >> * When such a path is excluded, treat_file() returns true when we >> are not showing ignored directories. This causes treat_one_path() >> to return path_ignored, so for excluded DT_REG/DT_LNK paths when >> no DIR_*_IGNORED is in effect, this change is a correct >> optimization. >> >> * When such a path is not excluded, on the ther hand, and when we >> are not showing ignored directories, treat_file() just returns >> the value of exclude_file, which is initialized to false and is >> not changed in the function. This causes treat_one_path() to >> return path_handled. However, the new code returns path_ignored >> in this case. >> >> What guarantees that this change is regression free? > > If you consider read_directory_recursive alone, there is a regression. > The return value of r_d_r depends on path_handled/path_ignored. With > this patch, the return value will be different. That is exactly what was missing from the proposed log message, and made me ask "Do all the callers that reach this function in their callgraph, when they get path_ignored for a path in the index, behave as if the difference between path_ignored and path_handled does not matter?" Your answer seems to be - r-d-r returns 'how many paths in this directory match the criteria we are looking for', unless check_only is true. Now in some cases we return path_ignored not path_handled, so we may return a number that is greater than we used to return. - treat_directory, the only user of that return value, cares if r-d-r returned 0 or non-zero; and - As long as we keep returning 0 from r-d-r in cases we used to return 0 and non-zero in cases we used to return non-zero, exact number does not matter. Overall we get the same result. I think all of the above is true, but I have not convinced myself that r-d-r with the new code never returns 0 when we used to return non-zero. > ... > It's a bit tricky. I'm not sure if I miss anything else. Hrm... ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] read_directory: avoid invoking exclude machinery on tracked files 2013-02-15 19:32 ` Junio C Hamano @ 2013-02-16 3:31 ` Duy Nguyen 2013-02-18 16:42 ` Karsten Blees 1 sibling, 0 replies; 12+ messages in thread From: Duy Nguyen @ 2013-02-16 3:31 UTC (permalink / raw) To: Junio C Hamano Cc: git, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag On Sat, Feb 16, 2013 at 2:32 AM, Junio C Hamano <gitster@pobox.com> wrote: >> If you consider read_directory_recursive alone, there is a regression. >> The return value of r_d_r depends on path_handled/path_ignored. With >> this patch, the return value will be different. > > That is exactly what was missing from the proposed log message, and > made me ask "Do all the callers that reach this function in their > callgraph, when they get path_ignored for a path in the index, > behave as if the difference between path_ignored and path_handled > does not matter?" I'll add it the the log message. Although I'm thinking some restructuring to separate tracked file handling from the rest may make it clearer (and less error prone in future). -- Duy ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] read_directory: avoid invoking exclude machinery on tracked files 2013-02-15 19:32 ` Junio C Hamano 2013-02-16 3:31 ` Duy Nguyen @ 2013-02-18 16:42 ` Karsten Blees 1 sibling, 0 replies; 12+ messages in thread From: Karsten Blees @ 2013-02-18 16:42 UTC (permalink / raw) To: Junio C Hamano Cc: Duy Nguyen, git, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag Am 15.02.2013 20:32, schrieb Junio C Hamano: > Duy Nguyen <pclouds@gmail.com> writes: > >> On Fri, Feb 15, 2013 at 11:52 PM, Junio C Hamano <gitster@pobox.com> wrote: >>> In the current code, we always check if a path is excluded, and when >>> dealing with DT_REG/DT_LNK, we call treat_file(): >>> >>> * When such a path is excluded, treat_file() returns true when we >>> are not showing ignored directories. This causes treat_one_path() >>> to return path_ignored, so for excluded DT_REG/DT_LNK paths when >>> no DIR_*_IGNORED is in effect, this change is a correct >>> optimization. >>> >>> * When such a path is not excluded, on the ther hand, and when we >>> are not showing ignored directories, treat_file() just returns >>> the value of exclude_file, which is initialized to false and is >>> not changed in the function. This causes treat_one_path() to >>> return path_handled. However, the new code returns path_ignored >>> in this case. >>> >>> What guarantees that this change is regression free? >> >> If you consider read_directory_recursive alone, there is a regression. >> The return value of r_d_r depends on path_handled/path_ignored. With >> this patch, the return value will be different. > > That is exactly what was missing from the proposed log message, and > made me ask "Do all the callers that reach this function in their > callgraph, when they get path_ignored for a path in the index, > behave as if the difference between path_ignored and path_handled > does not matter?" Your answer seems to be > > - r-d-r returns 'how many paths in this directory match the > criteria we are looking for', unless check_only is true. Now in > some cases we return path_ignored not path_handled, so we may > return a number that is greater than we used to return. > > - treat_directory, the only user of that return value, cares if > r-d-r returned 0 or non-zero; and > > - As long as we keep returning 0 from r-d-r in cases we used to > return 0 and non-zero in cases we used to return non-zero, exact > number does not matter. Overall we get the same result. > > I think all of the above is true, but I have not convinced myself > that r-d-r with the new code never returns 0 when we used to return > non-zero. > treat_directory calls read_directory_recursive in tow cases: 1.) The directory is not in the index. ---8<--- switch (directory_exists_in_index(dirname, len-1)) { case index_nonexistent: if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES) break; } ... ---8<--- The directory is not in the index if there are no tracked files in the directory. I.e. cache_name_exists will always be false in this case, so the change won't affect the result of r_d_r. 2.) The directory is in the index but is ignored. ---8<--- switch (directory_exists_in_index(dirname, len-1)) { case index_directory: if ((dir->flags & DIR_SHOW_OTHER_DIRECTORIES) && exclude) break; } if ((dir->flags & DIR_SHOW_IGNORED) && !exclude) { ... } if (!(dir->flags & DIR_SHOW_IGNORED) && !(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) return show_directory; if (!read_directory_recursive(dir, dirname, len, 1, simplify)) return ignore_directory; return show_directory; ---8<--- With exclude==true, only one r_d_r call is reachable, and only if either DIR_SHOW_IGNORED or DIR_HIDE_EMPTY_DIRECTORIES is set. 2a) DIR_SHOW_IGNORED is set: the patch already checks !(dir->flags & DIR_SHOW_IGNORED), so the result of r_d_r is not affected. 2b) DIR_HIDE_EMPTY_DIRECTORIES is set and DIR_SHOW_IGNORED is not set: the directory is already ignored, so all files in the directory should be ignored, too. It doesn't matter whether treat_one_path returns path_ignored because of the excluded() check or cache_name_exists(). Therefore, I think the patch (v0) is regression-free. As a side note, I'm quite confused why we would ever want to evaluate .gitignore patterns on tracked files at all, as gitignore(5) clearly states "Files already tracked by git are not affected". There is 'git-ls-files --cached --ignored', although this doesn't seem to process .gitignore files but expects exclude patterns on the command line... ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files 2013-02-15 14:17 [PATCH] read_directory: avoid invoking exclude machinery on tracked files Nguyễn Thái Ngọc Duy 2013-02-15 16:52 ` Junio C Hamano @ 2013-02-16 7:17 ` Nguyễn Thái Ngọc Duy 2013-02-16 18:11 ` Pete Wyckoff ` (2 more replies) 1 sibling, 3 replies; 12+ messages in thread From: Nguyễn Thái Ngọc Duy @ 2013-02-16 7:17 UTC (permalink / raw) To: git Cc: Junio C Hamano, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag, Nguyễn Thái Ngọc Duy read_directory() (and its friendly wrapper fill_directory) collects untracked/ignored files by traversing through the whole worktree, feeding every entry to treat_one_path(), where each entry is checked against .gitignore patterns. One may see that tracked files can't be excluded and we do not need to run them through exclude machinery. On repos where there are many .gitignore patterns and/or a lot of tracked files, this unnecessary processing can become expensive. This patch avoids it mostly for normal cases. Directories are still processed as before. DIR_SHOW_IGNORED and DIR_COLLECT_IGNORED are not normally used unless some options are given (e.g. "checkout --overwrite-ignore", "add -f"...) treat_one_path's behavior changes when taking this shortcut. With current code, when a non-directory path is not excluded, treat_one_path calls treat_file, which returns the initial value of exclude_file and causes treat_one_path to return path_handled. With this patch, on the same conditions, treat_one_path returns path_ignored. read_directory_recursive() cares about this difference. Check out the snippet: while (...) { switch (treat_path(...)) { case path_ignored: continue; case path_handled: break; } contents++; if (check_only) break; dir_add_name(dir, path.buf, path.len); } If path_handled is returned, contents goes up. And if check_only is true, the loop could be broken early. These will not happen when treat_one_path (and its wrapper treat_path) returns path_ignored. dir_add_name internally does a cache_name_exists() check so it makes no difference. To avoid this behavior change, treat_one_path is instructed to skip the optimization when check_only or contents is used. Finally some numbers (best of 20 runs) that shows why it's worth all the hassle: git status | webkit linux-2.6 libreoffice-core gentoo-x86 -------------+---------------------------------------------- before | 1.097s 0.208s 0.399s 0.539s after | 0.736s 0.159s 0.248s 0.501s nr. patterns | 89 376 19 0 nr. tracked | 182k 40k 63k 101k Tracked-down-by: Karsten Blees <karsten.blees@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> --- Instead of relying on the surrounding code happening to not trigger the behavior change in treat_one_path, this round ensures such triggers will disable the optimization and fall back to normal code path. There are no big differences in measured numbers, which indicate incorrect triggers do not happen, at least in my tests. dir.c | 79 ++++++++++++++++++++++++++++++++++++++++++++----------------------- 1 file changed, 52 insertions(+), 27 deletions(-) diff --git a/dir.c b/dir.c index 57394e4..a5fe0a0 100644 --- a/dir.c +++ b/dir.c @@ -17,8 +17,11 @@ struct path_simplify { const char *path; }; -static int read_directory_recursive(struct dir_struct *dir, const char *path, int len, - int check_only, const struct path_simplify *simplify); +static void read_directory_recursive(struct dir_struct *dir, + const char *path, int len, + int check_only, + const struct path_simplify *simplify, + int *contents); static int get_dtype(struct dirent *de, const char *path, int len); /* helper string functions with support for the ignore_case flag */ @@ -1034,6 +1037,7 @@ static enum directory_treatment treat_directory(struct dir_struct *dir, const char *dirname, int len, int exclude, const struct path_simplify *simplify) { + int contents = 0; /* The "len-1" is to strip the final '/' */ switch (directory_exists_in_index(dirname, len-1)) { case index_directory: @@ -1065,19 +1069,19 @@ static enum directory_treatment treat_directory(struct dir_struct *dir, * check if it contains only ignored files */ if ((dir->flags & DIR_SHOW_IGNORED) && !exclude) { - int ignored; dir->flags &= ~DIR_SHOW_IGNORED; dir->flags |= DIR_HIDE_EMPTY_DIRECTORIES; - ignored = read_directory_recursive(dir, dirname, len, 1, simplify); + read_directory_recursive(dir, dirname, len, 1, simplify, &contents); dir->flags &= ~DIR_HIDE_EMPTY_DIRECTORIES; dir->flags |= DIR_SHOW_IGNORED; - return ignored ? ignore_directory : show_directory; + return contents ? ignore_directory : show_directory; } if (!(dir->flags & DIR_SHOW_IGNORED) && !(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) return show_directory; - if (!read_directory_recursive(dir, dirname, len, 1, simplify)) + read_directory_recursive(dir, dirname, len, 1, simplify, &contents); + if (!contents) return ignore_directory; return show_directory; } @@ -1242,9 +1246,23 @@ enum path_treatment { static enum path_treatment treat_one_path(struct dir_struct *dir, struct strbuf *path, const struct path_simplify *simplify, - int dtype, struct dirent *de) + int dtype, struct dirent *de, + int exclude_shortcut_ok) { - int exclude = is_excluded(dir, path->buf, &dtype); + int exclude; + + if (dtype == DT_UNKNOWN) + dtype = get_dtype(de, path->buf, path->len); + + if (exclude_shortcut_ok && + !(dir->flags & DIR_SHOW_IGNORED) && + !(dir->flags & DIR_COLLECT_IGNORED) && + dtype != DT_DIR && + cache_name_exists(path->buf, path->len, ignore_case)) + return path_ignored; + + exclude = is_excluded(dir, path->buf, &dtype); + if (exclude && (dir->flags & DIR_COLLECT_IGNORED) && exclude_matches_pathspec(path->buf, path->len, simplify)) dir_add_ignored(dir, path->buf, path->len); @@ -1256,9 +1274,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir, if (exclude && !(dir->flags & DIR_SHOW_IGNORED)) return path_ignored; - if (dtype == DT_UNKNOWN) - dtype = get_dtype(de, path->buf, path->len); - switch (dtype) { default: return path_ignored; @@ -1290,7 +1305,8 @@ static enum path_treatment treat_path(struct dir_struct *dir, struct dirent *de, struct strbuf *path, int baselen, - const struct path_simplify *simplify) + const struct path_simplify *simplify, + int exclude_shortcut_ok) { int dtype; @@ -1302,7 +1318,7 @@ static enum path_treatment treat_path(struct dir_struct *dir, return path_ignored; dtype = DTYPE(de); - return treat_one_path(dir, path, simplify, dtype, de); + return treat_one_path(dir, path, simplify, dtype, de, exclude_shortcut_ok); } /* @@ -1314,13 +1330,13 @@ static enum path_treatment treat_path(struct dir_struct *dir, * Also, we ignore the name ".git" (even if it is not a directory). * That likely will not change. */ -static int read_directory_recursive(struct dir_struct *dir, - const char *base, int baselen, - int check_only, - const struct path_simplify *simplify) +static void read_directory_recursive(struct dir_struct *dir, + const char *base, int baselen, + int check_only, + const struct path_simplify *simplify, + int *contents) { DIR *fdir; - int contents = 0; struct dirent *de; struct strbuf path = STRBUF_INIT; @@ -1331,18 +1347,29 @@ static int read_directory_recursive(struct dir_struct *dir, goto out; while ((de = readdir(fdir)) != NULL) { - switch (treat_path(dir, de, &path, baselen, simplify)) { + switch (treat_path(dir, de, &path, baselen, + simplify, + !check_only && !contents)) { case path_recurse: - contents += read_directory_recursive(dir, path.buf, - path.len, 0, - simplify); + read_directory_recursive(dir, path.buf, + path.len, 0, + simplify, + contents); continue; case path_ignored: continue; case path_handled: break; } - contents++; + /* + * Update the last argument to treat_path if anything + * else is done after this point. This is because if + * treat_path's exclude_shortcut_ok is true, it may + * incorrectly return path_ignored (and never reaches + * this part) instead of path_handled. + */ + if (contents) + (*contents)++; if (check_only) break; dir_add_name(dir, path.buf, path.len); @@ -1350,8 +1377,6 @@ static int read_directory_recursive(struct dir_struct *dir, closedir(fdir); out: strbuf_release(&path); - - return contents; } static int cmp_name(const void *p1, const void *p2) @@ -1420,7 +1445,7 @@ static int treat_leading_path(struct dir_struct *dir, if (simplify_away(sb.buf, sb.len, simplify)) break; if (treat_one_path(dir, &sb, simplify, - DT_DIR, NULL) == path_ignored) + DT_DIR, NULL, 0) == path_ignored) break; /* do not recurse into it */ if (len <= baselen) { rc = 1; @@ -1440,7 +1465,7 @@ int read_directory(struct dir_struct *dir, const char *path, int len, const char simplify = create_simplify(pathspec); if (!len || treat_leading_path(dir, path, len, simplify)) - read_directory_recursive(dir, path, len, 0, simplify); + read_directory_recursive(dir, path, len, 0, simplify, NULL); free_simplify(simplify); qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name); qsort(dir->ignored, dir->ignored_nr, sizeof(struct dir_entry *), cmp_name); -- 1.8.1.2.536.gf441e6d ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files 2013-02-16 7:17 ` [PATCH v2] " Nguyễn Thái Ngọc Duy @ 2013-02-16 18:11 ` Pete Wyckoff 2013-02-17 4:39 ` Duy Nguyen 2013-02-17 23:18 ` Junio C Hamano 2013-02-25 22:01 ` [PATCH/RFC] dir.c: Make git-status --ignored even more consistent Karsten Blees 2 siblings, 1 reply; 12+ messages in thread From: Pete Wyckoff @ 2013-02-16 18:11 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy Cc: git, Junio C Hamano, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag pclouds@gmail.com wrote on Sat, 16 Feb 2013 14:17 +0700: > Finally some numbers (best of 20 runs) that shows why it's worth all > the hassle: > > git status | webkit linux-2.6 libreoffice-core gentoo-x86 > -------------+---------------------------------------------- > before | 1.097s 0.208s 0.399s 0.539s > after | 0.736s 0.159s 0.248s 0.501s > nr. patterns | 89 376 19 0 > nr. tracked | 182k 40k 63k 101k Thanks for this work. I repeated some of the tests across NFS, where I'd expect to see bigger differences. Best of 20 values reported in "min ...". webkit Stock min 9.61 avg 11.61 +/- 1.35 max 14.26 Duy min 6.91 avg 7.67 +/- 0.46 max 8.71 linux Stock min 2.27 avg 3.16 +/- 0.56 max 4.49 Duy min 2.04 avg 3.12 +/- 0.69 max 4.87 libreoffice-core Stock min 4.56 avg 5.79 +/- 0.79 max 7.08 Duy min 3.96 avg 5.25 +/- 0.95 max 6.95 Similar 30%-ish speedup on webkit. And an absolute gain of 2.7 seconds is quite nice. -- Pete ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files 2013-02-16 18:11 ` Pete Wyckoff @ 2013-02-17 4:39 ` Duy Nguyen 2013-02-17 15:49 ` Pete Wyckoff 0 siblings, 1 reply; 12+ messages in thread From: Duy Nguyen @ 2013-02-17 4:39 UTC (permalink / raw) To: Pete Wyckoff Cc: git, Junio C Hamano, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag On Sun, Feb 17, 2013 at 1:11 AM, Pete Wyckoff <pw@padd.com> wrote: > pclouds@gmail.com wrote on Sat, 16 Feb 2013 14:17 +0700: >> Finally some numbers (best of 20 runs) that shows why it's worth all >> the hassle: >> >> git status | webkit linux-2.6 libreoffice-core gentoo-x86 >> -------------+---------------------------------------------- >> before | 1.097s 0.208s 0.399s 0.539s >> after | 0.736s 0.159s 0.248s 0.501s >> nr. patterns | 89 376 19 0 >> nr. tracked | 182k 40k 63k 101k > > Thanks for this work. I repeated some of the tests across NFS, > where I'd expect to see bigger differences. This is about reducing CPU processing time, not I/O time. So no bigger differences is expected. I/O time can be reduced with inotify, or fam in nfs case because inotify does not support nfs. > Best of 20 values reported in "min ...". > > webkit > Stock min 9.61 avg 11.61 +/- 1.35 max 14.26 > Duy min 6.91 avg 7.67 +/- 0.46 max 8.71 > > linux > Stock min 2.27 avg 3.16 +/- 0.56 max 4.49 > Duy min 2.04 avg 3.12 +/- 0.69 max 4.87 > > libreoffice-core > Stock min 4.56 avg 5.79 +/- 0.79 max 7.08 > Duy min 3.96 avg 5.25 +/- 0.95 max 6.95 > > Similar 30%-ish speedup on webkit. And an absolute gain > of 2.7 seconds is quite nice. > > -- Pete -- Duy ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files 2013-02-17 4:39 ` Duy Nguyen @ 2013-02-17 15:49 ` Pete Wyckoff 0 siblings, 0 replies; 12+ messages in thread From: Pete Wyckoff @ 2013-02-17 15:49 UTC (permalink / raw) To: Duy Nguyen Cc: git, Junio C Hamano, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag pclouds@gmail.com wrote on Sun, 17 Feb 2013 11:39 +0700: > On Sun, Feb 17, 2013 at 1:11 AM, Pete Wyckoff <pw@padd.com> wrote: > > pclouds@gmail.com wrote on Sat, 16 Feb 2013 14:17 +0700: > >> Finally some numbers (best of 20 runs) that shows why it's worth all > >> the hassle: > >> > >> git status | webkit linux-2.6 libreoffice-core gentoo-x86 > >> -------------+---------------------------------------------- > >> before | 1.097s 0.208s 0.399s 0.539s > >> after | 0.736s 0.159s 0.248s 0.501s > >> nr. patterns | 89 376 19 0 > >> nr. tracked | 182k 40k 63k 101k > > > > Thanks for this work. I repeated some of the tests across NFS, > > where I'd expect to see bigger differences. > > This is about reducing CPU processing time, not I/O time. So no bigger > differences is expected. I/O time can be reduced with inotify, or fam > in nfs case because inotify does not support nfs. Numbers from the last mail were core.preloadindex=true. Here's "time" output from average runs: stock = 0m2.28s user 0m4.18s sys 0m11.28s elapsed 57.39 %CPU duy = 0m1.25s user 0m4.43s sys 0m7.45s elapsed 76.41 %CPU With this huge repo, preloadindex may be stressing directory cache behavior on the NFS server or client. Your patch helps both CPU and wait time by avoiding the 6000-odd open() of non-existent .gitignore. With core.preloadindex=false, it's a 1 sec speedup, all from CPU: stock = 0m2.18s user 0m1.59s sys 0m7.78s elapsed 48.45 %CPU duy = 0m1.17s user 0m1.63s sys 0m6.91s elapsed 40.59 %CPU -- Pete ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files 2013-02-16 7:17 ` [PATCH v2] " Nguyễn Thái Ngọc Duy 2013-02-16 18:11 ` Pete Wyckoff @ 2013-02-17 23:18 ` Junio C Hamano 2013-02-25 22:01 ` [PATCH/RFC] dir.c: Make git-status --ignored even more consistent Karsten Blees 2 siblings, 0 replies; 12+ messages in thread From: Junio C Hamano @ 2013-02-17 23:18 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy Cc: git, Karsten Blees, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes: > If path_handled is returned, contents goes up. And if check_only is > true, the loop could be broken early. These will not happen when > treat_one_path (and its wrapper treat_path) returns > path_ignored. dir_add_name internally does a cache_name_exists() check > so it makes no difference. > > To avoid this behavior change, treat_one_path is instructed to skip > the optimization when check_only or contents is used. OK, that makes more understandable why this is safe. > @@ -1242,9 +1246,23 @@ enum path_treatment { > static enum path_treatment treat_one_path(struct dir_struct *dir, > struct strbuf *path, > const struct path_simplify *simplify, > - int dtype, struct dirent *de) > + int dtype, struct dirent *de, > + int exclude_shortcut_ok) > { > ... > @@ -1331,18 +1347,29 @@ static int read_directory_recursive(struct dir_struct *dir, > goto out; > > while ((de = readdir(fdir)) != NULL) { > - switch (treat_path(dir, de, &path, baselen, simplify)) { > + switch (treat_path(dir, de, &path, baselen, > + simplify, > + !check_only && !contents)) { > ... Between these two places we may want to say what kind of short-cut we are talking about, but there only is one kind of short-cut at this moment, so let's leave that to other people who want to further optimize things in this codepath by adding other short-cuts. Thanks; will queue. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH/RFC] dir.c: Make git-status --ignored even more consistent 2013-02-16 7:17 ` [PATCH v2] " Nguyễn Thái Ngọc Duy 2013-02-16 18:11 ` Pete Wyckoff 2013-02-17 23:18 ` Junio C Hamano @ 2013-02-25 22:01 ` Karsten Blees 2 siblings, 0 replies; 12+ messages in thread From: Karsten Blees @ 2013-02-25 22:01 UTC (permalink / raw) To: Nguyễn Thái Ngọc Duy Cc: git, Junio C Hamano, kusmabite, Ramkumar Ramachandra, Robert Zeh, finnag, apelisse, peff The new "git-status --ignored" handling introduced with 721ac4ed "dir.c: Make git-status --ignored more consistent" and a45fb697 "status: always report ignored tracked directories" still has a few flaws in the "--untracked-files=normal" case: - It lists directories that match the exclude pattern, even if they are tracked, instead of the untracked files and directories that are _affected_ by the exclude pattern. This is inconsistent with the listing of untracked files and directories. Additionally, an entire (tracked) directory may be listed as ignored while contained files are listed as modified. - With an untracked directory between the ignored directory and files, the directory is dropped. - With a tracked directory between the ignored directory and files, both the directory and the individual files are listed as ignored. Change "git-status --ignored --untracked-files=normal" so that it no longer lists tracked directories. This is already in line with gitignore(5) and api-directory-listing.txt, so we don't need to update documentation. In the git-status case, always use is_path_excluded() instead of is_excluded(). The latter doesn't check if one of the parent directories is excluded, and we need the full picture when recursing into ignored directories. As is_path_excluded() is even more complex, only do this for untracked files and directories. Keep the original is_excluded() check in the DIR_COLLECT_IGNORED case so that git-add is not affected. In read_directory_recursive, pass along the check_only parameter when recursing into sub directories, so that we don't accidentally call dir_add_name in the check_only case. Signed-off-by: Karsten Blees <blees@dcon.de> --- See also: https://github.com/kblees/git/commits/kb/git-status-ignored Revisiting dir.c, I noticed that the added complexity in .gitignore processing was just recently introduced by these two changes: - a45fb697 status: always report ignored tracked directories - 721ac4ed dir.c: Make git-status --ignored more consistent Instead of skipping gitignore checks under very special 'safe' circumstances, this patch focuses on fixing pre-existing bugs first. The correctness of lazy gitignore checks is then obvious (I hope) by the very definition of gitignore(5): "gitignore - Specifies intentionally *untracked* files to ignore" [emphasis added]. There's still lots of room for improvement, e.g.: - prevent "git-status --ignored" from scanning everything twice - integrate struct path_exclude_check into struct dir_struct to save setup costs of is_path_excluded Cheers, Karsten dir.c | 138 ++++++++++++++++++++------------------------- t/t7061-wtstatus-ignore.sh | 65 +++++++++++++++++++-- wt-status.c | 2 +- 3 files changed, 123 insertions(+), 82 deletions(-) diff --git a/dir.c b/dir.c index 57394e4..1a5440f 100644 --- a/dir.c +++ b/dir.c @@ -884,6 +884,17 @@ int is_path_excluded(struct path_exclude_check *check, return 0; } +static int check_path_excluded(struct dir_struct *dir, + const char *name, int namelen, int *dtype) +{ + struct path_exclude_check check; + int excluded; + path_exclude_check_init(&check, dir); + excluded = is_path_excluded(&check, name, namelen, dtype); + path_exclude_check_clear(&check); + return excluded; +} + static struct dir_entry *dir_entry_new(const char *pathname, int len) { struct dir_entry *ent; @@ -897,8 +908,7 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len) static struct dir_entry *dir_add_name(struct dir_struct *dir, const char *pathname, int len) { - if (!(dir->flags & DIR_SHOW_IGNORED) && - cache_name_exists(pathname, len, ignore_case)) + if (cache_name_exists(pathname, len, ignore_case)) return NULL; ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc); @@ -1000,9 +1010,8 @@ static enum exist_status directory_exists_in_index(const char *dirname, int len) * traversal routine. * * Case 1: If we *already* have entries in the index under that - * directory name, we recurse into the directory to see all the files, - * unless the directory is excluded and we want to show ignored - * directories + * directory name, we always recurse into the directory to see + * all the files. * * Case 2: If we *already* have that directory name as a gitlink, * we always continue to see it as a gitlink, regardless of whether @@ -1016,9 +1025,9 @@ static enum exist_status directory_exists_in_index(const char *dirname, int len) * just a directory, unless "hide_empty_directories" is * also true and the directory is empty, in which case * we just ignore it entirely. - * if we are looking for ignored directories, look if it - * contains only ignored files to decide if it must be shown as - * ignored or not. + * if we are looking for ignored directories, we also hide + * directories with untracked files (i.e. that are already + * listed in the untracked section). * (b) if it looks like a git directory, and we don't have * 'no_gitlinks' set we treat it as a gitlink, and show it * as a directory. @@ -1031,15 +1040,13 @@ enum directory_treatment { }; static enum directory_treatment treat_directory(struct dir_struct *dir, - const char *dirname, int len, int exclude, - const struct path_simplify *simplify) + const char *dirname, int len, const struct path_simplify *simplify) { + int dt = DT_DIR, exclude; + /* The "len-1" is to strip the final '/' */ switch (directory_exists_in_index(dirname, len-1)) { case index_directory: - if ((dir->flags & DIR_SHOW_OTHER_DIRECTORIES) && exclude) - break; - return recurse_into_directory; case index_gitdir: @@ -1060,65 +1067,61 @@ static enum directory_treatment treat_directory(struct dir_struct *dir, /* This is the "show_other_directories" case */ + exclude = check_path_excluded(dir, dirname, len-1, &dt); + /* * We are looking for ignored files and our directory is not ignored, * check if it contains only ignored files */ if ((dir->flags & DIR_SHOW_IGNORED) && !exclude) { - int ignored; - dir->flags &= ~DIR_SHOW_IGNORED; - dir->flags |= DIR_HIDE_EMPTY_DIRECTORIES; - ignored = read_directory_recursive(dir, dirname, len, 1, simplify); - dir->flags &= ~DIR_HIDE_EMPTY_DIRECTORIES; - dir->flags |= DIR_SHOW_IGNORED; - - return ignored ? ignore_directory : show_directory; + int untracked; + dir->flags &= ~(DIR_SHOW_IGNORED|DIR_SHOW_OTHER_DIRECTORIES); + untracked = read_directory_recursive(dir, dirname, len, 1, simplify); + dir->flags |= (DIR_SHOW_IGNORED|DIR_SHOW_OTHER_DIRECTORIES); + + /* + * Don't list the directory as ignored if it is already listed + * as untracked. + */ + if (untracked) + return ignore_directory; } - if (!(dir->flags & DIR_SHOW_IGNORED) && - !(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) + + if (!(dir->flags & DIR_SHOW_IGNORED) && exclude) + return ignore_directory; + + if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) return show_directory; if (!read_directory_recursive(dir, dirname, len, 1, simplify)) return ignore_directory; return show_directory; } +enum path_treatment { + path_ignored, + path_handled, + path_recurse +}; + /* * Decide what to do when we find a file while traversing the - * filesystem. Mostly two cases: - * - * 1. We are looking for ignored files - * (a) File is ignored, include it - * (b) File is in ignored path, include it - * (c) File is not ignored, exclude it + * filesystem. * - * 2. Other scenarios, include the file if not excluded + * Case 1: The file is in the index: always ignore. * - * Return 1 for exclude, 0 for include. + * Case 2: The file is not in the index: check if the file is excluded. */ -static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude, int *dtype) +static enum path_treatment treat_file(struct dir_struct *dir, + struct strbuf *path, const struct path_simplify *simplify, + int *dtype) { - struct path_exclude_check check; - int exclude_file = 0; - - if (exclude) - exclude_file = !(dir->flags & DIR_SHOW_IGNORED); - else if (dir->flags & DIR_SHOW_IGNORED) { - /* Always exclude indexed files */ - struct cache_entry *ce = index_name_exists(&the_index, - path->buf, path->len, ignore_case); - - if (ce) - return 1; - - path_exclude_check_init(&check, dir); - - if (!is_path_excluded(&check, path->buf, path->len, dtype)) - exclude_file = 1; - - path_exclude_check_clear(&check); - } + if (cache_name_exists(path->buf, path->len, ignore_case)) + return path_ignored; - return exclude_file; + if (check_path_excluded(dir, path->buf, path->len, dtype) != + !!(dir->flags & DIR_SHOW_IGNORED)) + return path_ignored; + return path_handled; } /* @@ -1233,29 +1236,16 @@ static int get_dtype(struct dirent *de, const char *path, int len) return dtype; } -enum path_treatment { - path_ignored, - path_handled, - path_recurse -}; - static enum path_treatment treat_one_path(struct dir_struct *dir, struct strbuf *path, const struct path_simplify *simplify, int dtype, struct dirent *de) { - int exclude = is_excluded(dir, path->buf, &dtype); - if (exclude && (dir->flags & DIR_COLLECT_IGNORED) + if ((dir->flags & DIR_COLLECT_IGNORED) + && is_excluded(dir, path->buf, &dtype) && exclude_matches_pathspec(path->buf, path->len, simplify)) dir_add_ignored(dir, path->buf, path->len); - /* - * Excluded? If we don't explicitly want to show - * ignored files, ignore it - */ - if (exclude && !(dir->flags & DIR_SHOW_IGNORED)) - return path_ignored; - if (dtype == DT_UNKNOWN) dtype = get_dtype(de, path->buf, path->len); @@ -1265,7 +1255,7 @@ static enum path_treatment treat_one_path(struct dir_struct *dir, case DT_DIR: strbuf_addch(path, '/'); - switch (treat_directory(dir, path->buf, path->len, exclude, simplify)) { + switch (treat_directory(dir, path->buf, path->len, simplify)) { case show_directory: break; case recurse_into_directory: @@ -1276,12 +1266,7 @@ static enum path_treatment treat_one_path(struct dir_struct *dir, break; case DT_REG: case DT_LNK: - switch (treat_file(dir, path, exclude, &dtype)) { - case 1: - return path_ignored; - default: - break; - } + return treat_file(dir, path, simplify, &dtype); } return path_handled; } @@ -1334,8 +1319,7 @@ static int read_directory_recursive(struct dir_struct *dir, switch (treat_path(dir, de, &path, baselen, simplify)) { case path_recurse: contents += read_directory_recursive(dir, path.buf, - path.len, 0, - simplify); + path.len, check_only, simplify); continue; case path_ignored: continue; diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh index 0da1214..7a73448 100755 --- a/t/t7061-wtstatus-ignore.sh +++ b/t/t7061-wtstatus-ignore.sh @@ -96,7 +96,7 @@ cat >expected <<\EOF ?? expected EOF -test_expect_success 'status ignored tracked directory with --ignore' ' +test_expect_success 'status tracked directory with --ignore' ' rm -rf untracked-ignored && mkdir tracked && : >tracked/committed && @@ -113,7 +113,7 @@ cat >expected <<\EOF ?? expected EOF -test_expect_success 'status ignored tracked directory with --ignore -u' ' +test_expect_success 'status tracked directory with --ignore -u' ' git status --porcelain --ignored -u >actual && test_cmp expected actual ' @@ -122,10 +122,10 @@ cat >expected <<\EOF ?? .gitignore ?? actual ?? expected -!! tracked/ +!! tracked/uncommitted EOF -test_expect_success 'status ignored tracked directory and uncommitted file with --ignore' ' +test_expect_success 'status tracked directory and uncommitted file with --ignore' ' : >tracked/uncommitted && git status --porcelain --ignored >actual && test_cmp expected actual @@ -138,6 +138,63 @@ cat >expected <<\EOF !! tracked/uncommitted EOF +test_expect_success 'status tracked directory and uncommitted file with --ignore -u' ' + git status --porcelain --ignored -u >actual && + test_cmp expected actual +' + +cat >expected <<\EOF +?? .gitignore +?? actual +?? expected +!! tracked/untracked/ +EOF + +test_expect_success 'status ignored untracked directory and uncommitted file with --ignore' ' + rm -rf tracked/uncommitted && + mkdir tracked/untracked && + : >tracked/untracked/uncommitted && + git status --porcelain --ignored >actual && + test_cmp expected actual +' + +cat >expected <<\EOF +?? .gitignore +?? actual +?? expected +!! tracked/untracked/uncommitted +EOF + +test_expect_success 'status ignored untracked directory and uncommitted file with --ignore -u' ' + git status --porcelain --ignored -u >actual && + test_cmp expected actual +' + +cat >expected <<\EOF +?? .gitignore +?? actual +?? expected +!! tracked/tracked/uncommitted +EOF + +test_expect_success 'status ignored tracked directory and uncommitted file with --ignore' ' + rm -rf tracked/untracked && + mkdir tracked/tracked && + : >tracked/tracked/committed && + : >tracked/tracked/uncommitted && + git add -f tracked/tracked/committed && + git commit -m. && + git status --porcelain --ignored >actual && + test_cmp expected actual +' + +cat >expected <<\EOF +?? .gitignore +?? actual +?? expected +!! tracked/tracked/uncommitted +EOF + test_expect_success 'status ignored tracked directory and uncommitted file with --ignore -u' ' git status --porcelain --ignored -u >actual && test_cmp expected actual diff --git a/wt-status.c b/wt-status.c index ef405d0..79eb124 100644 --- a/wt-status.c +++ b/wt-status.c @@ -518,7 +518,7 @@ static void wt_status_collect_untracked(struct wt_status *s) dir.nr = 0; dir.flags = DIR_SHOW_IGNORED; if (s->show_untracked_files != SHOW_ALL_UNTRACKED_FILES) - dir.flags |= DIR_SHOW_OTHER_DIRECTORIES; + dir.flags |= DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; fill_directory(&dir, s->pathspec); for (i = 0; i < dir.nr; i++) { struct dir_entry *ent = dir.entries[i]; -- 1.8.1.2.7988.ge3e3ca2 ^ permalink raw reply related [flat|nested] 12+ messages in thread
end of thread, other threads:[~2013-02-25 22:01 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-02-15 14:17 [PATCH] read_directory: avoid invoking exclude machinery on tracked files Nguyễn Thái Ngọc Duy 2013-02-15 16:52 ` Junio C Hamano 2013-02-15 18:30 ` Duy Nguyen 2013-02-15 19:32 ` Junio C Hamano 2013-02-16 3:31 ` Duy Nguyen 2013-02-18 16:42 ` Karsten Blees 2013-02-16 7:17 ` [PATCH v2] " Nguyễn Thái Ngọc Duy 2013-02-16 18:11 ` Pete Wyckoff 2013-02-17 4:39 ` Duy Nguyen 2013-02-17 15:49 ` Pete Wyckoff 2013-02-17 23:18 ` Junio C Hamano 2013-02-25 22:01 ` [PATCH/RFC] dir.c: Make git-status --ignored even more consistent Karsten Blees
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).