* [PATCH 1/3] Unindent excluded_from_list()
2012-05-26 12:31 [PATCH WIP 0/3] top-level gitignore considered less harmful Nguyễn Thái Ngọc Duy
@ 2012-05-26 12:31 ` Nguyễn Thái Ngọc Duy
2012-05-26 12:31 ` [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch Nguyễn Thái Ngọc Duy
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-05-26 12:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Nguyễn Thái Ngọc Duy
Return early if el->nr == 0. Unindent one more level for FNM_PATHNAME
code block as this block is getting complex and may need more
indentation.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
dir.c | 96 +++++++++++++++++++++++++++++++++----------------------------------
1 file changed, 48 insertions(+), 48 deletions(-)
diff --git a/dir.c b/dir.c
index b65d37c..8535cf2 100644
--- a/dir.c
+++ b/dir.c
@@ -508,56 +508,56 @@ int excluded_from_list(const char *pathname,
{
int i;
- if (el->nr) {
- for (i = el->nr - 1; 0 <= i; i--) {
- struct exclude *x = el->excludes[i];
- const char *exclude = x->pattern;
- int to_exclude = x->to_exclude;
-
- if (x->flags & EXC_FLAG_MUSTBEDIR) {
- if (*dtype == DT_UNKNOWN)
- *dtype = get_dtype(NULL, pathname, pathlen);
- if (*dtype != DT_DIR)
- continue;
- }
+ if (!el->nr)
+ return -1; /* undefined */
+
+ for (i = el->nr - 1; 0 <= i; i--) {
+ struct exclude *x = el->excludes[i];
+ const char *exclude = x->pattern;
+ int to_exclude = x->to_exclude;
+
+ if (x->flags & EXC_FLAG_MUSTBEDIR) {
+ if (*dtype == DT_UNKNOWN)
+ *dtype = get_dtype(NULL, pathname, pathlen);
+ if (*dtype != DT_DIR)
+ continue;
+ }
- if (x->flags & EXC_FLAG_NODIR) {
- /* match basename */
- if (x->flags & EXC_FLAG_NOWILDCARD) {
- if (!strcmp_icase(exclude, basename))
- return to_exclude;
- } else if (x->flags & EXC_FLAG_ENDSWITH) {
- if (x->patternlen - 1 <= pathlen &&
- !strcmp_icase(exclude + 1, pathname + pathlen - x->patternlen + 1))
- return to_exclude;
- } else {
- if (fnmatch_icase(exclude, basename, 0) == 0)
- return to_exclude;
- }
- }
- else {
- /* match with FNM_PATHNAME:
- * exclude has base (baselen long) implicitly
- * in front of it.
- */
- int baselen = x->baselen;
- if (*exclude == '/')
- exclude++;
-
- if (pathlen < baselen ||
- (baselen && pathname[baselen-1] != '/') ||
- strncmp_icase(pathname, x->base, baselen))
- continue;
-
- if (x->flags & EXC_FLAG_NOWILDCARD) {
- if (!strcmp_icase(exclude, pathname + baselen))
- return to_exclude;
- } else {
- if (fnmatch_icase(exclude, pathname+baselen,
- FNM_PATHNAME) == 0)
- return to_exclude;
- }
+ if (x->flags & EXC_FLAG_NODIR) {
+ /* match basename */
+ if (x->flags & EXC_FLAG_NOWILDCARD) {
+ if (!strcmp_icase(exclude, basename))
+ return to_exclude;
+ } else if (x->flags & EXC_FLAG_ENDSWITH) {
+ if (x->patternlen - 1 <= pathlen &&
+ !strcmp_icase(exclude + 1, pathname + pathlen - x->patternlen + 1))
+ return to_exclude;
+ } else {
+ if (fnmatch_icase(exclude, basename, 0) == 0)
+ return to_exclude;
}
+ continue;
+ }
+
+
+ /* match with FNM_PATHNAME:
+ * exclude has base (baselen long) implicitly in front of it.
+ */
+ if (*exclude == '/')
+ exclude++;
+
+ if (pathlen < x->baselen ||
+ (x->baselen && pathname[x->baselen-1] != '/') ||
+ strncmp_icase(pathname, x->base, x->baselen))
+ continue;
+
+ if (x->flags & EXC_FLAG_NOWILDCARD) {
+ if (!strcmp_icase(exclude, pathname + x->baselen))
+ return to_exclude;
+ } else {
+ if (fnmatch_icase(exclude, pathname+x->baselen,
+ FNM_PATHNAME) == 0)
+ return to_exclude;
}
}
return -1; /* undecided */
--
1.7.10.2.549.g9354186
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch
2012-05-26 12:31 [PATCH WIP 0/3] top-level gitignore considered less harmful Nguyễn Thái Ngọc Duy
2012-05-26 12:31 ` [PATCH 1/3] Unindent excluded_from_list() Nguyễn Thái Ngọc Duy
@ 2012-05-26 12:31 ` Nguyễn Thái Ngọc Duy
2012-05-27 6:51 ` Junio C Hamano
2012-05-29 18:03 ` Junio C Hamano
2012-05-26 12:31 ` [PATCH 3/3] exclude: reduce computation cost on checking dirname in patterns Nguyễn Thái Ngọc Duy
2012-05-26 13:25 ` [PATCH WIP 0/3] top-level gitignore considered less harmful Nguyen Thai Ngoc Duy
3 siblings, 2 replies; 14+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-05-26 12:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Nguyễn Thái Ngọc Duy
This also avoids calling fnmatch() if the non-wildcard prefix is
longer than the remaining pathname.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
dir.c | 41 +++++++++++++++++++++++++++--------------
dir.h | 2 +-
2 files changed, 28 insertions(+), 15 deletions(-)
diff --git a/dir.c b/dir.c
index 8535cf2..50d744f 100644
--- a/dir.c
+++ b/dir.c
@@ -295,9 +295,11 @@ int match_pathspec_depth(const struct pathspec *ps,
return retval;
}
+const char *wildcards = "*?[{\\";
+
static int no_wildcard(const char *string)
{
- return string[strcspn(string, "*?[{\\")] == '\0';
+ return string[strcspn(string, wildcards)] == '\0';
}
void add_exclude(const char *string, const char *base,
@@ -336,8 +338,7 @@ void add_exclude(const char *string, const char *base,
x->flags = flags;
if (!strchr(string, '/'))
x->flags |= EXC_FLAG_NODIR;
- if (no_wildcard(string))
- x->flags |= EXC_FLAG_NOWILDCARD;
+ x->nowildcardlen = strcspn(string, wildcards);
if (*string == '*' && no_wildcard(string+1))
x->flags |= EXC_FLAG_ENDSWITH;
ALLOC_GROW(which->excludes, which->nr + 1, which->alloc);
@@ -513,8 +514,9 @@ int excluded_from_list(const char *pathname,
for (i = el->nr - 1; 0 <= i; i--) {
struct exclude *x = el->excludes[i];
- const char *exclude = x->pattern;
+ const char *name, *exclude = x->pattern;
int to_exclude = x->to_exclude;
+ int namelen, prefix = x->nowildcardlen;
if (x->flags & EXC_FLAG_MUSTBEDIR) {
if (*dtype == DT_UNKNOWN)
@@ -525,7 +527,7 @@ int excluded_from_list(const char *pathname,
if (x->flags & EXC_FLAG_NODIR) {
/* match basename */
- if (x->flags & EXC_FLAG_NOWILDCARD) {
+ if (prefix == x->patternlen) {
if (!strcmp_icase(exclude, basename))
return to_exclude;
} else if (x->flags & EXC_FLAG_ENDSWITH) {
@@ -539,26 +541,37 @@ int excluded_from_list(const char *pathname,
continue;
}
-
/* match with FNM_PATHNAME:
* exclude has base (baselen long) implicitly in front of it.
*/
- if (*exclude == '/')
+ if (*exclude == '/') {
exclude++;
+ prefix--;
+ }
if (pathlen < x->baselen ||
(x->baselen && pathname[x->baselen-1] != '/') ||
strncmp_icase(pathname, x->base, x->baselen))
continue;
- if (x->flags & EXC_FLAG_NOWILDCARD) {
- if (!strcmp_icase(exclude, pathname + x->baselen))
- return to_exclude;
- } else {
- if (fnmatch_icase(exclude, pathname+x->baselen,
- FNM_PATHNAME) == 0)
- return to_exclude;
+ namelen = x->baselen ? pathlen - x->baselen : pathlen;
+ name = pathname + pathlen - namelen;
+
+ /* if the non-wildcard part is longer than the
+ remaining pathname, surely it cannot match */
+ if (prefix > namelen)
+ continue;
+
+ if (prefix) {
+ if (strncmp_icase(exclude, name, prefix))
+ continue;
+ exclude += prefix;
+ name += prefix;
+ namelen -= prefix;
}
+
+ if (!namelen || !fnmatch_icase(exclude, name, FNM_PATHNAME))
+ return to_exclude;
}
return -1; /* undecided */
}
diff --git a/dir.h b/dir.h
index 58b6fc7..39fc145 100644
--- a/dir.h
+++ b/dir.h
@@ -7,7 +7,6 @@ struct dir_entry {
};
#define EXC_FLAG_NODIR 1
-#define EXC_FLAG_NOWILDCARD 2
#define EXC_FLAG_ENDSWITH 4
#define EXC_FLAG_MUSTBEDIR 8
@@ -17,6 +16,7 @@ struct exclude_list {
struct exclude {
const char *pattern;
int patternlen;
+ int nowildcardlen;
const char *base;
int baselen;
int to_exclude;
--
1.7.10.2.549.g9354186
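For readers who want the idea without the surrounding dir.c context, here is a minimal standalone sketch of "compare the literal prefix first, then hand only the remainder to fnmatch()". It is not the code from the patch: match_with_literal_prefix() is a hypothetical name, it uses plain strncmp() and POSIX fnmatch() instead of the _icase variants, and it ignores the base/baselen handling entirely.

```c
#include <fnmatch.h>
#include <string.h>

/* Characters treated as glob-special, same set as in the patch. */
static const char wildcards[] = "*?[{\\";

/*
 * Hypothetical helper: returns 1 if pattern matches name, 0 otherwise.
 * The leading part of the pattern that has no wildcard is compared
 * with strncmp(); only what is left goes through fnmatch().
 */
static int match_with_literal_prefix(const char *pattern, const char *name)
{
	size_t prefix = strcspn(pattern, wildcards);
	size_t namelen = strlen(name);

	/* a literal prefix longer than the name can never match */
	if (prefix > namelen)
		return 0;
	if (strncmp(pattern, name, prefix))
		return 0;
	/* always let fnmatch() judge the remainder, even if it is empty */
	return fnmatch(pattern + prefix, name + prefix, FNM_PATHNAME) == 0;
}
```

The sketch always passes the remainder to fnmatch() instead of short-circuiting when the name is used up, so that a pattern such as "foo?" does not match the name "foo".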
* Re: [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch
2012-05-26 12:31 ` [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch Nguyễn Thái Ngọc Duy
@ 2012-05-27 6:51 ` Junio C Hamano
2012-05-27 12:06 ` Nguyen Thai Ngoc Duy
2012-05-29 18:03 ` Junio C Hamano
1 sibling, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2012-05-27 6:51 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> this also avoids calling fnmatch() if the non-wildcard prefix is
> longer than basename
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
I have been wondering if you can take a different approach based on the
same observation this patch is based on. If you see an entry /foo/bar/*.c
in the top-level .gitignore, perhaps you can set it aside in a different
part of "struct exclude" for the top-level directory (because the pattern
will never match outside foo/bar directory), so that it is not even used
for matching, and only when you descend to foo/bar directory, add "/*.c"
to the "struct exclude" you create for that directory.
That way, instead of "strcmp is faster than fnmatch, but we always compare
all elements in the huge pattern list given at the toplevel", you would be
doing "we do not even bother to compare with the elements we know do not
matter", which would be far more efficient, no?
* Re: [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch
2012-05-27 6:51 ` Junio C Hamano
@ 2012-05-27 12:06 ` Nguyen Thai Ngoc Duy
2012-05-27 18:14 ` Junio C Hamano
0 siblings, 1 reply; 14+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-05-27 12:06 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jeff King
On Sun, May 27, 2012 at 1:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> this also avoids calling fnmatch() if the non-wildcard prefix is
>> longer than basename
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>
> I have been wondering if you can take a different approach based on the
> same observation this patch is based on. If you see an entry /foo/bar/*.c
> in the top-level .gitignore, perhaps you can set it aside in a different
> part of "struct exclude" for the top-level directory (because the pattern
> will never match outside foo/bar directory), so that it is not even used
> for matching, and only when you descend to foo/bar directory, add "/*.c"
> to the "struct exclude" you create for that directory.
That part is the "base" field in "struct exclude", I believe.
> That way, instead of "strcmp is faster than fnmatch, but we always compare
> all elements in the huge pattern list given at the toplevel", you would be
> doing "we do not even bother to compare with the elements we know do not
> matter", which would be far more efficient, no?
You still have to do at least one strncmp on "base" though to know if
a pattern is applicable to the given directory. So it's not really
cheaper than what is done in 3/3.
--
Duy
* Re: [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch
2012-05-27 12:06 ` Nguyen Thai Ngoc Duy
@ 2012-05-27 18:14 ` Junio C Hamano
2012-05-28 1:03 ` Nguyen Thai Ngoc Duy
2012-05-28 5:02 ` Junio C Hamano
0 siblings, 2 replies; 14+ messages in thread
From: Junio C Hamano @ 2012-05-27 18:14 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: git, Jeff King
Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>> I have been wondering if you can take a different approach based on the
>> same observation this patch is based on. If you see an entry /foo/bar/*.c
>> in the top-level .gitignore, perhaps you can set it aside in a different
>> part of "struct exclude" for the top-level directory (because the pattern
>> will never match outside foo/bar directory), so that it is not even used
>> for matching, and only when you descend to foo/bar directory, add "/*.c"
>> to the "struct exclude" you create for that directory.
>
> that part is "base" field in "struct exclude", I believe.
Sorry, I misspoke; it is not about struct exclude at all.
>> That way, instead of "strcmp is faster than fnmatch, but we always compare
>> all elements in the huge pattern list given at the toplevel", you would be
>> doing "we do not even bother to compare with the elements we know do not
>> matter", which would be far more efficient, no?
>
> You still have to do at least one strncmp on "base" though to know if
> a pattern is applicable to the given directory. So it's not really
> cheaper than what is done in 3/3.
Actually I was referring to the exclude_stack.
Suppose you have .gitignore file at the top that lists /foo/bar/*.c
(among other millions of patterns anchored to specific directory),
and another in the foo/bar directory. When you are looking at a
path in the top-level, currently the exclude_stack would have one
element, per-directory one for .gitignore at the top, that has
millions of patterns that would never match. And then when you
descend into foo/bar directory, prep_exclude would link two elements
(one for foo/ directory which may be empty, another for foo/bar
directory) to this, and then you check paths you see in foo/bar
directory using all the elements that appear in the exclude_stack.
What I was suggesting was that you could choose not to add
/foo/bar/*.c entry in the exclude_stack element for the top-level
(but remember you did so), and then inside prep_exclude() when you
look at different directory, e.g. foo/bar, notice that higher level
(i.e. toplevel in this example) has such deferred patterns that
apply to the new directory. Then instead of adding /foo/bar/*.c
at the top-level, you can pretend as if /*.c appeared in .gitignore
file in the deeper level in the hierarchy.
And this does not happen per path you check; exclude_stack used by
excluded() is designed to take advantage of the access pattern that
we tend to check paths from the same directory together, so such an
adjustment will be per directory switching (i.e. it will be part of
the prep_exclude() overhead that is amortized over paths you walk).
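To make the "pretend as if /*.c appeared in the deeper .gitignore" step concrete, here is a rough sketch of how an anchored pattern could be split into the directory it is confined to and the pattern to apply there. This is only an illustration under stated assumptions, not git code: deferred_rest() is a hypothetical name, and the sketch only defers across the part of the pattern that has no wildcard.

```c
#include <string.h>

static const char wildcards[] = "*?[{\\";

/*
 * Hypothetical helper.  For an anchored pattern such as "/foo/bar/*.c",
 * return a pointer to the part to apply once we are inside the
 * directory named by the literal prefix ("/*.c" here, with the
 * directory being "foo/bar").  Return NULL if the pattern is not
 * anchored or has no literal directory part, i.e. it has to stay at
 * the level where it was read.
 */
static const char *deferred_rest(const char *pattern)
{
	const char *p, *last = NULL;
	size_t literal, i;

	if (*pattern != '/')
		return NULL;	/* may match at any depth; cannot defer */
	p = pattern + 1;
	literal = strcspn(p, wildcards);
	for (i = 0; i < literal; i++)
		/* ignore a trailing slash, it only means "must be a dir" */
		if (p[i] == '/' && p[i + 1] != '\0')
			last = p + i;
	return last;
}
```

With something like this, prep_exclude() would only need to look at the deferred patterns when it switches directories, not for every path it checks.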
* Re: [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch
2012-05-27 18:14 ` Junio C Hamano
@ 2012-05-28 1:03 ` Nguyen Thai Ngoc Duy
2012-05-28 5:02 ` Junio C Hamano
1 sibling, 0 replies; 14+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-05-28 1:03 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jeff King
On Mon, May 28, 2012 at 1:14 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Actually I was referring to the exclude_stack.
>
> Suppose you have .gitignore file at the top that lists /foo/bar/*.c
> (among other millions of patterns anchored to specific directory),
> and another in the foo/bar directory. When you are looking at a
> path in the top-level, currently the exclude_stack would have one
> element, per-directory one for .gitignore at the top, that has
> millions of patterns that would never match. And then when you
> descend into foo/bar directory, prep_exclude would link two elements
> (one for foo/ directory which may be empty, another for foo/bar
> directory) to this, and then you check paths you see in foo/bar
> directory using all the elements that appear in the exclude_stack.
>
> What I was suggesting was that you could choose not to add
> /foo/bar/*.c entry in the exclude_stack element for the top-level
> (but remember you did so), and then inside prep_exclude() when you
> look at different directory, e.g. foo/bar, notice that higher level
> (i.e. toplevel in this example) has such deferred patterns that
> apply to the new directory. Then instead of adding /foo/bar/*.c
> at the top-level, you can pretend as if /*.c appeared in .gitignore
> file in the deeper level in the hierarchy.
>
> And this does not happen per path you check; exclude_stack used by
> excluded() is designed to take advantage of the access pattern that
> we tend to check paths from the same directory together, so such an
> adjustment will be per directory switching (i.e. it will be part of
> the prep_exclude() overhead that is amortized over paths you walk).
That's perhaps a better approach. Two points (for me to think about
again after work):
- That involves reordering the stack, after we move some patterns from
the top-level file to a deeper level, so that we always have the
gitignore "files" of the root dir first, then the 1st level, and so on,
and popping keeps working without major changes.
- The implications of pattern reordering (we should probably keep the
order we currently have: suppose we are checking in "sub", then the
sub/... patterns from the top level go first, then those from
sub/.gitignore).
--
Duy
* Re: [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch
2012-05-27 18:14 ` Junio C Hamano
2012-05-28 1:03 ` Nguyen Thai Ngoc Duy
@ 2012-05-28 5:02 ` Junio C Hamano
1 sibling, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2012-05-28 5:02 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: git, Jeff King
Junio C Hamano <gitster@pobox.com> writes:
> And this does not happen per path you check; exclude_stack used by
> excluded() is designed to take advantage of the access pattern that
> we tend to check paths from the same directory together, so such an
> adjustment will be per directory switching (i.e. it will be part of
> the prep_exclude() overhead that is amortized over paths you walk).
Just in case it wasn't clear, I didn't mean to say "mine is the
right way to do so; I will reject your patch that doesn't do so".
* Re: [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch
2012-05-26 12:31 ` [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch Nguyễn Thái Ngọc Duy
2012-05-27 6:51 ` Junio C Hamano
@ 2012-05-29 18:03 ` Junio C Hamano
2012-05-29 18:21 ` Thiago Farina
1 sibling, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2012-05-29 18:03 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> this also avoids calling fnmatch() if the non-wildcard prefix is
> longer than basename
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> dir.c | 41 +++++++++++++++++++++++++++--------------
> dir.h | 2 +-
> 2 files changed, 28 insertions(+), 15 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index 8535cf2..50d744f 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -295,9 +295,11 @@ int match_pathspec_depth(const struct pathspec *ps,
> return retval;
> }
>
> +const char *wildcards = "*?[{\\";
Elsewhere in this file, the logic to notice the non-wildcard part of
the pathspec uses is_glob_special(). Shouldn't the new code that
uses this do the same?
* Re: [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch
2012-05-29 18:03 ` Junio C Hamano
@ 2012-05-29 18:21 ` Thiago Farina
0 siblings, 0 replies; 14+ messages in thread
From: Thiago Farina @ 2012-05-29 18:21 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git, Jeff King
On Tue, May 29, 2012 at 3:03 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
>
>> this also avoids calling fnmatch() if the non-wildcard prefix is
>> longer than basename
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
>> ---
>> dir.c | 41 +++++++++++++++++++++++++++--------------
>> dir.h | 2 +-
>> 2 files changed, 28 insertions(+), 15 deletions(-)
>>
>> diff --git a/dir.c b/dir.c
>> index 8535cf2..50d744f 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -295,9 +295,11 @@ int match_pathspec_depth(const struct pathspec *ps,
>> return retval;
>> }
>>
>> +const char *wildcards = "*?[{\\";
>
nit: can this be const char wildcards[] = "..."; ?
Also, an unrelated question: is there a style guide for naming
constants like this? In the Chromium project we write them like kFoo.
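For illustration, a small sketch of what the array form could look like. It is an assumption about how the suggestion might be applied, not code from the series; literal_prefix_len() is a hypothetical helper name, while no_wildcard() mirrors the existing function in dir.c.

```c
#include <string.h>

/*
 * An array rather than a pointer: the characters themselves are the
 * const object, there is no separate pointer variable that could be
 * reassigned, and sizeof(wildcards) gives the string length plus one.
 */
static const char wildcards[] = "*?[{\\";

/* Length of the leading part of string that contains no wildcard. */
static size_t literal_prefix_len(const char *string)
{
	return strcspn(string, wildcards);
}

static int no_wildcard(const char *string)
{
	return string[literal_prefix_len(string)] == '\0';
}
```

Making the array static also keeps the rather generic name "wildcards" out of the global namespace.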
* [PATCH 3/3] exclude: reduce computation cost on checking dirname in patterns
2012-05-26 12:31 [PATCH WIP 0/3] top-level gitignore considered less harmful Nguyễn Thái Ngọc Duy
2012-05-26 12:31 ` [PATCH 1/3] Unindent excluded_from_list() Nguyễn Thái Ngọc Duy
2012-05-26 12:31 ` [PATCH 2/3] exclude: do strcmp as much as possible before fnmatch Nguyễn Thái Ngọc Duy
@ 2012-05-26 12:31 ` Nguyễn Thái Ngọc Duy
2012-05-26 13:25 ` [PATCH WIP 0/3] top-level gitignore considered less harmful Nguyen Thai Ngoc Duy
3 siblings, 0 replies; 14+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-05-26 12:31 UTC (permalink / raw)
To: git; +Cc: Jeff King, Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
dir.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
dir.h | 3 +++
2 files changed, 57 insertions(+), 1 deletion(-)
diff --git a/dir.c b/dir.c
index 50d744f..ff5e2d9 100644
--- a/dir.c
+++ b/dir.c
@@ -507,7 +507,7 @@ int excluded_from_list(const char *pathname,
int pathlen, const char *basename, int *dtype,
struct exclude_list *el)
{
- int i;
+ int i, baselen = pathlen - (basename - pathname);
if (!el->nr)
return -1; /* undefined */
@@ -562,6 +562,35 @@ int excluded_from_list(const char *pathname,
if (prefix > namelen)
continue;
+ /*
+ * The caller is expected to pass a series of pathnames that
+ * share the same dirname to this function when el->samedir != 0.
+ *
+ * If we can check whether a pattern matches dirname, we can
+ * save the result and reuse it for the next pathnames. The caller
+ * must reset the cached dir-match bits when it moves to a
+ * different directory.
+ */
+ if (el->samedir && prefix >= namelen - baselen) {
+ int matched;
+ if (x->flags & EXC_FLAG_DIR_MATCH_VALID)
+ matched = x->flags & EXC_FLAG_DIR_MATCHED;
+ else {
+ matched = !strncmp_icase(exclude, name, namelen - baselen);
+ if (matched)
+ x->flags |= EXC_FLAG_DIR_MATCHED;
+ x->flags |= EXC_FLAG_DIR_MATCH_VALID;
+ }
+
+ if (!matched)
+ continue;
+
+ prefix -= namelen - baselen;
+ exclude += namelen - baselen;
+ name = basename;
+ namelen = baselen;
+ }
+
if (prefix) {
if (strncmp_icase(exclude, name, prefix))
continue;
@@ -576,6 +605,28 @@ int excluded_from_list(const char *pathname,
return -1; /* undecided */
}
+static void prep_exclude_read_directory(struct dir_struct *dir,
+ const struct strbuf *path)
+{
+ int i, st;
+ prep_exclude(dir, path->buf, path->len);
+ for (st = EXC_CMDL; st <= EXC_FILE; st++) {
+ struct exclude_list *el = dir->exclude_list + st;
+ el->samedir = 1;
+ for (i = 0; i < el->nr; i++)
+ el->excludes[i]->flags &= ~(EXC_FLAG_DIR_MATCH_VALID | EXC_FLAG_DIR_MATCHED);
+ }
+}
+
+static void cleanup_exclude_read_directory(struct dir_struct *dir)
+{
+ int st;
+ for (st = EXC_CMDL; st <= EXC_FILE; st++) {
+ struct exclude_list *el = dir->exclude_list + st;
+ el->samedir = 0;
+ }
+}
+
int excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
{
int pathlen = strlen(pathname);
@@ -985,6 +1036,7 @@ static int read_directory_recursive(struct dir_struct *dir,
return 0;
strbuf_add(&path, base, baselen);
+ prep_exclude_read_directory(dir, &path);
while ((de = readdir(fdir)) != NULL) {
switch (treat_path(dir, de, &path, baselen, simplify)) {
@@ -1005,6 +1057,7 @@ static int read_directory_recursive(struct dir_struct *dir,
dir_add_name(dir, path.buf, path.len);
}
exit_early:
+ cleanup_exclude_read_directory(dir);
closedir(fdir);
strbuf_release(&path);
diff --git a/dir.h b/dir.h
index 39fc145..003daf4 100644
--- a/dir.h
+++ b/dir.h
@@ -7,12 +7,15 @@ struct dir_entry {
};
#define EXC_FLAG_NODIR 1
+#define EXC_FLAG_DIR_MATCH_VALID 2
#define EXC_FLAG_ENDSWITH 4
#define EXC_FLAG_MUSTBEDIR 8
+#define EXC_FLAG_DIR_MATCHED 16
struct exclude_list {
int nr;
int alloc;
+ int samedir;
struct exclude {
const char *pattern;
int patternlen;
--
1.7.10.2.549.g9354186
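Since this patch carries no commit message, here is a minimal standalone sketch of the caching idea: remember, per pattern, whether its directory part matched the current directory, and invalidate that memo whenever the directory changes. All names (struct dir_pattern, dir_matches(), enter_directory(), demo_dir_match_cache()) are hypothetical and the data layout is simplified; it is not the code from the patch.

```c
#include <string.h>

#define DIR_MATCH_VALID 1u	/* cached result below is usable */
#define DIR_MATCHED     2u	/* cached result: directory part matched */

struct dir_pattern {
	const char *dir;	/* literal directory prefix, e.g. "foo/bar/" */
	unsigned flags;
};

static int compare_count;	/* number of real string comparisons done */

/* Does the directory part of the pattern match dirname?  Memoized. */
static int dir_matches(struct dir_pattern *p, const char *dirname)
{
	if (!(p->flags & DIR_MATCH_VALID)) {
		compare_count++;
		if (!strncmp(p->dir, dirname, strlen(p->dir)))
			p->flags |= DIR_MATCHED;
		p->flags |= DIR_MATCH_VALID;
	}
	return !!(p->flags & DIR_MATCHED);
}

/* Must be called whenever the walk moves to a different directory. */
static void enter_directory(struct dir_pattern *pats, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		pats[i].flags &= ~(DIR_MATCH_VALID | DIR_MATCHED);
}

/* Two paths in each of two directories; returns the comparison count. */
static int demo_dir_match_cache(void)
{
	struct dir_pattern pat = { "foo/bar/", 0 };
	int m1, m2, m3, m4;

	compare_count = 0;
	enter_directory(&pat, 1);
	m1 = dir_matches(&pat, "foo/bar/");	/* real comparison */
	m2 = dir_matches(&pat, "foo/bar/");	/* answered from the cache */
	enter_directory(&pat, 1);
	m3 = dir_matches(&pat, "other/");	/* real comparison */
	m4 = dir_matches(&pat, "other/");	/* answered from the cache */
	if (!m1 || !m2 || m3 || m4)
		return -1;
	return compare_count;
}
```

Note that the sketch clears both bits on a directory change, so a stale "matched" result cannot leak into the next directory.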
* Re: [PATCH WIP 0/3] top-level gitignore considered less harmful
2012-05-26 12:31 [PATCH WIP 0/3] top-level gitignore considered less harmful Nguyễn Thái Ngọc Duy
` (2 preceding siblings ...)
2012-05-26 12:31 ` [PATCH 3/3] exclude: reduce computation cost on checking dirname in patterns Nguyễn Thái Ngọc Duy
@ 2012-05-26 13:25 ` Nguyen Thai Ngoc Duy
2012-05-26 21:45 ` Jeff King
3 siblings, 1 reply; 14+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-05-26 13:25 UTC (permalink / raw)
To: git; +Cc: Jeff King, Nguyễn Thái Ngọc Duy
On Sat, May 26, 2012 at 7:31 PM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> The result is not so impressive (i'm on -O0 though). Old webkit.git,
> before:
(it's "git status" by the way)
>
> real 0m6.418s
> user 0m5.561s
> sys 0m0.827s
>
> after:
>
> real 0m5.262s
> user 0m4.407s
> sys 0m0.850s
and with your patch to redistribute .gitignore in webkit, so top-level
is small again:
real 0m4.185s
user 0m3.271s
sys 0m0.905s
so my numbers look "ok".
--
Duy
* Re: [PATCH WIP 0/3] top-level gitignore considered less harmful
2012-05-26 13:25 ` [PATCH WIP 0/3] top-level gitignore considered less harmful Nguyen Thai Ngoc Duy
@ 2012-05-26 21:45 ` Jeff King
2012-05-27 3:45 ` Nguyen Thai Ngoc Duy
0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2012-05-26 21:45 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: git
On Sat, May 26, 2012 at 08:25:54PM +0700, Nguyen Thai Ngoc Duy wrote:
> On Sat, May 26, 2012 at 7:31 PM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> > The result is not so impressive (i'm on -O0 though). Old webkit.git,
> > before:
>
> (it's "git status" by the way)
>
> >
> > real 0m6.418s
> > user 0m5.561s
> > sys 0m0.827s
> >
> > after:
> >
> > real 0m5.262s
> > user 0m4.407s
> > sys 0m0.850s
>
> and with your patch to redistribute .gitignore in webkit, so top-level
> is small again:
>
> real 0m4.185s
> user 0m3.271s
> sys 0m0.905s
>
> so my numbers look "ok".
Is that last number just with the redistribution, or with the
redistribution _and_ your patch? I'd like to see both to see whether it
is the case that the two optimizations work together for combined
benefit, or if doing one makes the other one pointless.
-Peff
* Re: [PATCH WIP 0/3] top-level gitignore considered less harmful
2012-05-26 21:45 ` Jeff King
@ 2012-05-27 3:45 ` Nguyen Thai Ngoc Duy
0 siblings, 0 replies; 14+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-05-27 3:45 UTC (permalink / raw)
To: Jeff King; +Cc: git
On Sun, May 27, 2012 at 4:45 AM, Jeff King <peff@peff.net> wrote:
> On Sat, May 26, 2012 at 08:25:54PM +0700, Nguyen Thai Ngoc Duy wrote:
>
>> On Sat, May 26, 2012 at 7:31 PM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
>> > The result is not so impressive (i'm on -O0 though). Old webkit.git,
>> > before:
>>
>> (it's "git status" by the way)
>>
>> >
>> > real 0m6.418s
>> > user 0m5.561s
>> > sys 0m0.827s
>> >
>> > after:
>> >
>> > real 0m5.262s
>> > user 0m4.407s
>> > sys 0m0.850s
>>
>> and with your patch to redistribute .gitignore in webkit, so top-level
>> is small again:
>>
>> real 0m4.185s
>> user 0m3.271s
>> sys 0m0.905s
>>
>> so my numbers look "ok".
>
> Is that last number just with the redistribution, or with the
> redistribution _and_ your patch? I'd like to see both to see whether it
> is the case that the two optimizations work together for combined
> benefit, or if doing one makes the other one pointless.
Without my patch. Redistribution together with my patch gives:
real 0m4.284s
user 0m3.407s
sys 0m0.864s
--
Duy