git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH/RFC 0/4] some attr optimizations
@ 2014-12-09 13:53 Nguyễn Thái Ngọc Duy
  2014-12-09 13:53 ` [PATCH 1/4] attr.c: rename global var attr_nr to git_attr_nr Nguyễn Thái Ngọc Duy
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-09 13:53 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

I wondered if I could mark some untracked files but 'precious' using
git attributes. Then I worried that the majority of people who don't
care about this precious thing will have to pay for git_check_attr()
just because some people want it.

Which led me to try to optimize the attr machinery so that, if there's
no sign of "precious" (or any attribute of interest) being defined, we
can keep the overhead down to minimum. The test suite passes, but
these optimizations could break down in subtle ways...

Nguyễn Thái Ngọc Duy (4):
  attr.c: rename global var attr_nr to git_attr_nr
  attr.c: split path processing code out of collect_all_attrs()
  attr: do not attempt to expand when we know it's not a macro
  attr: avoid heavy work when we know the specified attr is not defined

 attr.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 74 insertions(+), 18 deletions(-)

-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] attr.c: rename global var attr_nr to git_attr_nr
  2014-12-09 13:53 [PATCH/RFC 0/4] some attr optimizations Nguyễn Thái Ngọc Duy
@ 2014-12-09 13:53 ` Nguyễn Thái Ngọc Duy
  2014-12-09 23:54   ` Junio C Hamano
  2014-12-09 13:53 ` [PATCH 2/4] attr.c: split path processing code out of collect_all_attrs() Nguyễn Thái Ngọc Duy
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-09 13:53 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

This name "attr_nr" is used elsewhere as local variable, shadowing the
global one. Let's rename the global one into something else to avoid
accidents due to shadow in future.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/attr.c b/attr.c
index cd54697..583d36a 100644
--- a/attr.c
+++ b/attr.c
@@ -34,7 +34,7 @@ struct git_attr {
 	int attr_nr;
 	char name[FLEX_ARRAY];
 };
-static int attr_nr;
+static int git_attr_nr;
 
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
@@ -94,10 +94,10 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->name[len] = 0;
 	a->h = hval;
 	a->next = git_attr_hash[pos];
-	a->attr_nr = attr_nr++;
+	a->attr_nr = git_attr_nr++;
 	git_attr_hash[pos] = a;
 
-	REALLOC_ARRAY(check_all_attr, attr_nr);
+	REALLOC_ARRAY(check_all_attr, git_attr_nr);
 	check_all_attr[a->attr_nr].attr = a;
 	check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
 	return a;
@@ -730,10 +730,10 @@ static void collect_all_attrs(const char *path)
 	}
 
 	prepare_attr_stack(path, dirlen);
-	for (i = 0; i < attr_nr; i++)
+	for (i = 0; i < git_attr_nr; i++)
 		check_all_attr[i].value = ATTR__UNKNOWN;
 
-	rem = attr_nr;
+	rem = git_attr_nr;
 	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
@@ -762,7 +762,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
 
 	/* Count the number of attributes that are set. */
 	count = 0;
-	for (i = 0; i < attr_nr; i++) {
+	for (i = 0; i < git_attr_nr; i++) {
 		const char *value = check_all_attr[i].value;
 		if (value != ATTR__UNSET && value != ATTR__UNKNOWN)
 			++count;
@@ -770,7 +770,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
 	*num = count;
 	*check = xmalloc(sizeof(**check) * count);
 	j = 0;
-	for (i = 0; i < attr_nr; i++) {
+	for (i = 0; i < git_attr_nr; i++) {
 		const char *value = check_all_attr[i].value;
 		if (value != ATTR__UNSET && value != ATTR__UNKNOWN) {
 			(*check)[j].attr = check_all_attr[i].attr;
-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] attr.c: split path processing code out of collect_all_attrs()
  2014-12-09 13:53 [PATCH/RFC 0/4] some attr optimizations Nguyễn Thái Ngọc Duy
  2014-12-09 13:53 ` [PATCH 1/4] attr.c: rename global var attr_nr to git_attr_nr Nguyễn Thái Ngọc Duy
@ 2014-12-09 13:53 ` Nguyễn Thái Ngọc Duy
  2014-12-09 13:53 ` [PATCH/RFC 3/4] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-09 13:53 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/attr.c b/attr.c
index 583d36a..def09c7 100644
--- a/attr.c
+++ b/attr.c
@@ -705,14 +705,9 @@ static int macroexpand_one(int attr_nr, int rem)
 	return rem;
 }
 
-/*
- * Collect all attributes for path into the array pointed to by
- * check_all_attr.
- */
-static void collect_all_attrs(const char *path)
+static int split_path(const char *path, int *dirlen_p, int *basename_offset_p)
 {
-	struct attr_stack *stk;
-	int i, pathlen, rem, dirlen;
+	int pathlen, dirlen;
 	const char *cp, *last_slash = NULL;
 	int basename_offset;
 
@@ -728,7 +723,22 @@ static void collect_all_attrs(const char *path)
 		basename_offset = 0;
 		dirlen = 0;
 	}
+	*dirlen_p = dirlen;
+	*basename_offset_p = basename_offset;
+	return pathlen;
+}
+
+/*
+ * Collect all attributes for path into the array pointed to by
+ * check_all_attr.
+ */
+static void collect_all_attrs(const char *path)
+{
+	struct attr_stack *stk;
+	int i, pathlen, rem, dirlen;
+	int basename_offset;
 
+	pathlen = split_path(path, &dirlen, &basename_offset);
 	prepare_attr_stack(path, dirlen);
 	for (i = 0; i < git_attr_nr; i++)
 		check_all_attr[i].value = ATTR__UNKNOWN;
-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH/RFC 3/4] attr: do not attempt to expand when we know it's not a macro
  2014-12-09 13:53 [PATCH/RFC 0/4] some attr optimizations Nguyễn Thái Ngọc Duy
  2014-12-09 13:53 ` [PATCH 1/4] attr.c: rename global var attr_nr to git_attr_nr Nguyễn Thái Ngọc Duy
  2014-12-09 13:53 ` [PATCH 2/4] attr.c: split path processing code out of collect_all_attrs() Nguyễn Thái Ngọc Duy
@ 2014-12-09 13:53 ` Nguyễn Thái Ngọc Duy
  2014-12-09 23:27   ` Eric Sunshine
  2014-12-09 23:56   ` Junio C Hamano
  2014-12-09 13:53 ` [PATCH/RFC 4/4] attr: avoid heavy work when we know the specified attr is not defined Nguyễn Thái Ngọc Duy
  2014-12-27 23:39 ` [PATCH v2 0/3] some attr optimizations Nguyễn Thái Ngọc Duy
  4 siblings, 2 replies; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-09 13:53 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Keep track of all recognized macros in the new "maybe_macro" field.
This this field is true, it _may_ be a macro (depending on what's in
the current attr stack). But if the field is false, it's definitely
not a macro, no need to go through the whole attr stack in
macroexpand_one() to search for one.

Without this, "git grep abcdefghi" on git.git hits the inner loop in
macroexpand_one() about 2500 times. With this, it's about 60 times.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index def09c7..4ec6186 100644
--- a/attr.c
+++ b/attr.c
@@ -32,6 +32,7 @@ struct git_attr {
 	struct git_attr *next;
 	unsigned h;
 	int attr_nr;
+	int maybe_macro;
 	char name[FLEX_ARRAY];
 };
 static int git_attr_nr;
@@ -95,6 +96,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->h = hval;
 	a->next = git_attr_hash[pos];
 	a->attr_nr = git_attr_nr++;
+	a->maybe_macro = 0;
 	git_attr_hash[pos] = a;
 
 	REALLOC_ARRAY(check_all_attr, git_attr_nr);
@@ -244,9 +246,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		      sizeof(*res) +
 		      sizeof(struct attr_state) * num_attr +
 		      (is_macro ? 0 : namelen + 1));
-	if (is_macro)
+	if (is_macro) {
 		res->u.attr = git_attr_internal(name, namelen);
-	else {
+		res->u.attr->maybe_macro = 1;
+	} else {
 		char *p = (char *)&(res->state[num_attr]);
 		memcpy(p, name, namelen);
 		res->u.pat.pattern = p;
@@ -687,7 +690,8 @@ static int macroexpand_one(int attr_nr, int rem)
 	struct match_attr *a = NULL;
 	int i;
 
-	if (check_all_attr[attr_nr].value != ATTR__TRUE)
+	if (check_all_attr[attr_nr].value != ATTR__TRUE ||
+	    !check_all_attr[attr_nr].attr->maybe_macro)
 		return rem;
 
 	for (stk = attr_stack; !a && stk; stk = stk->prev)
-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH/RFC 4/4] attr: avoid heavy work when we know the specified attr is not defined
  2014-12-09 13:53 [PATCH/RFC 0/4] some attr optimizations Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2014-12-09 13:53 ` [PATCH/RFC 3/4] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
@ 2014-12-09 13:53 ` Nguyễn Thái Ngọc Duy
  2014-12-10  0:18   ` Junio C Hamano
  2014-12-27 23:39 ` [PATCH v2 0/3] some attr optimizations Nguyễn Thái Ngọc Duy
  4 siblings, 1 reply; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-09 13:53 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

If we have never seen attr 'X' in any .gitattributes file we have
examined so far, we can be sure that 'X' is not defined. So no need to
go over all the attr stack to look for attr 'X'. This is the purpose
behind this new field maybe_real.

This optimization breaks down if macros are involved because we can't
know for sure what macro would expand to 'X' at attr parsing time. But
if we go the permisstic way and assume all macros are expanded, we hit
the builtin "binary" macro. At least the "diff" attr defined in this
macro will disable this optimization for git-grep. So we wait until
any attr lines _may_ reference to a macro before we turn this off.

In git.git, this reduces the number of fill_one() call for "git grep
abcdefghi" from ~5300 to 3000. The optimization stops when it reads
t/.gitattributes, which uses 'binary' macro.

"git grep" is actually a good example to justify this patch. The
command checks "diff" attribute on every file. People usually don't
define this attribute. But they pay the attr lookup penalty anyway
without this patch, proportional to the number of attr lines they have
in repo.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 4ec6186..ba41761 100644
--- a/attr.c
+++ b/attr.c
@@ -33,9 +33,11 @@ struct git_attr {
 	unsigned h;
 	int attr_nr;
 	int maybe_macro;
+	int maybe_real;
 	char name[FLEX_ARRAY];
 };
 static int git_attr_nr;
+static int cannot_trust_maybe_real;
 
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
@@ -97,6 +99,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->next = git_attr_hash[pos];
 	a->attr_nr = git_attr_nr++;
 	a->maybe_macro = 0;
+	a->maybe_real = 0;
 	git_attr_hash[pos] = a;
 
 	REALLOC_ARRAY(check_all_attr, git_attr_nr);
@@ -269,6 +272,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	/* Second pass to fill the attr_states */
 	for (cp = states, i = 0; *cp; i++) {
 		cp = parse_attr(src, lineno, cp, &(res->state[i]));
+		if (!is_macro)
+			res->state[i].attr->maybe_real = 1;
+		if (res->state[i].attr->maybe_macro)
+			cannot_trust_maybe_real = 1;
 	}
 
 	return res;
@@ -752,11 +759,46 @@ static void collect_all_attrs(const char *path)
 		rem = fill(path, pathlen, basename_offset, stk, rem);
 }
 
+static void collect_selected_attrs(const char *path, int num,
+				   struct git_attr_check *check)
+{
+	struct attr_stack *stk;
+	int i, pathlen, rem, dirlen;
+	int basename_offset;
+
+	pathlen = split_path(path, &dirlen, &basename_offset);
+	prepare_attr_stack(path, dirlen);
+	if (cannot_trust_maybe_real) {
+		for (i = 0; i < git_attr_nr; i++)
+			check_all_attr[i].value = ATTR__UNKNOWN;
+	} else {
+		rem = num;
+		for (i = 0; i < num; i++) {
+			struct git_attr_check *c;
+			c = check_all_attr + check[i].attr->attr_nr;
+			if (check[i].attr->maybe_real)
+				c->value = ATTR__UNKNOWN;
+			else {
+				c->value = ATTR__UNSET;
+				rem--;
+			}
+		}
+		if (!rem)
+			return;
+	}
+	rem = git_attr_nr;
+	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
+		rem = fill(path, pathlen, basename_offset, stk, rem);
+}
+
 int git_check_attr(const char *path, int num, struct git_attr_check *check)
 {
 	int i;
 
-	collect_all_attrs(path);
+	if (cannot_trust_maybe_real)
+		collect_all_attrs(path);
+	else
+		collect_selected_attrs(path, num, check);
 
 	for (i = 0; i < num; i++) {
 		const char *value = check_all_attr[check[i].attr->attr_nr].value;
-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH/RFC 3/4] attr: do not attempt to expand when we know it's not a macro
  2014-12-09 13:53 ` [PATCH/RFC 3/4] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
@ 2014-12-09 23:27   ` Eric Sunshine
  2014-12-09 23:56   ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Sunshine @ 2014-12-09 23:27 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: Git List

On Tue, Dec 9, 2014 at 8:53 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> Keep track of all recognized macros in the new "maybe_macro" field.
> This this field is true, it _may_ be a macro (depending on what's in

s/This this/If this/

> the current attr stack). But if the field is false, it's definitely
> not a macro, no need to go through the whole attr stack in
> macroexpand_one() to search for one.
>
> Without this, "git grep abcdefghi" on git.git hits the inner loop in
> macroexpand_one() about 2500 times. With this, it's about 60 times.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] attr.c: rename global var attr_nr to git_attr_nr
  2014-12-09 13:53 ` [PATCH 1/4] attr.c: rename global var attr_nr to git_attr_nr Nguyễn Thái Ngọc Duy
@ 2014-12-09 23:54   ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2014-12-09 23:54 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> This name "attr_nr" is used elsewhere as local variable, shadowing the
> global one. Let's rename the global one into something else to avoid
> accidents due to shadow in future.

Even though I do not think I would reject a patch to add this entire
file if the variable were named git_attr_nr from day one (read: the
result of the patch is not wrong per-se), I am not sure if the
variable deserves "git_" prefix that makes it look as if it may need
to be protected from global namespace contamination, given that this
is a file-scope static.

It might make a lot more sense not to do this rename, but change the
first parameter of macroexpand_one() from attr_nr to something more
appropriate, like "struct git_attr *attr"?

>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  attr.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/attr.c b/attr.c
> index cd54697..583d36a 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -34,7 +34,7 @@ struct git_attr {
>  	int attr_nr;
>  	char name[FLEX_ARRAY];
>  };
> -static int attr_nr;
> +static int git_attr_nr;
>  
>  static struct git_attr_check *check_all_attr;
>  static struct git_attr *(git_attr_hash[HASHSIZE]);
> @@ -94,10 +94,10 @@ static struct git_attr *git_attr_internal(const char *name, int len)
>  	a->name[len] = 0;
>  	a->h = hval;
>  	a->next = git_attr_hash[pos];
> -	a->attr_nr = attr_nr++;
> +	a->attr_nr = git_attr_nr++;
>  	git_attr_hash[pos] = a;
>  
> -	REALLOC_ARRAY(check_all_attr, attr_nr);
> +	REALLOC_ARRAY(check_all_attr, git_attr_nr);
>  	check_all_attr[a->attr_nr].attr = a;
>  	check_all_attr[a->attr_nr].value = ATTR__UNKNOWN;
>  	return a;
> @@ -730,10 +730,10 @@ static void collect_all_attrs(const char *path)
>  	}
>  
>  	prepare_attr_stack(path, dirlen);
> -	for (i = 0; i < attr_nr; i++)
> +	for (i = 0; i < git_attr_nr; i++)
>  		check_all_attr[i].value = ATTR__UNKNOWN;
>  
> -	rem = attr_nr;
> +	rem = git_attr_nr;
>  	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
>  		rem = fill(path, pathlen, basename_offset, stk, rem);
>  }
> @@ -762,7 +762,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
>  
>  	/* Count the number of attributes that are set. */
>  	count = 0;
> -	for (i = 0; i < attr_nr; i++) {
> +	for (i = 0; i < git_attr_nr; i++) {
>  		const char *value = check_all_attr[i].value;
>  		if (value != ATTR__UNSET && value != ATTR__UNKNOWN)
>  			++count;
> @@ -770,7 +770,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
>  	*num = count;
>  	*check = xmalloc(sizeof(**check) * count);
>  	j = 0;
> -	for (i = 0; i < attr_nr; i++) {
> +	for (i = 0; i < git_attr_nr; i++) {
>  		const char *value = check_all_attr[i].value;
>  		if (value != ATTR__UNSET && value != ATTR__UNKNOWN) {
>  			(*check)[j].attr = check_all_attr[i].attr;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH/RFC 3/4] attr: do not attempt to expand when we know it's not a macro
  2014-12-09 13:53 ` [PATCH/RFC 3/4] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
  2014-12-09 23:27   ` Eric Sunshine
@ 2014-12-09 23:56   ` Junio C Hamano
  1 sibling, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2014-12-09 23:56 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> Keep track of all recognized macros in the new "maybe_macro" field.
> This this field is true, it _may_ be a macro (depending on what's in
> the current attr stack). But if the field is false, it's definitely
> not a macro, no need to go through the whole attr stack in
> macroexpand_one() to search for one.
>
> Without this, "git grep abcdefghi" on git.git hits the inner loop in
> macroexpand_one() about 2500 times. With this, it's about 60 times.

Nice ;-)

>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  attr.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/attr.c b/attr.c
> index def09c7..4ec6186 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -32,6 +32,7 @@ struct git_attr {
>  	struct git_attr *next;
>  	unsigned h;
>  	int attr_nr;
> +	int maybe_macro;
>  	char name[FLEX_ARRAY];
>  };
>  static int git_attr_nr;
> @@ -95,6 +96,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
>  	a->h = hval;
>  	a->next = git_attr_hash[pos];
>  	a->attr_nr = git_attr_nr++;
> +	a->maybe_macro = 0;
>  	git_attr_hash[pos] = a;
>  
>  	REALLOC_ARRAY(check_all_attr, git_attr_nr);
> @@ -244,9 +246,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
>  		      sizeof(*res) +
>  		      sizeof(struct attr_state) * num_attr +
>  		      (is_macro ? 0 : namelen + 1));
> -	if (is_macro)
> +	if (is_macro) {
>  		res->u.attr = git_attr_internal(name, namelen);
> -	else {
> +		res->u.attr->maybe_macro = 1;
> +	} else {
>  		char *p = (char *)&(res->state[num_attr]);
>  		memcpy(p, name, namelen);
>  		res->u.pat.pattern = p;
> @@ -687,7 +690,8 @@ static int macroexpand_one(int attr_nr, int rem)
>  	struct match_attr *a = NULL;
>  	int i;
>  
> -	if (check_all_attr[attr_nr].value != ATTR__TRUE)
> +	if (check_all_attr[attr_nr].value != ATTR__TRUE ||
> +	    !check_all_attr[attr_nr].attr->maybe_macro)
>  		return rem;
>  
>  	for (stk = attr_stack; !a && stk; stk = stk->prev)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH/RFC 4/4] attr: avoid heavy work when we know the specified attr is not defined
  2014-12-09 13:53 ` [PATCH/RFC 4/4] attr: avoid heavy work when we know the specified attr is not defined Nguyễn Thái Ngọc Duy
@ 2014-12-10  0:18   ` Junio C Hamano
  2014-12-15  0:50     ` Duy Nguyen
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2014-12-10  0:18 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> +static void collect_selected_attrs(const char *path, int num,
> +				   struct git_attr_check *check)
> +{
> +	struct attr_stack *stk;
> +	int i, pathlen, rem, dirlen;
> +	int basename_offset;
> +
> +	pathlen = split_path(path, &dirlen, &basename_offset);
> +	prepare_attr_stack(path, dirlen);
> +	if (cannot_trust_maybe_real) {
> +		for (i = 0; i < git_attr_nr; i++)
> +			check_all_attr[i].value = ATTR__UNKNOWN;

Judging from the fact that

 (1) the only caller calls this function in this fashion based on the
     setting of "cannot-trust" bit,

 (2) this and the other function the only caller calls share the
     same code in their beginning part, and

 (3) the body of the if() statement here duplicates the code from
     collect_all_attrs(),

I smell that a much better split is possible.

Why isn't this all inside a single function collect_all_attrs()?
That single function may no longer be collect_ALL_attrs, so renaming
it to collect_attrs() is fine, but then that function may have this
if () to initialize all of them to ATTR__UNKNOWN or do the else part
we see below, and when organized that way we do not need to have
duplicated code (or split_path() helper function), no?

> +	} else {
> +		rem = num;
> +		for (i = 0; i < num; i++) {
> +			struct git_attr_check *c;
> +			c = check_all_attr + check[i].attr->attr_nr;
> +			if (check[i].attr->maybe_real)
> +				c->value = ATTR__UNKNOWN;
> +			else {
> +				c->value = ATTR__UNSET;
> +				rem--;
> +			}
> +		}
> +		if (!rem)
> +			return;
> +	}
> +	rem = git_attr_nr;
> +	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
> +		rem = fill(path, pathlen, basename_offset, stk, rem);
> +}
> +
>  int git_check_attr(const char *path, int num, struct git_attr_check *check)
>  {
>  	int i;
>  
> -	collect_all_attrs(path);
> +	if (cannot_trust_maybe_real)
> +		collect_all_attrs(path);
> +	else
> +		collect_selected_attrs(path, num, check);
>  
>  	for (i = 0; i < num; i++) {
>  		const char *value = check_all_attr[check[i].attr->attr_nr].value;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH/RFC 4/4] attr: avoid heavy work when we know the specified attr is not defined
  2014-12-10  0:18   ` Junio C Hamano
@ 2014-12-15  0:50     ` Duy Nguyen
  2014-12-15 17:30       ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Duy Nguyen @ 2014-12-15  0:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Dec 09, 2014 at 04:18:57PM -0800, Junio C Hamano wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
> 
> > +static void collect_selected_attrs(const char *path, int num,
> > +				   struct git_attr_check *check)
> > +{
> > +	struct attr_stack *stk;
> > +	int i, pathlen, rem, dirlen;
> > +	int basename_offset;
> > +
> > +	pathlen = split_path(path, &dirlen, &basename_offset);
> > +	prepare_attr_stack(path, dirlen);
> > +	if (cannot_trust_maybe_real) {
> > +		for (i = 0; i < git_attr_nr; i++)
> > +			check_all_attr[i].value = ATTR__UNKNOWN;
> 
> Judging from the fact that
> 
>  (1) the only caller calls this function in this fashion based on the
>      setting of "cannot-trust" bit,
> 
>  (2) this and the other function the only caller calls share the
>      same code in their beginning part, and
> 
>  (3) the body of the if() statement here duplicates the code from
>      collect_all_attrs(),
> 
> I smell that a much better split is possible.
> 
> Why isn't this all inside a single function collect_all_attrs()?
> That single function may no longer be collect_ALL_attrs, so renaming
> it to collect_attrs() is fine, but then that function may have this
> if () to initialize all of them to ATTR__UNKNOWN or do the else part
> we see below, and when organized that way we do not need to have
> duplicated code (or split_path() helper function), no?

Something like this? Definitely looks better.

-- 8< --
diff --git a/attr.c b/attr.c
index b80e52b..0f828e3 100644
--- a/attr.c
+++ b/attr.c
@@ -33,9 +33,11 @@ struct git_attr {
 	unsigned h;
 	int attr_nr;
 	int maybe_macro;
+	int maybe_real;
 	char name[FLEX_ARRAY];
 };
 static int attr_nr;
+static int cannot_trust_maybe_real;
 
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
@@ -97,6 +99,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->next = git_attr_hash[pos];
 	a->attr_nr = attr_nr++;
 	a->maybe_macro = 0;
+	a->maybe_real = 0;
 	git_attr_hash[pos] = a;
 
 	REALLOC_ARRAY(check_all_attr, attr_nr);
@@ -269,6 +272,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	/* Second pass to fill the attr_states */
 	for (cp = states, i = 0; *cp; i++) {
 		cp = parse_attr(src, lineno, cp, &(res->state[i]));
+		if (!is_macro)
+			res->state[i].attr->maybe_real = 1;
+		if (res->state[i].attr->maybe_macro)
+			cannot_trust_maybe_real = 1;
 	}
 
 	return res;
@@ -713,7 +720,9 @@ static int macroexpand_one(int nr, int rem)
  * Collect all attributes for path into the array pointed to by
  * check_all_attr.
  */
-static void collect_all_attrs(const char *path)
+static void collect_some_attrs(const char *path, int num,
+			       struct git_attr_check *check)
+
 {
 	struct attr_stack *stk;
 	int i, pathlen, rem, dirlen;
@@ -736,6 +745,19 @@ static void collect_all_attrs(const char *path)
 	prepare_attr_stack(path, dirlen);
 	for (i = 0; i < attr_nr; i++)
 		check_all_attr[i].value = ATTR__UNKNOWN;
+	if (num && !cannot_trust_maybe_real) {
+		rem = 0;
+		for (i = 0; i < num; i++) {
+			if (!check[i].attr->maybe_real) {
+				struct git_attr_check *c;
+				c = check_all_attr + check[i].attr->attr_nr;
+				c->value = ATTR__UNSET;
+				rem++;
+			}
+		}
+		if (rem == num)
+			return;
+	}
 
 	rem = attr_nr;
 	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
@@ -746,7 +768,7 @@ int git_check_attr(const char *path, int num, struct git_attr_check *check)
 {
 	int i;
 
-	collect_all_attrs(path);
+	collect_some_attrs(path, num, check);
 
 	for (i = 0; i < num; i++) {
 		const char *value = check_all_attr[check[i].attr->attr_nr].value;
@@ -762,7 +784,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
 {
 	int i, count, j;
 
-	collect_all_attrs(path);
+	collect_some_attrs(path, 0, NULL);
 
 	/* Count the number of attributes that are set. */
 	count = 0;
-- 8< --

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH/RFC 4/4] attr: avoid heavy work when we know the specified attr is not defined
  2014-12-15  0:50     ` Duy Nguyen
@ 2014-12-15 17:30       ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2014-12-15 17:30 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git

Duy Nguyen <pclouds@gmail.com> writes:

> On Tue, Dec 09, 2014 at 04:18:57PM -0800, Junio C Hamano wrote:
> ...
>> I smell that a much better split is possible.
>> ...
>
> Something like this? Definitely looks better.

Yeah, I was lazy and did not try it myself to see what the end
result would look like when I commented, but doing it this way
avoids needless repetitions.

The comment block before the collect_*_attrs function need to be
adjusted to match the updated behaviour, though.

Thanks.

> -- 8< --
> diff --git a/attr.c b/attr.c
> index b80e52b..0f828e3 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -33,9 +33,11 @@ struct git_attr {
>  	unsigned h;
>  	int attr_nr;
>  	int maybe_macro;
> +	int maybe_real;
>  	char name[FLEX_ARRAY];
>  };
>  static int attr_nr;
> +static int cannot_trust_maybe_real;
>  
>  static struct git_attr_check *check_all_attr;
>  static struct git_attr *(git_attr_hash[HASHSIZE]);
> @@ -97,6 +99,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
>  	a->next = git_attr_hash[pos];
>  	a->attr_nr = attr_nr++;
>  	a->maybe_macro = 0;
> +	a->maybe_real = 0;
>  	git_attr_hash[pos] = a;
>  
>  	REALLOC_ARRAY(check_all_attr, attr_nr);
> @@ -269,6 +272,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
>  	/* Second pass to fill the attr_states */
>  	for (cp = states, i = 0; *cp; i++) {
>  		cp = parse_attr(src, lineno, cp, &(res->state[i]));
> +		if (!is_macro)
> +			res->state[i].attr->maybe_real = 1;
> +		if (res->state[i].attr->maybe_macro)
> +			cannot_trust_maybe_real = 1;
>  	}
>  
>  	return res;
> @@ -713,7 +720,9 @@ static int macroexpand_one(int nr, int rem)
>   * Collect all attributes for path into the array pointed to by
>   * check_all_attr.
>   */
> -static void collect_all_attrs(const char *path)
> +static void collect_some_attrs(const char *path, int num,
> +			       struct git_attr_check *check)
> +
>  {
>  	struct attr_stack *stk;
>  	int i, pathlen, rem, dirlen;
> @@ -736,6 +745,19 @@ static void collect_all_attrs(const char *path)
>  	prepare_attr_stack(path, dirlen);
>  	for (i = 0; i < attr_nr; i++)
>  		check_all_attr[i].value = ATTR__UNKNOWN;
> +	if (num && !cannot_trust_maybe_real) {
> +		rem = 0;
> +		for (i = 0; i < num; i++) {
> +			if (!check[i].attr->maybe_real) {
> +				struct git_attr_check *c;
> +				c = check_all_attr + check[i].attr->attr_nr;
> +				c->value = ATTR__UNSET;
> +				rem++;
> +			}
> +		}
> +		if (rem == num)
> +			return;
> +	}
>  
>  	rem = attr_nr;
>  	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
> @@ -746,7 +768,7 @@ int git_check_attr(const char *path, int num, struct git_attr_check *check)
>  {
>  	int i;
>  
> -	collect_all_attrs(path);
> +	collect_some_attrs(path, num, check);
>  
>  	for (i = 0; i < num; i++) {
>  		const char *value = check_all_attr[check[i].attr->attr_nr].value;
> @@ -762,7 +784,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
>  {
>  	int i, count, j;
>  
> -	collect_all_attrs(path);
> +	collect_some_attrs(path, 0, NULL);
>  
>  	/* Count the number of attributes that are set. */
>  	count = 0;
> -- 8< --

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 0/3] some attr optimizations
  2014-12-09 13:53 [PATCH/RFC 0/4] some attr optimizations Nguyễn Thái Ngọc Duy
                   ` (3 preceding siblings ...)
  2014-12-09 13:53 ` [PATCH/RFC 4/4] attr: avoid heavy work when we know the specified attr is not defined Nguyễn Thái Ngọc Duy
@ 2014-12-27 23:39 ` Nguyễn Thái Ngọc Duy
  2014-12-27 23:39   ` [PATCH v2 1/3] attr.c: rename arg name attr_nr to avoid shadowing the global one Nguyễn Thái Ngọc Duy
                     ` (2 more replies)
  4 siblings, 3 replies; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-27 23:39 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

v2 updates 01/03 (rename the local var to avoid shadowing instead) and
merges 02/04 and 04/04 to 03/03.

Nguyễn Thái Ngọc Duy (3):
  attr.c: rename arg name attr_nr to avoid shadowing the global one
  attr: do not attempt to expand when we know it's not a macro
  attr: avoid heavy work when we know the specified attr is not defined

 attr.c | 47 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 37 insertions(+), 10 deletions(-)

-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/3] attr.c: rename arg name attr_nr to avoid shadowing the global one
  2014-12-27 23:39 ` [PATCH v2 0/3] some attr optimizations Nguyễn Thái Ngọc Duy
@ 2014-12-27 23:39   ` Nguyễn Thái Ngọc Duy
  2014-12-27 23:39   ` [PATCH v2 2/3] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
  2014-12-27 23:39   ` [PATCH v2 3/3] attr: avoid heavy work when we know the specified attr is not defined Nguyễn Thái Ngọc Duy
  2 siblings, 0 replies; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-27 23:39 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index cd54697..a1758bf 100644
--- a/attr.c
+++ b/attr.c
@@ -681,13 +681,13 @@ static int fill(const char *path, int pathlen, int basename_offset,
 	return rem;
 }
 
-static int macroexpand_one(int attr_nr, int rem)
+static int macroexpand_one(int nr, int rem)
 {
 	struct attr_stack *stk;
 	struct match_attr *a = NULL;
 	int i;
 
-	if (check_all_attr[attr_nr].value != ATTR__TRUE)
+	if (check_all_attr[nr].value != ATTR__TRUE)
 		return rem;
 
 	for (stk = attr_stack; !a && stk; stk = stk->prev)
@@ -695,7 +695,7 @@ static int macroexpand_one(int attr_nr, int rem)
 			struct match_attr *ma = stk->attrs[i];
 			if (!ma->is_macro)
 				continue;
-			if (ma->u.attr->attr_nr == attr_nr)
+			if (ma->u.attr->attr_nr == nr)
 				a = ma;
 		}
 
-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/3] attr: do not attempt to expand when we know it's not a macro
  2014-12-27 23:39 ` [PATCH v2 0/3] some attr optimizations Nguyễn Thái Ngọc Duy
  2014-12-27 23:39   ` [PATCH v2 1/3] attr.c: rename arg name attr_nr to avoid shadowing the global one Nguyễn Thái Ngọc Duy
@ 2014-12-27 23:39   ` Nguyễn Thái Ngọc Duy
  2014-12-27 23:59     ` Eric Sunshine
  2014-12-27 23:39   ` [PATCH v2 3/3] attr: avoid heavy work when we know the specified attr is not defined Nguyễn Thái Ngọc Duy
  2 siblings, 1 reply; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-27 23:39 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Keep track of all recognized macros in the new "maybe_macro" field.
If this field is true, it _may_ be a macro (depending on what's in the
current attr stack). But if the field is false, it's definitely not a
macro, no need to go through the whole attr stack in macroexpand_one()
to search for one.

Without this, "git grep abcdefghi" on git.git hits the inner loop in
macroexpand_one() 2481 times. With this, it's 66 times.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/attr.c b/attr.c
index a1758bf..b80e52b 100644
--- a/attr.c
+++ b/attr.c
@@ -32,6 +32,7 @@ struct git_attr {
 	struct git_attr *next;
 	unsigned h;
 	int attr_nr;
+	int maybe_macro;
 	char name[FLEX_ARRAY];
 };
 static int attr_nr;
@@ -95,6 +96,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->h = hval;
 	a->next = git_attr_hash[pos];
 	a->attr_nr = attr_nr++;
+	a->maybe_macro = 0;
 	git_attr_hash[pos] = a;
 
 	REALLOC_ARRAY(check_all_attr, attr_nr);
@@ -244,9 +246,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 		      sizeof(*res) +
 		      sizeof(struct attr_state) * num_attr +
 		      (is_macro ? 0 : namelen + 1));
-	if (is_macro)
+	if (is_macro) {
 		res->u.attr = git_attr_internal(name, namelen);
-	else {
+		res->u.attr->maybe_macro = 1;
+	} else {
 		char *p = (char *)&(res->state[num_attr]);
 		memcpy(p, name, namelen);
 		res->u.pat.pattern = p;
@@ -687,7 +690,8 @@ static int macroexpand_one(int nr, int rem)
 	struct match_attr *a = NULL;
 	int i;
 
-	if (check_all_attr[nr].value != ATTR__TRUE)
+	if (check_all_attr[nr].value != ATTR__TRUE ||
+	    !check_all_attr[nr].attr->maybe_macro)
 		return rem;
 
 	for (stk = attr_stack; !a && stk; stk = stk->prev)
-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/3] attr: avoid heavy work when we know the specified attr is not defined
  2014-12-27 23:39 ` [PATCH v2 0/3] some attr optimizations Nguyễn Thái Ngọc Duy
  2014-12-27 23:39   ` [PATCH v2 1/3] attr.c: rename arg name attr_nr to avoid shadowing the global one Nguyễn Thái Ngọc Duy
  2014-12-27 23:39   ` [PATCH v2 2/3] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
@ 2014-12-27 23:39   ` Nguyễn Thái Ngọc Duy
  2 siblings, 0 replies; 16+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-12-27 23:39 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

If we have never seen attr 'X' in any .gitattributes file we have
examined so far, we can be sure that 'X' is not defined. So no need to
go over all the attr stack to look for attr 'X'. This is the purpose
behind this new field maybe_real.

This optimization breaks down if macros are involved because we can't
know for sure what macro would expand to 'X' at attr parsing time. But
if we go the pessimistic way and assume all macros are expanded, we hit
the builtin "binary" macro. At least the "diff" attr defined in this
macro will disable this optimization for git-grep. So we wait until
any attr lines _may_ reference to a macro before we turn this off.

In git.git, this reduces the number of fill_one() call for "git grep
abcdefghi" from ~5348 to 2955. The optimization stops when it reads
t/.gitattributes, which uses 'binary' macro. We could probably reduce
it further by limiting the 'binary' reference to t/ and subdirs only
in this case.

"git grep" is actually a good example to justify this patch. The
command checks "diff" attribute on every file. People usually don't
define this attribute. But they pay the attr lookup penalty anyway
without this patch, proportional to the number of attr lines they have
in repo.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 33 ++++++++++++++++++++++++++++-----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/attr.c b/attr.c
index b80e52b..1f9eebd 100644
--- a/attr.c
+++ b/attr.c
@@ -33,9 +33,11 @@ struct git_attr {
 	unsigned h;
 	int attr_nr;
 	int maybe_macro;
+	int maybe_real;
 	char name[FLEX_ARRAY];
 };
 static int attr_nr;
+static int cannot_trust_maybe_real;
 
 static struct git_attr_check *check_all_attr;
 static struct git_attr *(git_attr_hash[HASHSIZE]);
@@ -97,6 +99,7 @@ static struct git_attr *git_attr_internal(const char *name, int len)
 	a->next = git_attr_hash[pos];
 	a->attr_nr = attr_nr++;
 	a->maybe_macro = 0;
+	a->maybe_real = 0;
 	git_attr_hash[pos] = a;
 
 	REALLOC_ARRAY(check_all_attr, attr_nr);
@@ -269,6 +272,10 @@ static struct match_attr *parse_attr_line(const char *line, const char *src,
 	/* Second pass to fill the attr_states */
 	for (cp = states, i = 0; *cp; i++) {
 		cp = parse_attr(src, lineno, cp, &(res->state[i]));
+		if (!is_macro)
+			res->state[i].attr->maybe_real = 1;
+		if (res->state[i].attr->maybe_macro)
+			cannot_trust_maybe_real = 1;
 	}
 
 	return res;
@@ -710,10 +717,13 @@ static int macroexpand_one(int nr, int rem)
 }
 
 /*
- * Collect all attributes for path into the array pointed to by
- * check_all_attr.
+ * Collect attributes for path into the array pointed to by
+ * check_all_attr. If num is non-zero, only attributes in check[] are
+ * collected. Otherwise all attributes are collected.
  */
-static void collect_all_attrs(const char *path)
+static void collect_some_attrs(const char *path, int num,
+			       struct git_attr_check *check)
+
 {
 	struct attr_stack *stk;
 	int i, pathlen, rem, dirlen;
@@ -736,6 +746,19 @@ static void collect_all_attrs(const char *path)
 	prepare_attr_stack(path, dirlen);
 	for (i = 0; i < attr_nr; i++)
 		check_all_attr[i].value = ATTR__UNKNOWN;
+	if (num && !cannot_trust_maybe_real) {
+		rem = 0;
+		for (i = 0; i < num; i++) {
+			if (!check[i].attr->maybe_real) {
+				struct git_attr_check *c;
+				c = check_all_attr + check[i].attr->attr_nr;
+				c->value = ATTR__UNSET;
+				rem++;
+			}
+		}
+		if (rem == num)
+			return;
+	}
 
 	rem = attr_nr;
 	for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
@@ -746,7 +769,7 @@ int git_check_attr(const char *path, int num, struct git_attr_check *check)
 {
 	int i;
 
-	collect_all_attrs(path);
+	collect_some_attrs(path, num, check);
 
 	for (i = 0; i < num; i++) {
 		const char *value = check_all_attr[check[i].attr->attr_nr].value;
@@ -762,7 +785,7 @@ int git_all_attrs(const char *path, int *num, struct git_attr_check **check)
 {
 	int i, count, j;
 
-	collect_all_attrs(path);
+	collect_some_attrs(path, 0, NULL);
 
 	/* Count the number of attributes that are set. */
 	count = 0;
-- 
2.2.0.84.ge9c7a8a

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] attr: do not attempt to expand when we know it's not a macro
  2014-12-27 23:39   ` [PATCH v2 2/3] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
@ 2014-12-27 23:59     ` Eric Sunshine
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Sunshine @ 2014-12-27 23:59 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: Git List

On Sat, Dec 27, 2014 at 6:39 PM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> Keep track of all recognized macros in the new "maybe_macro" field.
> If this field is true, it _may_ be a macro (depending on what's in the
> current attr stack). But if the field is false, it's definitely not a
> macro, no need to go through the whole attr stack in macroexpand_one()
> to search for one.
>
> Without this, "git grep abcdefghi" on git.git hits the inner loop in
> macroexpand_one() 2481 times. With this, it's 66 times.
>
> Helped-by: Eric Sunshine <sunshine@sunshineco.com>

My only contribution was pointing out a typographical error in the
commit message[1]; hardly worthy of a Helped-by:.

> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

[1]: http://article.gmane.org/gmane.comp.version-control.git/261177

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-12-27 23:59 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-09 13:53 [PATCH/RFC 0/4] some attr optimizations Nguyễn Thái Ngọc Duy
2014-12-09 13:53 ` [PATCH 1/4] attr.c: rename global var attr_nr to git_attr_nr Nguyễn Thái Ngọc Duy
2014-12-09 23:54   ` Junio C Hamano
2014-12-09 13:53 ` [PATCH 2/4] attr.c: split path processing code out of collect_all_attrs() Nguyễn Thái Ngọc Duy
2014-12-09 13:53 ` [PATCH/RFC 3/4] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
2014-12-09 23:27   ` Eric Sunshine
2014-12-09 23:56   ` Junio C Hamano
2014-12-09 13:53 ` [PATCH/RFC 4/4] attr: avoid heavy work when we know the specified attr is not defined Nguyễn Thái Ngọc Duy
2014-12-10  0:18   ` Junio C Hamano
2014-12-15  0:50     ` Duy Nguyen
2014-12-15 17:30       ` Junio C Hamano
2014-12-27 23:39 ` [PATCH v2 0/3] some attr optimizations Nguyễn Thái Ngọc Duy
2014-12-27 23:39   ` [PATCH v2 1/3] attr.c: rename arg name attr_nr to avoid shadowing the global one Nguyễn Thái Ngọc Duy
2014-12-27 23:39   ` [PATCH v2 2/3] attr: do not attempt to expand when we know it's not a macro Nguyễn Thái Ngọc Duy
2014-12-27 23:59     ` Eric Sunshine
2014-12-27 23:39   ` [PATCH v2 3/3] attr: avoid heavy work when we know the specified attr is not defined Nguyễn Thái Ngọc Duy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).