git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] nd/attr-match-optim-more updates
@ 2012-10-14 11:55 Nguyễn Thái Ngọc Duy
  2012-10-14 11:55 ` [PATCH 1/4] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-14 11:55 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This is on top of nd/attr-match-optim-more to fix the bug I sent
recently [1] sharing the code, thus sharing any fixes.

[1] http://thread.gmane.org/gmane.comp.version-control.git/207652

Nguyễn Thái Ngọc Duy (4):
  exclude: stricten a length check in EXC_FLAG_ENDSWITH case
  exclude: fix a bug in prefix compare optimization
  exclude/attr: share basename matching code
  exclude/attr: share full pathname matching code

 attr.c                             |  50 +++------------
 dir.c                              | 121 +++++++++++++++++++++++--------------
 dir.h                              |   5 ++
 t/t3001-ls-files-others-exclude.sh |   6 ++
 4 files changed, 95 insertions(+), 87 deletions(-)

-- 
1.8.0.rc2.11.g2b79d01

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] exclude: stricten a length check in EXC_FLAG_ENDSWITH case
  2012-10-14 11:55 [PATCH 0/4] nd/attr-match-optim-more updates Nguyễn Thái Ngọc Duy
@ 2012-10-14 11:55 ` Nguyễn Thái Ngọc Duy
  2012-10-14 11:55 ` [PATCH 2/4] exclude: fix a bug in prefix compare optimization Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-14 11:55 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

This block of code deals with the "basename" part only, which has the
length of "pathlen - (basename - pathname)". Stricten the length check
and remove "pathname" from the main expression to avoid confusion.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index cddf043..d81498e 100644
--- a/dir.c
+++ b/dir.c
@@ -561,8 +561,9 @@ int excluded_from_list(const char *pathname,
 				if (!strcmp_icase(exclude, basename))
 					return to_exclude;
 			} else if (x->flags & EXC_FLAG_ENDSWITH) {
-				if (x->patternlen - 1 <= pathlen &&
-				    !strcmp_icase(exclude + 1, pathname + pathlen - x->patternlen + 1))
+				int len = pathlen - (basename - pathname);
+				if (x->patternlen - 1 <= len &&
+				    !strcmp_icase(exclude + 1, basename + len - x->patternlen + 1))
 					return to_exclude;
 			} else {
 				if (fnmatch_icase(exclude, basename, 0) == 0)
-- 
1.8.0.rc2.11.g2b79d01

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/4] exclude: fix a bug in prefix compare optimization
  2012-10-14 11:55 [PATCH 0/4] nd/attr-match-optim-more updates Nguyễn Thái Ngọc Duy
  2012-10-14 11:55 ` [PATCH 1/4] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
@ 2012-10-14 11:55 ` Nguyễn Thái Ngọc Duy
  2012-10-14 11:55 ` [PATCH 3/4] exclude/attr: share basename matching code Nguyễn Thái Ngọc Duy
  2012-10-14 11:55 ` [PATCH 4/4] exclude/attr: share full pathname " Nguyễn Thái Ngọc Duy
  3 siblings, 0 replies; 6+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-14 11:55 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

When "namelen" becomes zero at this stage, we have matched the fixed
part, but whether it actually matches the pattern still depends on the
pattern in "exclude". As demonstrated in t3001, path "three/a.3"
exists and it matches the "three/a.3" part in pattern "three/a.3[abc]",
but that does not mean a true match.

Don't be too optimistic and let fnmatch() do the job.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 dir.c                              | 2 +-
 t/t3001-ls-files-others-exclude.sh | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index d81498e..0f4aea6 100644
--- a/dir.c
+++ b/dir.c
@@ -601,7 +601,7 @@ int excluded_from_list(const char *pathname,
 			namelen -= prefix;
 		}
 
-		if (!namelen || !fnmatch_icase(exclude, name, FNM_PATHNAME))
+		if (!fnmatch_icase(exclude, name, FNM_PATHNAME))
 			return to_exclude;
 	}
 	return -1; /* undecided */
diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh
index c8fe978..dc2f045 100755
--- a/t/t3001-ls-files-others-exclude.sh
+++ b/t/t3001-ls-files-others-exclude.sh
@@ -214,4 +214,10 @@ test_expect_success 'subdirectory ignore (l1)' '
 	test_cmp expect actual
 '
 
+test_expect_success 'pattern matches prefix completely' '
+	: >expect &&
+	git ls-files -i -o --exclude "/three/a.3[abc]" >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
1.8.0.rc2.11.g2b79d01

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/4] exclude/attr: share basename matching code
  2012-10-14 11:55 [PATCH 0/4] nd/attr-match-optim-more updates Nguyễn Thái Ngọc Duy
  2012-10-14 11:55 ` [PATCH 1/4] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
  2012-10-14 11:55 ` [PATCH 2/4] exclude: fix a bug in prefix compare optimization Nguyễn Thái Ngọc Duy
@ 2012-10-14 11:55 ` Nguyễn Thái Ngọc Duy
  2012-10-14 18:09   ` Junio C Hamano
  2012-10-14 11:55 ` [PATCH 4/4] exclude/attr: share full pathname " Nguyễn Thái Ngọc Duy
  3 siblings, 1 reply; 6+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-14 11:55 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

match_basename's declaration in dir.h does not have any description to
discourage the use of this function elsewhere as this function is
highly tied to how excluded_from_list and path_matches work.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 15 ++++-----------
 dir.c  | 37 ++++++++++++++++++++++++-------------
 dir.h  |  2 ++
 3 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/attr.c b/attr.c
index 0964033..a28ce0d 100644
--- a/attr.c
+++ b/attr.c
@@ -663,17 +663,10 @@ static int path_matches(const char *pathname, int pathlen,
 	int namelen;
 
 	if (pat->flags & EXC_FLAG_NODIR) {
-		if (prefix == pat->patternlen &&
-		    !strcmp_icase(pattern, basename))
-			return 1;
-
-		if (pat->flags & EXC_FLAG_ENDSWITH &&
-		    pat->patternlen - 1 <= pathlen &&
-		    !strcmp_icase(pattern + 1, pathname +
-				  pathlen - pat->patternlen + 1))
-			return 1;
-
-		return (fnmatch_icase(pattern, basename, 0) == 0);
+		return match_basename(basename,
+				      pathlen - (basename - pathname),
+				      pattern, prefix,
+				      pat->patternlen, pat->flags);
 	}
 	/*
 	 * match with FNM_PATHNAME; the pattern has base implicitly
diff --git a/dir.c b/dir.c
index 0f4aea6..42c42cd 100644
--- a/dir.c
+++ b/dir.c
@@ -530,6 +530,25 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 	dir->basebuf[baselen] = '\0';
 }
 
+int match_basename(const char *basename, int basenamelen,
+		   const char *pattern, int prefix, int patternlen,
+		   int flags)
+{
+	if (prefix == patternlen) {
+		if (!strcmp_icase(pattern, basename))
+			return 1;
+	} else if (flags & EXC_FLAG_ENDSWITH) {
+		if (patternlen - 1 <= basenamelen &&
+		    !strcmp_icase(pattern + 1,
+				  basename + basenamelen - patternlen + 1))
+			return 1;
+	} else {
+		if (fnmatch_icase(pattern, basename, 0) == 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* Scan the list and let the last match determine the fate.
  * Return 1 for exclude, 0 for include and -1 for undecided.
  */
@@ -556,19 +575,11 @@ int excluded_from_list(const char *pathname,
 		}
 
 		if (x->flags & EXC_FLAG_NODIR) {
-			/* match basename */
-			if (prefix == x->patternlen) {
-				if (!strcmp_icase(exclude, basename))
-					return to_exclude;
-			} else if (x->flags & EXC_FLAG_ENDSWITH) {
-				int len = pathlen - (basename - pathname);
-				if (x->patternlen - 1 <= len &&
-				    !strcmp_icase(exclude + 1, basename + len - x->patternlen + 1))
-					return to_exclude;
-			} else {
-				if (fnmatch_icase(exclude, basename, 0) == 0)
-					return to_exclude;
-			}
+			if (match_basename(basename,
+					   pathlen - (basename - pathname),
+					   exclude, prefix, x->patternlen,
+					   x->flags))
+				return to_exclude;
 			continue;
 		}
 
diff --git a/dir.h b/dir.h
index fd5c2aa..d416c5a 100644
--- a/dir.h
+++ b/dir.h
@@ -78,6 +78,8 @@ extern int read_directory(struct dir_struct *, const char *path, int len, const
 
 extern int excluded_from_list(const char *pathname, int pathlen, const char *basename,
 			      int *dtype, struct exclude_list *el);
+extern int match_basename(const char *, int,
+			  const char *, int, int, int);
 struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname, int len);
 
 /*
-- 
1.8.0.rc2.11.g2b79d01

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/4] exclude/attr: share full pathname matching code
  2012-10-14 11:55 [PATCH 0/4] nd/attr-match-optim-more updates Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2012-10-14 11:55 ` [PATCH 3/4] exclude/attr: share basename matching code Nguyễn Thái Ngọc Duy
@ 2012-10-14 11:55 ` Nguyễn Thái Ngọc Duy
  3 siblings, 0 replies; 6+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2012-10-14 11:55 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

match_pathname's declaration in dir.h does not have any description to
discourage the use of this function elsewhere as this function is
highly tied to how excluded_from_list and path_matches work.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 attr.c | 35 +++------------------------
 dir.c  | 85 +++++++++++++++++++++++++++++++++++++++++-------------------------
 dir.h  |  3 +++
 3 files changed, 59 insertions(+), 64 deletions(-)

diff --git a/attr.c b/attr.c
index a28ce0d..2fc6353 100644
--- a/attr.c
+++ b/attr.c
@@ -659,8 +659,6 @@ static int path_matches(const char *pathname, int pathlen,
 {
 	const char *pattern = pat->pattern;
 	int prefix = pat->nowildcardlen;
-	const char *name;
-	int namelen;
 
 	if (pat->flags & EXC_FLAG_NODIR) {
 		return match_basename(basename,
@@ -668,36 +666,9 @@ static int path_matches(const char *pathname, int pathlen,
 				      pattern, prefix,
 				      pat->patternlen, pat->flags);
 	}
-	/*
-	 * match with FNM_PATHNAME; the pattern has base implicitly
-	 * in front of it.
-	 */
-	if (*pattern == '/') {
-		pattern++;
-		prefix--;
-	}
-
-	/*
-	 * note: unlike excluded_from_list, baselen here does not
-	 * count the trailing slash, and base[] does not end with
-	 * a trailing slash, either.
-	 */
-	if (pathlen < baselen + 1 ||
-	    (baselen && pathname[baselen] != '/') ||
-	    strncmp_icase(pathname, base, baselen))
-		return 0;
-
-	namelen = baselen ? pathlen - baselen - 1 : pathlen;
-	name = pathname + pathlen - namelen;
-
-	/*
-	 * if the non-wildcard part is longer than the remaining
-	 * pathname, surely it cannot match.
-	 */
-	if (!namelen || prefix > namelen)
-		return 0;
-
-	return fnmatch_icase(pattern, name, FNM_PATHNAME) == 0;
+	return match_pathname(pathname, pathlen,
+			      base, baselen,
+			      pattern, prefix, pat->patternlen, pat->flags);
 }
 
 static int macroexpand_one(int attr_nr, int rem);
diff --git a/dir.c b/dir.c
index 42c42cd..ee8e711 100644
--- a/dir.c
+++ b/dir.c
@@ -549,6 +549,53 @@ int match_basename(const char *basename, int basenamelen,
 	return 0;
 }
 
+int match_pathname(const char *pathname, int pathlen,
+		   const char *base, int baselen,
+		   const char *pattern, int prefix, int patternlen,
+		   int flags)
+{
+	const char *name;
+	int namelen;
+
+	/*
+	 * match with FNM_PATHNAME; the pattern has base implicitly
+	 * in front of it.
+	 */
+	if (*pattern == '/') {
+		pattern++;
+		prefix--;
+	}
+
+	/*
+	 * baselen does not count the trailing slash. base[] may or
+	 * may not end with a trailing slash though.
+	 */
+	if (pathlen < baselen + 1 ||
+	    (baselen && pathname[baselen] != '/') ||
+	    strncmp_icase(pathname, base, baselen))
+		return 0;
+
+	namelen = baselen ? pathlen - baselen - 1 : pathlen;
+	name = pathname + pathlen - namelen;
+
+	if (prefix) {
+		/*
+		 * if the non-wildcard part is longer than the
+		 * remaining pathname, surely it cannot match.
+		 */
+		if (prefix > namelen)
+			return 0;
+
+		if (strncmp_icase(pattern, name, prefix))
+			return 0;
+		pattern += prefix;
+		name    += prefix;
+		namelen -= prefix;
+	}
+
+	return fnmatch_icase(pattern, name, FNM_PATHNAME) == 0;
+}
+
 /* Scan the list and let the last match determine the fate.
  * Return 1 for exclude, 0 for include and -1 for undecided.
  */
@@ -563,9 +610,9 @@ int excluded_from_list(const char *pathname,
 
 	for (i = el->nr - 1; 0 <= i; i--) {
 		struct exclude *x = el->excludes[i];
-		const char *name, *exclude = x->pattern;
+		const char *exclude = x->pattern;
 		int to_exclude = x->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
-		int namelen, prefix = x->nowildcardlen;
+		int prefix = x->nowildcardlen;
 
 		if (x->flags & EXC_FLAG_MUSTBEDIR) {
 			if (*dtype == DT_UNKNOWN)
@@ -583,36 +630,10 @@ int excluded_from_list(const char *pathname,
 			continue;
 		}
 
-		/* match with FNM_PATHNAME:
-		 * exclude has base (baselen long) implicitly in front of it.
-		 */
-		if (*exclude == '/') {
-			exclude++;
-			prefix--;
-		}
-
-		if (pathlen < x->baselen ||
-		    (x->baselen && pathname[x->baselen-1] != '/') ||
-		    strncmp_icase(pathname, x->base, x->baselen))
-			continue;
-
-		namelen = x->baselen ? pathlen - x->baselen : pathlen;
-		name = pathname + pathlen  - namelen;
-
-		/* if the non-wildcard part is longer than the
-		   remaining pathname, surely it cannot match */
-		if (prefix > namelen)
-			continue;
-
-		if (prefix) {
-			if (strncmp_icase(exclude, name, prefix))
-				continue;
-			exclude += prefix;
-			name    += prefix;
-			namelen -= prefix;
-		}
-
-		if (!fnmatch_icase(exclude, name, FNM_PATHNAME))
+		assert(x->baselen == 0 || x->base[x->baselen - 1] == '/');
+		if (match_pathname(pathname, pathlen,
+				   x->base, x->baselen ? x->baselen - 1 : 0,
+				   exclude, prefix, x->patternlen, x->flags))
 			return to_exclude;
 	}
 	return -1; /* undecided */
diff --git a/dir.h b/dir.h
index d416c5a..11d054a 100644
--- a/dir.h
+++ b/dir.h
@@ -80,6 +80,9 @@ extern int excluded_from_list(const char *pathname, int pathlen, const char *bas
 			      int *dtype, struct exclude_list *el);
 extern int match_basename(const char *, int,
 			  const char *, int, int, int);
+extern int match_pathname(const char *, int,
+			  const char *, int,
+			  const char *, int, int, int);
 struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname, int len);
 
 /*
-- 
1.8.0.rc2.11.g2b79d01

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 3/4] exclude/attr: share basename matching code
  2012-10-14 11:55 ` [PATCH 3/4] exclude/attr: share basename matching code Nguyễn Thái Ngọc Duy
@ 2012-10-14 18:09   ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2012-10-14 18:09 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> match_basename's declaration in dir.h does not have any description to
> discourage the use of this function elsewhere as this function is
> highly tied to how excluded_from_list and path_matches work.

If you do want to discourage, please explicitly describe it as such.

I actually think it should have an API description.  The meaning of
its parameters and how you would formulate them is fairly clear and
this is a good example of a simple-and-stupid function that is
designed to do one thing and do it well.

>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  attr.c | 15 ++++-----------
>  dir.c  | 37 ++++++++++++++++++++++++-------------
>  dir.h  |  2 ++
>  3 files changed, 30 insertions(+), 24 deletions(-)
>
> diff --git a/attr.c b/attr.c
> index 0964033..a28ce0d 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -663,17 +663,10 @@ static int path_matches(const char *pathname, int pathlen,
>  	int namelen;
>  
>  	if (pat->flags & EXC_FLAG_NODIR) {
> -		if (prefix == pat->patternlen &&
> -		    !strcmp_icase(pattern, basename))
> -			return 1;
> -
> -		if (pat->flags & EXC_FLAG_ENDSWITH &&
> -		    pat->patternlen - 1 <= pathlen &&
> -		    !strcmp_icase(pattern + 1, pathname +
> -				  pathlen - pat->patternlen + 1))
> -			return 1;
> -
> -		return (fnmatch_icase(pattern, basename, 0) == 0);
> +		return match_basename(basename,
> +				      pathlen - (basename - pathname),
> +				      pattern, prefix,
> +				      pat->patternlen, pat->flags);
>  	}
>  	/*
>  	 * match with FNM_PATHNAME; the pattern has base implicitly
> diff --git a/dir.c b/dir.c
> index 0f4aea6..42c42cd 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -530,6 +530,25 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
>  	dir->basebuf[baselen] = '\0';
>  }
>  
> +int match_basename(const char *basename, int basenamelen,
> +		   const char *pattern, int prefix, int patternlen,
> +		   int flags)
> +{
> +	if (prefix == patternlen) {
> +		if (!strcmp_icase(pattern, basename))
> +			return 1;
> +	} else if (flags & EXC_FLAG_ENDSWITH) {
> +		if (patternlen - 1 <= basenamelen &&
> +		    !strcmp_icase(pattern + 1,
> +				  basename + basenamelen - patternlen + 1))
> +			return 1;
> +	} else {
> +		if (fnmatch_icase(pattern, basename, 0) == 0)
> +			return 1;
> +	}
> +	return 0;
> +}
> +
>  /* Scan the list and let the last match determine the fate.
>   * Return 1 for exclude, 0 for include and -1 for undecided.
>   */
> @@ -556,19 +575,11 @@ int excluded_from_list(const char *pathname,
>  		}
>  
>  		if (x->flags & EXC_FLAG_NODIR) {
> -			/* match basename */
> -			if (prefix == x->patternlen) {
> -				if (!strcmp_icase(exclude, basename))
> -					return to_exclude;
> -			} else if (x->flags & EXC_FLAG_ENDSWITH) {
> -				int len = pathlen - (basename - pathname);
> -				if (x->patternlen - 1 <= len &&
> -				    !strcmp_icase(exclude + 1, basename + len - x->patternlen + 1))
> -					return to_exclude;
> -			} else {
> -				if (fnmatch_icase(exclude, basename, 0) == 0)
> -					return to_exclude;
> -			}
> +			if (match_basename(basename,
> +					   pathlen - (basename - pathname),
> +					   exclude, prefix, x->patternlen,
> +					   x->flags))
> +				return to_exclude;
>  			continue;
>  		}
>  
> diff --git a/dir.h b/dir.h
> index fd5c2aa..d416c5a 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -78,6 +78,8 @@ extern int read_directory(struct dir_struct *, const char *path, int len, const
>  
>  extern int excluded_from_list(const char *pathname, int pathlen, const char *basename,
>  			      int *dtype, struct exclude_list *el);
> +extern int match_basename(const char *, int,
> +			  const char *, int, int, int);
>  struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname, int len);
>  
>  /*

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-10-14 18:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-14 11:55 [PATCH 0/4] nd/attr-match-optim-more updates Nguyễn Thái Ngọc Duy
2012-10-14 11:55 ` [PATCH 1/4] exclude: stricten a length check in EXC_FLAG_ENDSWITH case Nguyễn Thái Ngọc Duy
2012-10-14 11:55 ` [PATCH 2/4] exclude: fix a bug in prefix compare optimization Nguyễn Thái Ngọc Duy
2012-10-14 11:55 ` [PATCH 3/4] exclude/attr: share basename matching code Nguyễn Thái Ngọc Duy
2012-10-14 18:09   ` Junio C Hamano
2012-10-14 11:55 ` [PATCH 4/4] exclude/attr: share full pathname " Nguyễn Thái Ngọc Duy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).