git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] add core.pathspecGlob config option
@ 2012-12-19 20:34 Jeff King
  2012-12-19 20:54 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Jeff King @ 2012-12-19 20:34 UTC (permalink / raw)
  To: git

Git takes pathspec arguments in many places to limit the
scope of an operation. These pathspecs are treated not as
literal paths, but as glob patterns that can be fed to
fnmatch. When a user is giving a specific pattern, this is a
nice feature.

However, when programatically providing pathspecs, it can be
a nuisance. For example, to find the latest revision which
modified "$foo", one can use "git rev-list -- $foo". But if
"$foo" contains glob characters (e.g., "f*"), it will
erroneously match more entries than desired. The caller
needs to quote the characters in $foo, and even then, the
results may not be exactly the same as with a literal
pathspec. For instance, the depth checks in
match_pathspec_depth do not kick in if we match via fnmatch.

This patch introduces a config option to turn all pathspecs
into literal strings. This makes it easy to turn off the
globbing behavior for a whole environment (e.g., if you are
serving repos via a web interface that is only going to
use literal programmatic pathspecs), or for a particular run
(by using "git -c" to set the option for this run).

It cannot turn off globbing for particular pathspecs. That
could eventually be done with a ":(noglob)" magic pathspec
prefix. However, that level of granularity is not often
needed, and doing ":(noglob)" right would mean converting
the whole codebase to use "struct pathspec", as the usual
"const char **pathspec" cannot represent extra per-item
flags.

Signed-off-by: Jeff King <peff@peff.net>
---
Part of me thinks this is just gross, because ":(noglob)" is the right
solution. But after spending a few hours trying it this morning, there
is a ton of refactoring required to make it work correctly everywhere
(although we could die() if we see it in places that are not using
init_pathspec, so at least we could say "not supported yet" and not just
silently ignore it).

Still, this is so easy to do, and the lack of granularity does not hurt
my use cases at all (which, if you have not guessed, are improving
corner cases in scripted calls on GitHub). And I do not think it is just
me, or just GitHub. Any server-side porcelain would be better off with
such an option (for example, I think gitweb has the same issues; it is
just that they are pretty rare, because most people do not put glob
metacharacters into their filenames).

So I think this is a nice, simple approach for sites that want it, and
noglob magic can come later (and will not be any harder to implement as
a result of this patch).

 cache.h                    |  1 +
 config.c                   |  5 +++++
 dir.c                      | 12 +++++-----
 environment.c              |  1 +
 t/t6130-pathspec-noglob.sh | 56 ++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 70 insertions(+), 5 deletions(-)
 create mode 100755 t/t6130-pathspec-noglob.sh

diff --git a/cache.h b/cache.h
index 18fdd18..0725a33 100644
--- a/cache.h
+++ b/cache.h
@@ -555,6 +555,7 @@ extern int precomposed_unicode;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
 extern int precomposed_unicode;
+extern int core_pathspec_glob;
 
 enum branch_track {
 	BRANCH_TRACK_UNSPECIFIED = -1,
diff --git a/config.c b/config.c
index fb3f868..8a96ba7 100644
--- a/config.c
+++ b/config.c
@@ -760,6 +760,11 @@ static int git_default_core_config(const char *var, const char *value)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.pathspecglob")) {
+		core_pathspec_glob = git_config_bool(var, value);
+		return 0;
+	}
+
 	/* Add other config variables here and to Documentation/config.txt. */
 	return 0;
 }
diff --git a/dir.c b/dir.c
index 5a83aa7..6e81d4f 100644
--- a/dir.c
+++ b/dir.c
@@ -126,7 +126,7 @@ static int match_one(const char *match, const char *name, int namelen)
 		for (;;) {
 			unsigned char c1 = tolower(*match);
 			unsigned char c2 = tolower(*name);
-			if (c1 == '\0' || is_glob_special(c1))
+			if (c1 == '\0' || (core_pathspec_glob && is_glob_special(c1)))
 				break;
 			if (c1 != c2)
 				return 0;
@@ -138,7 +138,7 @@ static int match_one(const char *match, const char *name, int namelen)
 		for (;;) {
 			unsigned char c1 = *match;
 			unsigned char c2 = *name;
-			if (c1 == '\0' || is_glob_special(c1))
+			if (c1 == '\0' || (core_pathspec_glob && is_glob_special(c1)))
 				break;
 			if (c1 != c2)
 				return 0;
@@ -148,14 +148,16 @@ static int match_one(const char *match, const char *name, int namelen)
 		}
 	}
 
-
 	/*
 	 * If we don't match the matchstring exactly,
 	 * we need to match by fnmatch
 	 */
 	matchlen = strlen(match);
-	if (strncmp_icase(match, name, matchlen))
+	if (strncmp_icase(match, name, matchlen)) {
+		if (!core_pathspec_glob)
+			return 0;
 		return !fnmatch_icase(match, name, 0) ? MATCHED_FNMATCH : 0;
+	}
 
 	if (namelen == matchlen)
 		return MATCHED_EXACTLY;
@@ -1429,7 +1431,7 @@ int init_pathspec(struct pathspec *pathspec, const char **paths)
 
 		item->match = path;
 		item->len = strlen(path);
-		item->use_wildcard = !no_wildcard(path);
+		item->use_wildcard = core_pathspec_glob && !no_wildcard(path);
 		if (item->use_wildcard)
 			pathspec->has_wildcard = 1;
 	}
diff --git a/environment.c b/environment.c
index 85edd7f..d0d30a0 100644
--- a/environment.c
+++ b/environment.c
@@ -59,6 +59,7 @@ int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
 int core_apply_sparse_checkout;
 int merge_log_config = -1;
 int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
+int core_pathspec_glob = 1;
 struct startup_info *startup_info;
 unsigned long pack_size_limit_cfg;
 
diff --git a/t/t6130-pathspec-noglob.sh b/t/t6130-pathspec-noglob.sh
new file mode 100755
index 0000000..bb962ac
--- /dev/null
+++ b/t/t6130-pathspec-noglob.sh
@@ -0,0 +1,56 @@
+#!/bin/sh
+
+test_description='test globbing (and noglob) of pathspec limiting'
+. ./test-lib.sh
+
+test_expect_success 'create commits with glob characters' '
+	test_commit unrelated bar &&
+	test_commit vanilla foo &&
+	test_commit star "f*" &&
+	test_commit bracket "f[o][o]"
+'
+
+test_expect_success 'vanilla pathspec matches literally' '
+	echo vanilla >expect &&
+	git log --format=%s -- foo >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'star pathspec globs' '
+	cat >expect <<-\EOF &&
+	bracket
+	star
+	vanilla
+	EOF
+	git log --format=%s -- "f*" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'bracket pathspec globs and matches literal brackets' '
+	cat >expect <<-\EOF &&
+	bracket
+	vanilla
+	EOF
+	git log --format=%s -- "f[o][o]" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob config matches literally (vanilla)' '
+	echo vanilla >expect &&
+	git -c core.pathspecglob=false log --format=%s -- foo >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob config matches literally (star)' '
+	echo star >expect &&
+	git -c core.pathspecglob=false log --format=%s -- "f*" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob config matches literally (bracket)' '
+	echo bracket >expect &&
+	git -c core.pathspecglob=false log --format=%s -- "f[o][o]" >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
1.8.1.rc2.24.gf3bad77

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-19 20:34 [PATCH] add core.pathspecGlob config option Jeff King
@ 2012-12-19 20:54 ` Junio C Hamano
  2012-12-19 21:09   ` Jeff King
  2012-12-19 21:30 ` [PATCH] add core.pathspecGlob config option Junio C Hamano
  2012-12-20  1:28 ` Nguyen Thai Ngoc Duy
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2012-12-19 20:54 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> ... doing ":(noglob)" right would mean converting
> the whole codebase to use "struct pathspec", as the usual
> "const char **pathspec" cannot represent extra per-item
> flags.

As that is the longer-term direction we would want to go, I'd rather
not to take the approach in this patch for introducing user-facing
support of literal pathspecs.

Having said that, even when we have the ':(noglob)' magic pathspec
support, it would make sense to introduce a command line option to
make it easier for scripted Porcelains that call plumbing commands
to pass literal pathspecs (i.e. they know exactly what paths they
want to operate, not what patterns the paths they want to operate
would match).  I do not think configuration variable makes much
sense (unless you are thinking "git -c core.literalpathspec=true"
as that command line option).

Thanks

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-19 20:54 ` Junio C Hamano
@ 2012-12-19 21:09   ` Jeff King
  2012-12-19 21:50     ` [PATCH] add GIT_PATHSPEC_GLOB environment variable Jeff King
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff King @ 2012-12-19 21:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Dec 19, 2012 at 12:54:04PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > ... doing ":(noglob)" right would mean converting
> > the whole codebase to use "struct pathspec", as the usual
> > "const char **pathspec" cannot represent extra per-item
> > flags.
> 
> As that is the longer-term direction we would want to go, I'd rather
> not to take the approach in this patch for introducing user-facing
> support of literal pathspecs.
> 
> Having said that, even when we have the ':(noglob)' magic pathspec
> support, it would make sense to introduce a command line option to
> make it easier for scripted Porcelains that call plumbing commands
> to pass literal pathspecs (i.e. they know exactly what paths they
> want to operate, not what patterns the paths they want to operate
> would match).

Right, that is my use case. Even once ":(noglob)" exists, it will still
be much simpler for me to use a single option, since I would never have
mixed patterns and literal paths (and I suspect most use cases that
would care about the distinction would be in the same boat). And that is
what led me to submit this at all, as it is not just "well, it is a hack
until :(noglob) works", but "this is not as fine a granularity as
:(noglob), but it better matches what callers want".

> I do not think configuration variable makes much sense (unless you are
> thinking "git -c core.literalpathspec=true" as that command line
> option).

The configuration variable is much more convenient for me, because
otherwise I have to instrument every call to git with a "--no-glob"
option. Because I have a very restricted environment, and happen to know
that globs will _never_ be useful on my repos (or, say, on commands run
by a particular user who has this in their ~/.gitconfig), it makes sense
to just turn off the feature with one switch.

And my thinking was that for people who are not in such a restricted
environment, "git -c core.pathspecglob=false" could serve as that
command-line option, as you mentioned.

It's perhaps a better match to make it an environment variable. Then it
is tied to a particular flow of execution, rather than having it be a
property of a system, user, or repo (which is what config does). So for
my restricted environment, it would be sufficient for me to set the
environment variable for the user who runs the scripts. And people who
want a command-line option can set it via the shell, or we can provide
"git --no-pathspec-glob" to set it.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-19 20:34 [PATCH] add core.pathspecGlob config option Jeff King
  2012-12-19 20:54 ` Junio C Hamano
@ 2012-12-19 21:30 ` Junio C Hamano
  2012-12-19 21:45   ` Jeff King
  2012-12-20  1:28 ` Nguyen Thai Ngoc Duy
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2012-12-19 21:30 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> diff --git a/dir.c b/dir.c
> index 5a83aa7..6e81d4f 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -126,7 +126,7 @@ static int match_one(const char *match, const char *name, int namelen)
>  		for (;;) {
>  			unsigned char c1 = tolower(*match);
>  			unsigned char c2 = tolower(*name);
> -			if (c1 == '\0' || is_glob_special(c1))
> +			if (c1 == '\0' || (core_pathspec_glob && is_glob_special(c1)))
>  				break;
>  			if (c1 != c2)
>  				return 0;

I think you can also do the same to the common_prefix(); we check
for common leading directory prefix but punt upon seeing a directory
component that has a glob character, and under the "literal" mode,
it is not a special character.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-19 21:30 ` [PATCH] add core.pathspecGlob config option Junio C Hamano
@ 2012-12-19 21:45   ` Jeff King
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff King @ 2012-12-19 21:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Dec 19, 2012 at 01:30:57PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > diff --git a/dir.c b/dir.c
> > index 5a83aa7..6e81d4f 100644
> > --- a/dir.c
> > +++ b/dir.c
> > @@ -126,7 +126,7 @@ static int match_one(const char *match, const char *name, int namelen)
> >  		for (;;) {
> >  			unsigned char c1 = tolower(*match);
> >  			unsigned char c2 = tolower(*name);
> > -			if (c1 == '\0' || is_glob_special(c1))
> > +			if (c1 == '\0' || (core_pathspec_glob && is_glob_special(c1)))
> >  				break;
> >  			if (c1 != c2)
> >  				return 0;
> 
> I think you can also do the same to the common_prefix(); we check
> for common leading directory prefix but punt upon seeing a directory
> component that has a glob character, and under the "literal" mode,
> it is not a special character.

Yeah, I think you're right. Will add to my re-roll.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] add GIT_PATHSPEC_GLOB environment variable
  2012-12-19 21:09   ` Jeff King
@ 2012-12-19 21:50     ` Jeff King
  2012-12-19 22:00       ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff King @ 2012-12-19 21:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Dec 19, 2012 at 04:09:19PM -0500, Jeff King wrote:

> It's perhaps a better match to make it an environment variable. Then it
> is tied to a particular flow of execution, rather than having it be a
> property of a system, user, or repo (which is what config does). So for
> my restricted environment, it would be sufficient for me to set the
> environment variable for the user who runs the scripts. And people who
> want a command-line option can set it via the shell, or we can provide
> "git --no-pathspec-glob" to set it.

Here it is as an environment variable. I think this probably makes more
sense in the general case (it's a little more work for my use case, but
I think the intended use is much more obvious).

I included the common_prefix fix you mentioned (I do not think it
produced incorrect results as it was, but it did not take full advantage
of an optimization).

-- >8 --
Subject: add GIT_PATHSPEC_GLOB environment variable

Git takes pathspec arguments in many places to limit the
scope of an operation. These pathspecs are treated not as
literal paths, but as glob patterns that can be fed to
fnmatch. When a user is giving a specific pattern, this is a
nice feature.

However, when programatically providing pathspecs, it can be
a nuisance. For example, to find the latest revision which
modified "$foo", one can use "git rev-list -- $foo". But if
"$foo" contains glob characters (e.g., "f*"), it will
erroneously match more entries than desired. The caller
needs to quote the characters in $foo, and even then, the
results may not be exactly the same as with a literal
pathspec. For instance, the depth checks in
match_pathspec_depth do not kick in if we match via fnmatch.

This patch introduces an environment variable to turn all
pathspecs into literal strings. This makes it easy to turn
off the globbing behavior for a whole environment (e.g., if
you are serving repos via a web interface that is only going
to use literal programmatic pathspecs), or for a particular
run.

It cannot turn off globbing for particular pathspecs. That
could eventually be done with a ":(noglob)" magic pathspec
prefix. However, that level of granularity is not often
needed, and doing ":(noglob)" right would mean converting
the whole codebase to use "struct pathspec", as the usual
"const char **pathspec" cannot represent extra per-item
flags.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git.txt      | 15 +++++++++++
 cache.h                    |  3 +++
 dir.c                      | 24 +++++++++++++-----
 git.c                      |  8 ++++++
 t/t6130-pathspec-noglob.sh | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 106 insertions(+), 6 deletions(-)
 create mode 100755 t/t6130-pathspec-noglob.sh

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 1d797f2..7008b4d 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -422,6 +422,10 @@ help ...`.
 	Do not use replacement refs to replace git objects. See
 	linkgit:git-replace[1] for more information.
 
+--no-pathspec-glob::
+	Do not treat pathspecs as glob patterns. See the section on
+	the `GIT_PATHSPEC_GLOB` environment variable below.
+
 
 GIT COMMANDS
 ------------
@@ -790,6 +794,17 @@ for further details.
 	as a file path and will try to write the trace messages
 	into it.
 
+GIT_PATHSPEC_GLOB::
+	Setting this variable to `0` will turn off the globbing features
+	of git's pathspecs. For example, running `GIT_PATHSPEC_GLOB=0 git
+	log -- '*.c'` will search for commits literally touching the
+	path `*.c`, not any paths that the glob `*.c` matches. You might
+	want this if you are feeding literal paths to git (e.g., paths
+	previously given to you by `git ls-tree`, `--raw` diff output,
+	etc). Setting it to `1` enables pathspec globbing (which is also
+	the default).
+
+
 Discussion[[Discussion]]
 ------------------------
 
diff --git a/cache.h b/cache.h
index 18fdd18..98af77c 100644
--- a/cache.h
+++ b/cache.h
@@ -362,6 +362,7 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF"
 #define GIT_NOTES_REWRITE_REF_ENVIRONMENT "GIT_NOTES_REWRITE_REF"
 #define GIT_NOTES_REWRITE_MODE_ENVIRONMENT "GIT_NOTES_REWRITE_MODE"
+#define GIT_PATHSPEC_GLOB_ENVIRONMENT "GIT_PATHSPEC_GLOB"
 
 /*
  * Repository-local GIT_* environment variables
@@ -490,6 +491,8 @@ extern int ce_path_match(const struct cache_entry *ce, const struct pathspec *pa
 extern void free_pathspec(struct pathspec *);
 extern int ce_path_match(const struct cache_entry *ce, const struct pathspec *pathspec);
 
+extern int allow_pathspec_glob(void);
+
 #define HASH_WRITE_OBJECT 1
 #define HASH_FORMAT_CHECK 2
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
diff --git a/dir.c b/dir.c
index 5a83aa7..a4d4321 100644
--- a/dir.c
+++ b/dir.c
@@ -38,6 +38,7 @@ static size_t common_prefix_len(const char **pathspec)
 {
 	const char *n, *first;
 	size_t max = 0;
+	int glob = allow_pathspec_glob();
 
 	if (!pathspec)
 		return max;
@@ -47,7 +48,7 @@ static size_t common_prefix_len(const char **pathspec)
 		size_t i, len = 0;
 		for (i = 0; first == n || i < max; i++) {
 			char c = n[i];
-			if (!c || c != first[i] || is_glob_special(c))
+			if (!c || c != first[i] || (glob && is_glob_special(c)))
 				break;
 			if (c == '/')
 				len = i + 1;
@@ -117,6 +118,7 @@ static int match_one(const char *match, const char *name, int namelen)
 static int match_one(const char *match, const char *name, int namelen)
 {
 	int matchlen;
+	int glob = allow_pathspec_glob();
 
 	/* If the match was just the prefix, we matched */
 	if (!*match)
@@ -126,7 +128,7 @@ static int match_one(const char *match, const char *name, int namelen)
 		for (;;) {
 			unsigned char c1 = tolower(*match);
 			unsigned char c2 = tolower(*name);
-			if (c1 == '\0' || is_glob_special(c1))
+			if (c1 == '\0' || (glob && is_glob_special(c1)))
 				break;
 			if (c1 != c2)
 				return 0;
@@ -138,7 +140,7 @@ static int match_one(const char *match, const char *name, int namelen)
 		for (;;) {
 			unsigned char c1 = *match;
 			unsigned char c2 = *name;
-			if (c1 == '\0' || is_glob_special(c1))
+			if (c1 == '\0' || (glob && is_glob_special(c1)))
 				break;
 			if (c1 != c2)
 				return 0;
@@ -148,14 +150,16 @@ static int match_one(const char *match, const char *name, int namelen)
 		}
 	}
 
-
 	/*
 	 * If we don't match the matchstring exactly,
 	 * we need to match by fnmatch
 	 */
 	matchlen = strlen(match);
-	if (strncmp_icase(match, name, matchlen))
+	if (strncmp_icase(match, name, matchlen)) {
+		if (!glob)
+			return 0;
 		return !fnmatch_icase(match, name, 0) ? MATCHED_FNMATCH : 0;
+	}
 
 	if (namelen == matchlen)
 		return MATCHED_EXACTLY;
@@ -1429,7 +1433,7 @@ int init_pathspec(struct pathspec *pathspec, const char **paths)
 
 		item->match = path;
 		item->len = strlen(path);
-		item->use_wildcard = !no_wildcard(path);
+		item->use_wildcard = allow_pathspec_glob() && !no_wildcard(path);
 		if (item->use_wildcard)
 			pathspec->has_wildcard = 1;
 	}
@@ -1445,3 +1449,11 @@ void free_pathspec(struct pathspec *pathspec)
 	free(pathspec->items);
 	pathspec->items = NULL;
 }
+
+int allow_pathspec_glob(void)
+{
+	static int flag = -1;
+	if (flag < 0)
+		flag = git_env_bool(GIT_PATHSPEC_GLOB_ENVIRONMENT, 1);
+	return flag;
+}
diff --git a/git.c b/git.c
index d33f9b3..db0496f 100644
--- a/git.c
+++ b/git.c
@@ -135,6 +135,14 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 			git_config_push_parameter((*argv)[1]);
 			(*argv)++;
 			(*argc)--;
+		} else if (!strcmp(cmd, "--pathspec-glob")) {
+			setenv(GIT_PATHSPEC_GLOB_ENVIRONMENT, "1", 1);
+			if (envchanged)
+				*envchanged = 1;
+		} else if (!strcmp(cmd, "--no-pathspec-glob")) {
+			setenv(GIT_PATHSPEC_GLOB_ENVIRONMENT, "0", 1);
+			if (envchanged)
+				*envchanged = 1;
 		} else {
 			fprintf(stderr, "Unknown option: %s\n", cmd);
 			usage(git_usage_string);
diff --git a/t/t6130-pathspec-noglob.sh b/t/t6130-pathspec-noglob.sh
new file mode 100755
index 0000000..835e730
--- /dev/null
+++ b/t/t6130-pathspec-noglob.sh
@@ -0,0 +1,62 @@
+#!/bin/sh
+
+test_description='test globbing (and noglob) of pathspec limiting'
+. ./test-lib.sh
+
+test_expect_success 'create commits with glob characters' '
+	test_commit unrelated bar &&
+	test_commit vanilla foo &&
+	test_commit star "f*" &&
+	test_commit bracket "f[o][o]"
+'
+
+test_expect_success 'vanilla pathspec matches literally' '
+	echo vanilla >expect &&
+	git log --format=%s -- foo >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'star pathspec globs' '
+	cat >expect <<-\EOF &&
+	bracket
+	star
+	vanilla
+	EOF
+	git log --format=%s -- "f*" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'bracket pathspec globs and matches literal brackets' '
+	cat >expect <<-\EOF &&
+	bracket
+	vanilla
+	EOF
+	git log --format=%s -- "f[o][o]" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob option matches literally (vanilla)' '
+	echo vanilla >expect &&
+	git --no-pathspec-glob log --format=%s -- foo >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob option matches literally (star)' '
+	echo star >expect &&
+	git --no-pathspec-glob log --format=%s -- "f*" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob option matches literally (bracket)' '
+	echo bracket >expect &&
+	git --no-pathspec-glob log --format=%s -- "f[o][o]" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob environment variable works' '
+	echo star >expect &&
+	GIT_PATHSPEC_GLOB=0 git log --format=%s -- "f*" >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
1.8.1.rc2.24.gf3bad77

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] add GIT_PATHSPEC_GLOB environment variable
  2012-12-19 21:50     ` [PATCH] add GIT_PATHSPEC_GLOB environment variable Jeff King
@ 2012-12-19 22:00       ` Junio C Hamano
  2012-12-19 22:12         ` Jeff King
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2012-12-19 22:00 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> I included the common_prefix fix you mentioned (I do not think it
> produced incorrect results as it was, but it did not take full advantage
> of an optimization).

I do not think it would have affected the outcome; you would only
have worked with more cycles.

> Subject: add GIT_PATHSPEC_GLOB environment variable

Seems cleanly done from a quick look.

Given that the normal mode of operation is to use globbing, I
suspect that the names would have been more natural if the toggle
were GIT_PATHSPEC_LITERAL and the boolean function were
limit_pathspec_to_literal(), instead of "allow_pathspec_glob()",
sounding as if using glob is done only upon request.

But that is a minor issue.

> This patch introduces an environment variable to turn all
> pathspecs into literal strings. This makes it easy to turn
> off the globbing behavior for a whole environment (e.g., if
> you are serving repos via a web interface that is only going
> to use literal programmatic pathspecs), or for a particular
> run.

I am not sure if "web interface" is a particularly good example,
though.  Is it unusual to imagine a Web UI that takes pathspecs from
the user to limit its output (e.g. "diff" or "ls-tree") to those
paths that match them?  In such a case, the user would expect their
pathspecs to work the same way as the Git installed on their
desktop, I would think.

Will queue; thanks.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add GIT_PATHSPEC_GLOB environment variable
  2012-12-19 22:00       ` Junio C Hamano
@ 2012-12-19 22:12         ` Jeff King
  2012-12-19 22:16           ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff King @ 2012-12-19 22:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Dec 19, 2012 at 02:00:03PM -0800, Junio C Hamano wrote:

> > Subject: add GIT_PATHSPEC_GLOB environment variable
> 
> Seems cleanly done from a quick look.
> 
> Given that the normal mode of operation is to use globbing, I
> suspect that the names would have been more natural if the toggle
> were GIT_PATHSPEC_LITERAL and the boolean function were
> limit_pathspec_to_literal(), instead of "allow_pathspec_glob()",
> sounding as if using glob is done only upon request.
> 
> But that is a minor issue.

Yeah, I was trying to avoid the double-negation of "noglob=false" for
the default behavior. I guess calling it literal is another way of
accomplishing that, and it keeps the default at "false". I don't have a
strong preference.

> > This patch introduces an environment variable to turn all
> > pathspecs into literal strings. This makes it easy to turn
> > off the globbing behavior for a whole environment (e.g., if
> > you are serving repos via a web interface that is only going
> > to use literal programmatic pathspecs), or for a particular
> > run.
> 
> I am not sure if "web interface" is a particularly good example,
> though.  Is it unusual to imagine a Web UI that takes pathspecs from
> the user to limit its output (e.g. "diff" or "ls-tree") to those
> paths that match them?  In such a case, the user would expect their
> pathspecs to work the same way as the Git installed on their
> desktop, I would think.

Yes. If you want to provide for user-provided globbing pathspecs, then
you'd have to annotate each invocation with whether you want globbing or
not. What I was trying to illustrate was more how "gitweb" will let you
click on the "history" link for "foo" in a tree listing, resulting in a
page that is generated by calling "git rev-list foo". You would probably
not want pathspec globbing there.

> Will queue; thanks.

Do we want to change the variable name and invert the logic? Now is
probably the best time to do it.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add GIT_PATHSPEC_GLOB environment variable
  2012-12-19 22:12         ` Jeff King
@ 2012-12-19 22:16           ` Junio C Hamano
  2012-12-19 22:20             ` Jeff King
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2012-12-19 22:16 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

>> Will queue; thanks.
>
> Do we want to change the variable name and invert the logic?

That would be my preference.

I am deep into today's integration cycle, and this PATHSPEC_GLOB
version is sitting at the tip of 'pu', so today's pushout will
contain that version, though.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add GIT_PATHSPEC_GLOB environment variable
  2012-12-19 22:16           ` Junio C Hamano
@ 2012-12-19 22:20             ` Jeff King
  2012-12-19 22:37               ` Jeff King
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff King @ 2012-12-19 22:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Dec 19, 2012 at 02:16:52PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> Will queue; thanks.
> >
> > Do we want to change the variable name and invert the logic?
> 
> That would be my preference.
> 
> I am deep into today's integration cycle, and this PATHSPEC_GLOB
> version is sitting at the tip of 'pu', so today's pushout will
> contain that version, though.

That's fine. I'll send out a revised version, and you can pick it up
later.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add GIT_PATHSPEC_GLOB environment variable
  2012-12-19 22:20             ` Jeff King
@ 2012-12-19 22:37               ` Jeff King
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff King @ 2012-12-19 22:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Dec 19, 2012 at 05:20:35PM -0500, Jeff King wrote:

> > > Do we want to change the variable name and invert the logic?
> > 
> > That would be my preference.
> > [...]
> That's fine. I'll send out a revised version, and you can pick it up
> later.

Here it is.

-- >8 --
Subject: [PATCH] add global --literal-pathspecs option

Git takes pathspec arguments in many places to limit the
scope of an operation. These pathspecs are treated not as
literal paths, but as glob patterns that can be fed to
fnmatch. When a user is giving a specific pattern, this is a
nice feature.

However, when programatically providing pathspecs, it can be
a nuisance. For example, to find the latest revision which
modified "$foo", one can use "git rev-list -- $foo". But if
"$foo" contains glob characters (e.g., "f*"), it will
erroneously match more entries than desired. The caller
needs to quote the characters in $foo, and even then, the
results may not be exactly the same as with a literal
pathspec. For instance, the depth checks in
match_pathspec_depth do not kick in if we match via fnmatch.

This patch introduces a global command-line option (i.e.,
one for "git" itself, not for specific commands) to turn
this behavior off. It also has a matching environment
variable, which can make it easier if you are a script or
porcelain interface that is going to issue many such
commands.

This option cannot turn off globbing for particular
pathspecs. That could eventually be done with a ":(noglob)"
magic pathspec prefix. However, that level of granularity is
more cumbersome to use for many cases, and doing ":(noglob)"
right would mean converting the whole codebase to use
"struct pathspec", as the usual "const char **pathspec"
cannot represent extra per-item flags.

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git.txt      | 15 +++++++++++
 cache.h                    |  3 +++
 dir.c                      | 25 ++++++++++++++-----
 git.c                      |  8 ++++++
 t/t6130-pathspec-noglob.sh | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 107 insertions(+), 6 deletions(-)
 create mode 100755 t/t6130-pathspec-noglob.sh

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 1d797f2..8d869db 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -422,6 +422,11 @@ help ...`.
 	Do not use replacement refs to replace git objects. See
 	linkgit:git-replace[1] for more information.
 
+--literal-pathspecs::
+	Treat pathspecs literally, rather than as glob patterns. This is
+	equivalent to setting the `GIT_LITERAL_PATHSPECS` environment
+	variable to `1`.
+
 
 GIT COMMANDS
 ------------
@@ -790,6 +795,16 @@ for further details.
 	as a file path and will try to write the trace messages
 	into it.
 
+GIT_LITERAL_PATHSPECS::
+	Setting this variable to `1` will cause git to treat all
+	pathspecs literally, rather than as glob patterns. For example,
+	running `GIT_LITERAL_PATHSPECS=1 git log -- '*.c'` will search
+	for commits that touch the path `*.c`, not any paths that the
+	glob `*.c` matches. You might want this if you are feeding
+	literal paths to git (e.g., paths previously given to you by
+	`git ls-tree`, `--raw` diff output, etc).
+
+
 Discussion[[Discussion]]
 ------------------------
 
diff --git a/cache.h b/cache.h
index 18fdd18..95608d9 100644
--- a/cache.h
+++ b/cache.h
@@ -362,6 +362,7 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF"
 #define GIT_NOTES_REWRITE_REF_ENVIRONMENT "GIT_NOTES_REWRITE_REF"
 #define GIT_NOTES_REWRITE_MODE_ENVIRONMENT "GIT_NOTES_REWRITE_MODE"
+#define GIT_LITERAL_PATHSPECS_ENVIRONMENT "GIT_LITERAL_PATHSPECS"
 
 /*
  * Repository-local GIT_* environment variables
@@ -490,6 +491,8 @@ extern int ce_path_match(const struct cache_entry *ce, const struct pathspec *pa
 extern void free_pathspec(struct pathspec *);
 extern int ce_path_match(const struct cache_entry *ce, const struct pathspec *pathspec);
 
+extern int limit_pathspec_to_literal(void);
+
 #define HASH_WRITE_OBJECT 1
 #define HASH_FORMAT_CHECK 2
 extern int index_fd(unsigned char *sha1, int fd, struct stat *st, enum object_type type, const char *path, unsigned flags);
diff --git a/dir.c b/dir.c
index 5a83aa7..03ff36b 100644
--- a/dir.c
+++ b/dir.c
@@ -38,6 +38,7 @@ static size_t common_prefix_len(const char **pathspec)
 {
 	const char *n, *first;
 	size_t max = 0;
+	int literal = limit_pathspec_to_literal();
 
 	if (!pathspec)
 		return max;
@@ -47,7 +48,7 @@ static size_t common_prefix_len(const char **pathspec)
 		size_t i, len = 0;
 		for (i = 0; first == n || i < max; i++) {
 			char c = n[i];
-			if (!c || c != first[i] || is_glob_special(c))
+			if (!c || c != first[i] || (!literal && is_glob_special(c)))
 				break;
 			if (c == '/')
 				len = i + 1;
@@ -117,6 +118,7 @@ static int match_one(const char *match, const char *name, int namelen)
 static int match_one(const char *match, const char *name, int namelen)
 {
 	int matchlen;
+	int literal = limit_pathspec_to_literal();
 
 	/* If the match was just the prefix, we matched */
 	if (!*match)
@@ -126,7 +128,7 @@ static int match_one(const char *match, const char *name, int namelen)
 		for (;;) {
 			unsigned char c1 = tolower(*match);
 			unsigned char c2 = tolower(*name);
-			if (c1 == '\0' || is_glob_special(c1))
+			if (c1 == '\0' || (!literal && is_glob_special(c1)))
 				break;
 			if (c1 != c2)
 				return 0;
@@ -138,7 +140,7 @@ static int match_one(const char *match, const char *name, int namelen)
 		for (;;) {
 			unsigned char c1 = *match;
 			unsigned char c2 = *name;
-			if (c1 == '\0' || is_glob_special(c1))
+			if (c1 == '\0' || (!literal && is_glob_special(c1)))
 				break;
 			if (c1 != c2)
 				return 0;
@@ -148,14 +150,16 @@ static int match_one(const char *match, const char *name, int namelen)
 		}
 	}
 
-
 	/*
 	 * If we don't match the matchstring exactly,
 	 * we need to match by fnmatch
 	 */
 	matchlen = strlen(match);
-	if (strncmp_icase(match, name, matchlen))
+	if (strncmp_icase(match, name, matchlen)) {
+		if (literal)
+			return 0;
 		return !fnmatch_icase(match, name, 0) ? MATCHED_FNMATCH : 0;
+	}
 
 	if (namelen == matchlen)
 		return MATCHED_EXACTLY;
@@ -1429,7 +1433,8 @@ int init_pathspec(struct pathspec *pathspec, const char **paths)
 
 		item->match = path;
 		item->len = strlen(path);
-		item->use_wildcard = !no_wildcard(path);
+		item->use_wildcard = !limit_pathspec_to_literal() &&
+				     !no_wildcard(path);
 		if (item->use_wildcard)
 			pathspec->has_wildcard = 1;
 	}
@@ -1445,3 +1450,11 @@ void free_pathspec(struct pathspec *pathspec)
 	free(pathspec->items);
 	pathspec->items = NULL;
 }
+
+int limit_pathspec_to_literal(void)
+{
+	static int flag = -1;
+	if (flag < 0)
+		flag = git_env_bool(GIT_LITERAL_PATHSPECS_ENVIRONMENT, 0);
+	return flag;
+}
diff --git a/git.c b/git.c
index d33f9b3..ed66c66 100644
--- a/git.c
+++ b/git.c
@@ -135,6 +135,14 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
 			git_config_push_parameter((*argv)[1]);
 			(*argv)++;
 			(*argc)--;
+		} else if (!strcmp(cmd, "--literal-pathspecs")) {
+			setenv(GIT_LITERAL_PATHSPECS_ENVIRONMENT, "1", 1);
+			if (envchanged)
+				*envchanged = 1;
+		} else if (!strcmp(cmd, "--no-literal-pathspecs")) {
+			setenv(GIT_LITERAL_PATHSPECS_ENVIRONMENT, "0", 1);
+			if (envchanged)
+				*envchanged = 1;
 		} else {
 			fprintf(stderr, "Unknown option: %s\n", cmd);
 			usage(git_usage_string);
diff --git a/t/t6130-pathspec-noglob.sh b/t/t6130-pathspec-noglob.sh
new file mode 100755
index 0000000..bb5e710
--- /dev/null
+++ b/t/t6130-pathspec-noglob.sh
@@ -0,0 +1,62 @@
+#!/bin/sh
+
+test_description='test globbing (and noglob) of pathspec limiting'
+. ./test-lib.sh
+
+test_expect_success 'create commits with glob characters' '
+	test_commit unrelated bar &&
+	test_commit vanilla foo &&
+	test_commit star "f*" &&
+	test_commit bracket "f[o][o]"
+'
+
+test_expect_success 'vanilla pathspec matches literally' '
+	echo vanilla >expect &&
+	git log --format=%s -- foo >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'star pathspec globs' '
+	cat >expect <<-\EOF &&
+	bracket
+	star
+	vanilla
+	EOF
+	git log --format=%s -- "f*" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'bracket pathspec globs and matches literal brackets' '
+	cat >expect <<-\EOF &&
+	bracket
+	vanilla
+	EOF
+	git log --format=%s -- "f[o][o]" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob option matches literally (vanilla)' '
+	echo vanilla >expect &&
+	git --literal-pathspecs log --format=%s -- foo >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob option matches literally (star)' '
+	echo star >expect &&
+	git --literal-pathspecs log --format=%s -- "f*" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob option matches literally (bracket)' '
+	echo bracket >expect &&
+	git --literal-pathspecs log --format=%s -- "f[o][o]" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'no-glob environment variable works' '
+	echo star >expect &&
+	GIT_LITERAL_PATHSPECS=1 git log --format=%s -- "f*" >actual &&
+	test_cmp expect actual
+'
+
+test_done
-- 
1.8.1.rc2.24.gf3bad77

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-19 20:34 [PATCH] add core.pathspecGlob config option Jeff King
  2012-12-19 20:54 ` Junio C Hamano
  2012-12-19 21:30 ` [PATCH] add core.pathspecGlob config option Junio C Hamano
@ 2012-12-20  1:28 ` Nguyen Thai Ngoc Duy
  2012-12-20  3:13   ` Jeff King
  2 siblings, 1 reply; 18+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-12-20  1:28 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Thu, Dec 20, 2012 at 3:34 AM, Jeff King <peff@peff.net> wrote:
> Part of me thinks this is just gross, because ":(noglob)" is the right
> solution. But after spending a few hours trying it this morning, there
> is a ton of refactoring required to make it work correctly everywhere
> (although we could die() if we see it in places that are not using
> init_pathspec, so at least we could say "not supported yet" and not just
> silently ignore it).

Yep, I'm still half way to converting everything to the new pathspec
code. I'm not there yet. And things like this probably better goes
with a config key or command line option, appending :(noglob) to every
pathspec item is not pleasant. :(icase) may go the same way.

> So I think this is a nice, simple approach for sites that want it, and
> noglob magic can come later (and will not be any harder to implement as
> a result of this patch).

Any chance to make use of nd/pathspec-wildcard? It changes the same
code path in match_one. If you base on top of nd/pathspec-wildcard,
all you have to do is assign nowildcard_len to len (i.e. no wildcard
part).
-- 
Duy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-20  1:28 ` Nguyen Thai Ngoc Duy
@ 2012-12-20  3:13   ` Jeff King
  2012-12-20  3:51     ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff King @ 2012-12-20  3:13 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git

On Thu, Dec 20, 2012 at 08:28:57AM +0700, Nguyen Thai Ngoc Duy wrote:

> > So I think this is a nice, simple approach for sites that want it, and
> > noglob magic can come later (and will not be any harder to implement as
> > a result of this patch).
> 
> Any chance to make use of nd/pathspec-wildcard? It changes the same
> code path in match_one. If you base on top of nd/pathspec-wildcard,
> all you have to do is assign nowildcard_len to len (i.e. no wildcard
> part).

I'd rather keep it separate for now. One, just because they really are
independent topics, and two, because I am actually back-porting it for
GitHub (we are fairly conservative about upgrading our backend git
versions, as most of the interesting stuff happens on the client side; I
cherry-pick critical patches with no regard to the release cycle).

And the resolution is pretty trivial, too. It looks like this:

diff --cc dir.c
index 5c0e5f6,03ff36b..81cb439
--- a/dir.c
+++ b/dir.c
@@@ -1456,14 -1433,10 +1460,18 @@@ int init_pathspec(struct pathspec *path
  
  		item->match = path;
  		item->len = strlen(path);
- 		item->nowildcard_len = simple_length(path);
 -		item->use_wildcard = !limit_pathspec_to_literal() &&
 -				     !no_wildcard(path);
 -		if (item->use_wildcard)
 -			pathspec->has_wildcard = 1;
 +		item->flags = 0;
- 		if (item->nowildcard_len < item->len) {
- 			pathspec->has_wildcard = 1;
- 			if (path[item->nowildcard_len] == '*' &&
- 			    no_wildcard(path + item->nowildcard_len + 1))
- 				item->flags |= PATHSPEC_ONESTAR;
++		if (limit_pathspec_to_literal())
++			item->nowildcard_len = item->len;
++		else {
++			item->nowildcard_len = simple_length(path);
++			if (item->nowildcard_len < item->len) {
++				pathspec->has_wildcard = 1;
++				if (path[item->nowildcard_len] == '*' &&
++				    no_wildcard(path + item->nowildcard_len + 1))
++					item->flags |= PATHSPEC_ONESTAR;
++			}
 +		}
  	}
  
  	qsort(pathspec->items, pathspec->nr,

Not re-indenting the conditional would make the diff more readable, but
I think the resulting code is simpler to read if all of the wildcard
stuff is inside the "else" block.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-20  3:13   ` Jeff King
@ 2012-12-20  3:51     ` Junio C Hamano
  2012-12-20  3:55       ` Jeff King
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2012-12-20  3:51 UTC (permalink / raw)
  To: Jeff King; +Cc: Nguyen Thai Ngoc Duy, git

Jeff King <peff@peff.net> writes:

> On Thu, Dec 20, 2012 at 08:28:57AM +0700, Nguyen Thai Ngoc Duy wrote:
>
>> > So I think this is a nice, simple approach for sites that want it, and
>> > noglob magic can come later (and will not be any harder to implement as
>> > a result of this patch).
>> 
>> Any chance to make use of nd/pathspec-wildcard? It changes the same
>> code path in match_one. If you base on top of nd/pathspec-wildcard,
>> all you have to do is assign nowildcard_len to len (i.e. no wildcard
>> part).
>
> I'd rather keep it separate for now. One, just because they really are
> independent topics, and two, because I am actually back-porting it for
> GitHub (we are fairly conservative about upgrading our backend git
> versions, as most of the interesting stuff happens on the client side; I
> cherry-pick critical patches with no regard to the release cycle).
>
> And the resolution is pretty trivial, too. It looks like this:
>
> diff --cc dir.c
> index 5c0e5f6,03ff36b..81cb439
> --- a/dir.c
> +++ b/dir.c
> @@@ -1456,14 -1433,10 +1460,18 @@@ int init_pathspec(struct pathspec *path
>   
>   		item->match = path;
>   		item->len = strlen(path);
> - 		item->nowildcard_len = simple_length(path);
>  -		item->use_wildcard = !limit_pathspec_to_literal() &&
>  -				     !no_wildcard(path);
>  -		if (item->use_wildcard)
>  -			pathspec->has_wildcard = 1;
>  +		item->flags = 0;
> - 		if (item->nowildcard_len < item->len) {
> - 			pathspec->has_wildcard = 1;
> - 			if (path[item->nowildcard_len] == '*' &&
> - 			    no_wildcard(path + item->nowildcard_len + 1))
> - 				item->flags |= PATHSPEC_ONESTAR;
> ++		if (limit_pathspec_to_literal())
> ++			item->nowildcard_len = item->len;
> ++		else {
> ++			item->nowildcard_len = simple_length(path);
> ++			if (item->nowildcard_len < item->len) {
> ++				pathspec->has_wildcard = 1;
> ++				if (path[item->nowildcard_len] == '*' &&
> ++				    no_wildcard(path + item->nowildcard_len + 1))
> ++					item->flags |= PATHSPEC_ONESTAR;
> ++			}
>  +		}
>   	}
>   
>   	qsort(pathspec->items, pathspec->nr,

Hmph.  I thought that returning the length without any "stop at glob
special" trick from simple_length() would be a simpler resolution.

That is what is queued at the tip of 'pu', anyway.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-20  3:51     ` Junio C Hamano
@ 2012-12-20  3:55       ` Jeff King
  2012-12-20  4:06         ` Jeff King
  2012-12-20  5:26         ` Junio C Hamano
  0 siblings, 2 replies; 18+ messages in thread
From: Jeff King @ 2012-12-20  3:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nguyen Thai Ngoc Duy, git

On Wed, Dec 19, 2012 at 07:51:16PM -0800, Junio C Hamano wrote:

> > ++		if (limit_pathspec_to_literal())
> > ++			item->nowildcard_len = item->len;
> > ++		else {
> > ++			item->nowildcard_len = simple_length(path);
> > ++			if (item->nowildcard_len < item->len) {
> > ++				pathspec->has_wildcard = 1;
> > ++				if (path[item->nowildcard_len] == '*' &&
> > ++				    no_wildcard(path + item->nowildcard_len + 1))
> > ++					item->flags |= PATHSPEC_ONESTAR;
> > ++			}
> >  +		}
> 
> Hmph.  I thought that returning the length without any "stop at glob
> special" trick from simple_length() would be a simpler resolution.
> 
> That is what is queued at the tip of 'pu', anyway.

I don't think we can make a change in simple_length. It gets used not
only for pathspecs, but also for parsing exclude patterns, which I do
not think should be affected by this option.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-20  3:55       ` Jeff King
@ 2012-12-20  4:06         ` Jeff King
  2012-12-20  4:11           ` Jeff King
  2012-12-20  5:26         ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Jeff King @ 2012-12-20  4:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nguyen Thai Ngoc Duy, git

On Wed, Dec 19, 2012 at 10:55:43PM -0500, Jeff King wrote:

> On Wed, Dec 19, 2012 at 07:51:16PM -0800, Junio C Hamano wrote:
> 
> > > ++		if (limit_pathspec_to_literal())
> > > ++			item->nowildcard_len = item->len;
> > > ++		else {
> > > ++			item->nowildcard_len = simple_length(path);
> > > ++			if (item->nowildcard_len < item->len) {
> > > ++				pathspec->has_wildcard = 1;
> > > ++				if (path[item->nowildcard_len] == '*' &&
> > > ++				    no_wildcard(path + item->nowildcard_len + 1))
> > > ++					item->flags |= PATHSPEC_ONESTAR;
> > > ++			}
> > >  +		}
> > 
> > Hmph.  I thought that returning the length without any "stop at glob
> > special" trick from simple_length() would be a simpler resolution.
> > 
> > That is what is queued at the tip of 'pu', anyway.
> 
> I don't think we can make a change in simple_length. It gets used not
> only for pathspecs, but also for parsing exclude patterns, which I do
> not think should be affected by this option.

Our test suite wouldn't catch such a misfeature, of course, because the
feature is not turned on by default. But I found it instructive to run
all of the tests with GIT_LITERAL_PATHSPECS on. There are failures, of
course, but by inspecting each failure you can see that it is an
intended effect of the patch (i.e., each tries to use a wildcard
pathspec, which no longer works).

When you suggested changing common_prefix, I ran such a test both with
and without the change[1] and confirmed that it did not change the set
of failure sites. I did not try it, but I suspect running such a test
with the tip of pu would reveal new failures in the .gitignore tests.

-Peff

[1] This is in addition to reading and reasoning about the code, of
    course. I would not consider this a very robust form of testing,
    though a test failure which cannot be easily explained would be a
    good starting point for investigation.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-20  4:06         ` Jeff King
@ 2012-12-20  4:11           ` Jeff King
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff King @ 2012-12-20  4:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nguyen Thai Ngoc Duy, git

On Wed, Dec 19, 2012 at 11:06:02PM -0500, Jeff King wrote:

> > I don't think we can make a change in simple_length. It gets used not
> > only for pathspecs, but also for parsing exclude patterns, which I do
> > not think should be affected by this option.
> 
> Our test suite wouldn't catch such a misfeature, of course, because the
> feature is not turned on by default. But I found it instructive to run
> all of the tests with GIT_LITERAL_PATHSPECS on. There are failures, of
> course, but by inspecting each failure you can see that it is an
> intended effect of the patch (i.e., each tries to use a wildcard
> pathspec, which no longer works).
> 
> When you suggested changing common_prefix, I ran such a test both with
> and without the change[1] and confirmed that it did not change the set
> of failure sites. I did not try it, but I suspect running such a test
> with the tip of pu would reveal new failures in the .gitignore tests.

I just tried it, and indeed, running the test suite with this patch:

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 256f1c6..1c43593 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -102,6 +102,9 @@ export EDITOR
 export GIT_COMMITTER_EMAIL GIT_COMMITTER_NAME
 export EDITOR
 
+GIT_LITERAL_PATHSPECS=1
+export GIT_LITERAL_PATHSPECS
+
 # Add libc MALLOC and MALLOC_PERTURB test
 # only if we are not executing the test with valgrind
 if expr " $GIT_TEST_OPTS " : ".* --valgrind " >/dev/null ||


produces many more failures on "pu" than it does on jk/pathspec-literal.

-Peff

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] add core.pathspecGlob config option
  2012-12-20  3:55       ` Jeff King
  2012-12-20  4:06         ` Jeff King
@ 2012-12-20  5:26         ` Junio C Hamano
  1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2012-12-20  5:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Nguyen Thai Ngoc Duy, git

Jeff King <peff@peff.net> writes:

> On Wed, Dec 19, 2012 at 07:51:16PM -0800, Junio C Hamano wrote:
>
>> > ++		if (limit_pathspec_to_literal())
>> > ++			item->nowildcard_len = item->len;
>> > ++		else {
>> > ++			item->nowildcard_len = simple_length(path);
>> > ++			if (item->nowildcard_len < item->len) {
>> > ++				pathspec->has_wildcard = 1;
>> > ++				if (path[item->nowildcard_len] == '*' &&
>> > ++				    no_wildcard(path + item->nowildcard_len + 1))
>> > ++					item->flags |= PATHSPEC_ONESTAR;
>> > ++			}
>> >  +		}
>> 
>> Hmph.  I thought that returning the length without any "stop at glob
>> special" trick from simple_length() would be a simpler resolution.
>> 
>> That is what is queued at the tip of 'pu', anyway.
>
> I don't think we can make a change in simple_length. It gets used not
> only for pathspecs, but also for parsing exclude patterns, which I do
> not think should be affected by this option.

Ouch, yeah, you're right.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-12-20  5:26 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-19 20:34 [PATCH] add core.pathspecGlob config option Jeff King
2012-12-19 20:54 ` Junio C Hamano
2012-12-19 21:09   ` Jeff King
2012-12-19 21:50     ` [PATCH] add GIT_PATHSPEC_GLOB environment variable Jeff King
2012-12-19 22:00       ` Junio C Hamano
2012-12-19 22:12         ` Jeff King
2012-12-19 22:16           ` Junio C Hamano
2012-12-19 22:20             ` Jeff King
2012-12-19 22:37               ` Jeff King
2012-12-19 21:30 ` [PATCH] add core.pathspecGlob config option Junio C Hamano
2012-12-19 21:45   ` Jeff King
2012-12-20  1:28 ` Nguyen Thai Ngoc Duy
2012-12-20  3:13   ` Jeff King
2012-12-20  3:51     ` Junio C Hamano
2012-12-20  3:55       ` Jeff King
2012-12-20  4:06         ` Jeff King
2012-12-20  4:11           ` Jeff King
2012-12-20  5:26         ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).