* [RFC] Making ce_path_match() more useful by accepting globs
@ 2007-11-25 18:03 Junio C Hamano
2007-11-26 0:11 ` Alex Riesen
0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2007-11-25 18:03 UTC (permalink / raw)
To: Linus, "Torvalds <torvalds"; +Cc: git
Currently ce_path_match() only uses "the leading directory" match, and
does not understand file globs. These do not work:
git diff-files 't/*.sh'
git diff-index HEAD 'xdiff/*.c'
git update-index -g 'Documentation/howto/*.txt'
This teaches the ce_path_match(), the underlying function that are used
for checking if a given cache entry matches the given set of pathspecs,
to use the match_pathspec() from git-ls-files, which knows about glob
patterns.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* Having two different behaviours of pathspec matching has been
bothering me for quite some time. The changes here look trivially
correct and the result passes all the tests, but this is quite close
to the core part of the system, and would benefit greatly from extra
set of eyes.
This patch does not touch the tree walker, and does not affect
diff-tree nor ancestry pruning done in the revision traversal. That
however is even closer to the core and is performance critical. It
needs to be done carefully not to descend into trees that would never
match needlessly. IOW, not today.
dir.c | 5 +++--
read-cache.c | 17 ++---------------
2 files changed, 5 insertions(+), 17 deletions(-)
diff --git a/dir.c b/dir.c
index 225fdfb..be640c9 100644
--- a/dir.c
+++ b/dir.c
@@ -98,20 +98,21 @@ int match_pathspec(const char **pathspec, const char *name, int namelen, int pre
{
int retval;
const char *match;
+ int want_seen = !!seen;
name += prefix;
namelen -= prefix;
for (retval = 0; (match = *pathspec++) != NULL; seen++) {
int how;
- if (retval && *seen == MATCHED_EXACTLY)
+ if (retval && seen && want_seen && *seen == MATCHED_EXACTLY)
continue;
match += prefix;
how = match_one(match, name, namelen);
if (how) {
if (retval < how)
retval = how;
- if (*seen < how)
+ if (want_seen && *seen < how)
*seen = how;
}
}
diff --git a/read-cache.c b/read-cache.c
index 7db5588..767464e 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -472,7 +472,7 @@ int ce_same_name(struct cache_entry *a, struct cache_entry *b)
int ce_path_match(const struct cache_entry *ce, const char **pathspec)
{
- const char *match, *name;
+ const char *name;
int len;
if (!pathspec)
@@ -480,20 +480,7 @@ int ce_path_match(const struct cache_entry *ce, const char **pathspec)
len = ce_namelen(ce);
name = ce->name;
- while ((match = *pathspec++) != NULL) {
- int matchlen = strlen(match);
- if (matchlen > len)
- continue;
- if (memcmp(name, match, matchlen))
- continue;
- if (matchlen && name[matchlen-1] == '/')
- return 1;
- if (name[matchlen] == '/' || !name[matchlen])
- return 1;
- if (!matchlen)
- return 1;
- }
- return 0;
+ return !!match_pathspec(pathspec, name, len, 0, NULL);
}
/*
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC] Making ce_path_match() more useful by accepting globs
2007-11-25 18:03 [RFC] Making ce_path_match() more useful by accepting globs Junio C Hamano
@ 2007-11-26 0:11 ` Alex Riesen
2007-11-26 0:30 ` Junio C Hamano
0 siblings, 1 reply; 5+ messages in thread
From: Alex Riesen @ 2007-11-26 0:11 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, git
Junio C Hamano, Sun, Nov 25, 2007 19:03:12 +0100:
> Currently ce_path_match() only uses "the leading directory" match, and
> does not understand file globs. These do not work:
>
> git diff-files 't/*.sh'
> git diff-index HEAD 'xdiff/*.c'
> git update-index -g 'Documentation/howto/*.txt'
How should my scripts handle files with "*" in names?
Maybe before the proposed change we should make all plumbing accept
"--stdin" (which does not expand globs)?
P.S. you have something broken in your mailer:
To: Linus@sceptre.sasl.smtp.pobox.com,
"Torvalds <torvalds"@linux-fundation.org
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] Making ce_path_match() more useful by accepting globs
2007-11-26 0:11 ` Alex Riesen
@ 2007-11-26 0:30 ` Junio C Hamano
2007-11-26 0:48 ` Junio C Hamano
2007-11-26 19:38 ` Alex Riesen
0 siblings, 2 replies; 5+ messages in thread
From: Junio C Hamano @ 2007-11-26 0:30 UTC (permalink / raw)
To: Alex Riesen; +Cc: Linus Torvalds, git
Alex Riesen <raa.lkml@gmail.com> writes:
> Junio C Hamano, Sun, Nov 25, 2007 19:03:12 +0100:
>> Currently ce_path_match() only uses "the leading directory" match, and
>> does not understand file globs. These do not work:
>>
>> git diff-files 't/*.sh'
>> git diff-index HEAD 'xdiff/*.c'
>> git update-index -g 'Documentation/howto/*.txt'
>
> How should my scripts handle files with "*" in names?
We DO NOT CARE.
Why?
How would you handle such files from the command line session without
git? "ls such-*-a-file" will also show such-silly-a-file as well.
IOW, the user is shooting in the foot --- and at that point I am not all
that interested in helping him.
Having said that, I would think that quoting the meta from fnmatch(3)
like this:
git ls-files 'such-\*-a-file'
would work fine, just like
ls such-\*-a-file
would.
If "ls such-*-a-file" reports only one file,
git ls-files 'such-*-a-file'
would also report that file as well.
So in practice I do not see a problem.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] Making ce_path_match() more useful by accepting globs
2007-11-26 0:30 ` Junio C Hamano
@ 2007-11-26 0:48 ` Junio C Hamano
2007-11-26 19:38 ` Alex Riesen
1 sibling, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2007-11-26 0:48 UTC (permalink / raw)
To: Alex Riesen; +Cc: Linus Torvalds, git
Junio C Hamano <gitster@pobox.com> writes:
> Alex Riesen <raa.lkml@gmail.com> writes:
>
>> Junio C Hamano, Sun, Nov 25, 2007 19:03:12 +0100:
>>> Currently ce_path_match() only uses "the leading directory" match, and
>>> does not understand file globs. These do not work:
>>>
>>> git diff-files 't/*.sh'
>>> git diff-index HEAD 'xdiff/*.c'
>>> git update-index -g 'Documentation/howto/*.txt'
>>
>> How should my scripts handle files with "*" in names?
> ...
> Having said that, I would think that quoting the meta from fnmatch(3)
> like this:
>
> git ls-files 'such-\*-a-file'
>
> would work fine, just like
>
> ls such-\*-a-file
>
> would.
Yup. I just did:
$ >M\*kefile
$ git add 'M\*kefile'
$ git ls-files 'M\*kefile'
M*kefile
$ git ls-files 'M*kefile'
M*kefile
Makefile
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC] Making ce_path_match() more useful by accepting globs
2007-11-26 0:30 ` Junio C Hamano
2007-11-26 0:48 ` Junio C Hamano
@ 2007-11-26 19:38 ` Alex Riesen
1 sibling, 0 replies; 5+ messages in thread
From: Alex Riesen @ 2007-11-26 19:38 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, git
Junio C Hamano, Mon, Nov 26, 2007 01:30:20 +0100:
> Alex Riesen <raa.lkml@gmail.com> writes:
>
> > Junio C Hamano, Sun, Nov 25, 2007 19:03:12 +0100:
> >> Currently ce_path_match() only uses "the leading directory" match, and
> >> does not understand file globs. These do not work:
> >>
> >> git diff-files 't/*.sh'
> >> git diff-index HEAD 'xdiff/*.c'
> >> git update-index -g 'Documentation/howto/*.txt'
> >
> > How should my scripts handle files with "*" in names?
>
> We DO NOT CARE.
>
> Why?
>
> How would you handle such files from the command line session without
> git? "ls such-*-a-file" will also show such-silly-a-file as well.
It will break existing setups. Something like
#!/bin/bash
git diff-files --quiet -- "$@" || do_something
will behave differently
> IOW, the user is shooting in the foot --- and at that point I am not all
> that interested in helping him.
What with? What's wrong with a name like "M*A*S*H" in your personal IMDB?
> Having said that, I would think that quoting the meta from fnmatch(3)
> like this:
>
> git ls-files 'such-\*-a-file'
>
> would work fine, just like
>
> ls such-\*-a-file
>
> would.
An existing system will still be broken.
> If "ls such-*-a-file" reports only one file,
>
> git ls-files 'such-*-a-file'
>
> would also report that file as well.
>
> So in practice I do not see a problem.
I think I do. There is NO way to pass the filenames to git plumbing
(only the diff-, or is git update-index --stdin -z also subject to
glob expansion?) without being sure they are exactly they were on
filesystem. Quoting is just PITA for scripting.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2007-11-26 19:38 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-25 18:03 [RFC] Making ce_path_match() more useful by accepting globs Junio C Hamano
2007-11-26 0:11 ` Alex Riesen
2007-11-26 0:30 ` Junio C Hamano
2007-11-26 0:48 ` Junio C Hamano
2007-11-26 19:38 ` Alex Riesen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).