git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* grep --no-index and pathspec
@ 2011-02-11  8:59 Lars Noschinski
  2011-02-11 15:04 ` Michael J Gruber
  0 siblings, 1 reply; 8+ messages in thread
From: Lars Noschinski @ 2011-02-11  8:59 UTC (permalink / raw)
  To: git

Hi everyone,

I encountered some strange behaviour with grep when using both the  
--no-index option and a pathspec. Glob patterns seem to be ignored:

----------
$ git grep -l --no-index . -- '*.bib'
paper.bib
paper.tex
ex1.tex
----------

But on the other hands, leading path matches work:
----------
$ git grep -l --no-index . -- 'paper'
paper.bib
paper.tex
----------

Without the --no-index option, everything works fine:
----------
$ git grep -l --no-index . -- '*.bib'
paper.bib
----------

This is with git version 1.7.4, but I encountered it also with the  
1.7.2.3 Debian package.

   -- Lars

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: grep --no-index and pathspec
  2011-02-11  8:59 grep --no-index and pathspec Lars Noschinski
@ 2011-02-11 15:04 ` Michael J Gruber
  2011-02-11 15:06   ` [PATCH] grep.txt: document pathspec for --no-index Michael J Gruber
  2011-02-11 18:27   ` grep --no-index and pathspec Junio C Hamano
  0 siblings, 2 replies; 8+ messages in thread
From: Michael J Gruber @ 2011-02-11 15:04 UTC (permalink / raw)
  To: Lars Noschinski; +Cc: git, Junio C Hamano

Lars Noschinski venit, vidit, dixit 11.02.2011 09:59:
> Hi everyone,
> 
> I encountered some strange behaviour with grep when using both the  
> --no-index option and a pathspec. Glob patterns seem to be ignored:
> 
> ----------
> $ git grep -l --no-index . -- '*.bib'
> paper.bib
> paper.tex
> ex1.tex
> ----------
> 
> But on the other hands, leading path matches work:
> ----------
> $ git grep -l --no-index . -- 'paper'
> paper.bib
> paper.tex
> ----------
> 
> Without the --no-index option, everything works fine:
> ----------
> $ git grep -l --no-index . -- '*.bib'
> paper.bib
> ----------
> 
> This is with git version 1.7.4, but I encountered it also with the  
> 1.7.2.3 Debian package.

"grep --no-index" and "grep" have different codepaths for looking up the
files/blobs. If I read that correctly then "grep --no-index -- pathspec"
only does a literal match at the left boundary, whereas for the normal
mode glob patterns are allowed.

CC'ing Junio who created "--no-index".

Michael

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] grep.txt: document pathspec for --no-index
  2011-02-11 15:04 ` Michael J Gruber
@ 2011-02-11 15:06   ` Michael J Gruber
  2011-02-11 18:27   ` grep --no-index and pathspec Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Michael J Gruber @ 2011-02-11 15:06 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Lars Noschinski

because it allows leading path match only, no globs.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
---
 Documentation/git-grep.txt |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index dab0a78..ef01a57 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -186,7 +186,8 @@ OPTIONS
 
 <pathspec>...::
 	If given, limit the search to paths matching at least one pattern.
-	Both leading paths match and glob(7) patterns are supported.
+	Both leading paths match and glob(7) patterns are supported
+	unless `--no-index` is used, which supports only the former.
 
 Examples
 --------
-- 
1.7.4.91.g3d0bb

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: grep --no-index and pathspec
  2011-02-11 15:04 ` Michael J Gruber
  2011-02-11 15:06   ` [PATCH] grep.txt: document pathspec for --no-index Michael J Gruber
@ 2011-02-11 18:27   ` Junio C Hamano
  2011-02-11 21:37     ` Junio C Hamano
  1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2011-02-11 18:27 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: Lars Noschinski, git, Junio C Hamano

Michael J Gruber <git@drmicha.warpmail.net> writes:

> "grep --no-index" and "grep" have different codepaths for looking up the
> files/blobs. If I read that correctly then "grep --no-index -- pathspec"
> only does a literal match at the left boundary, whereas for the normal
> mode glob patterns are allowed.
>
> CC'ing Junio who created "--no-index".

Anything with --no-index is a quick hack, so I wouldn't be surprised if it
ignored the normal pathspec logic.  As I do not recall the details of the
particular codepath and offhand do not know how involved a change to pay
proper attention to the pathspecs would be, but I suspect that it would be
more appropriate to fix it on top of nd/struct-pathspec topic than writing
the current behaviour down in the documentation outside of BUGS section as
if it were a feature ;-).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: grep --no-index and pathspec
  2011-02-11 18:27   ` grep --no-index and pathspec Junio C Hamano
@ 2011-02-11 21:37     ` Junio C Hamano
  2011-02-12  8:14       ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2011-02-11 21:37 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: Nguyễn Thái Ngọc Duy, Lars Noschinski, git

Junio C Hamano <gitster@pobox.com> writes:

> Michael J Gruber <git@drmicha.warpmail.net> writes:
>
>> "grep --no-index" and "grep" have different codepaths for looking up the
>> files/blobs. If I read that correctly then "grep --no-index -- pathspec"
>> only does a literal match at the left boundary, whereas for the normal
>> mode glob patterns are allowed.
>>
>> CC'ing Junio who created "--no-index".
>
> Anything with --no-index is a quick hack, so I wouldn't be surprised if it
> ignored the normal pathspec logic.  As I do not recall the details of the
> particular codepath and offhand do not know how involved a change to pay
> proper attention to the pathspecs would be, but I suspect that it would be
> more appropriate to fix it on top of nd/struct-pathspec topic than writing
> the current behaviour down in the documentation outside of BUGS section as
> if it were a feature ;-).

This is a band-aid modelled after what builtin/clean.c does to the
returned list from fill_directory(), and it seems to do its job, but I am
quite unhappy about it.

The function fill_directory() already takes a pathspec, albeit in the
degenerate "const char **" form.  Why does its output need further
filtering?

 builtin/grep.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index c3af876..5afee2f 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -626,6 +626,10 @@ static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec)
 
 	fill_directory(&dir, pathspec->raw);
 	for (i = 0; i < dir.nr; i++) {
+		const char *name = dir.entries[i]->name;
+		int namelen = strlen(name);
+		if (!match_pathspec_depth(pathspec, name, namelen, 0, NULL))
+			continue;
 		hit |= grep_file(opt, dir.entries[i]->name);
 		if (hit && opt->status_only)
 			break;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: grep --no-index and pathspec
  2011-02-11 21:37     ` Junio C Hamano
@ 2011-02-12  8:14       ` Nguyen Thai Ngoc Duy
  2011-02-12  8:26         ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-12  8:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, Lars Noschinski, git

2011/2/12 Junio C Hamano <gitster@pobox.com>:
> This is a band-aid modelled after what builtin/clean.c does to the
> returned list from fill_directory(), and it seems to do its job, but I am
> quite unhappy about it.
>
> The function fill_directory() already takes a pathspec, albeit in the
> degenerate "const char **" form.  Why does its output need further
> filtering?

Because it was designed so? Quotes from 9fc42d6 (Optimize directory
listing with pathspec limiter. - 2007-03-30), which added
simplify_away(), the function that does pathspec filtering for
fill_directory():

    NOTE! This does *not* obviate the need for the caller to do the *exact*
    pathspec match later. It's a first-level filter on "read_directory()", but
    it does not do the full pathspec thing. Maybe it should. But in the
    meantime, builtin-add.c really does need to do first

        read_directory(dir, .., pathspec);
        if (pathspec)
                prune_directory(dir, pathspec, baselen);

    ie the "prune_directory()" part will do the *exact* pathspec pruning,
    while the "read_directory()" will use the pathspec just to do some quick
    high-level pruning of the directories it will recurse into.

> @@ -626,6 +626,10 @@ static int grep_directory(struct grep_opt *opt, const struct pathspec *pathspec)
>
>        fill_directory(&dir, pathspec->raw);
>        for (i = 0; i < dir.nr; i++) {
> +               const char *name = dir.entries[i]->name;
> +               int namelen = strlen(name);
> +               if (!match_pathspec_depth(pathspec, name, namelen, 0, NULL))
> +                       continue;
>                hit |= grep_file(opt, dir.entries[i]->name);
>                if (hit && opt->status_only)
>                        break;

Looks good. We could move prune_directory() from builtin/add.c to
dir.c and use it here, but the gain is nothing (except noticing people
some pathspecs do not match any).
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: grep --no-index and pathspec
  2011-02-12  8:14       ` Nguyen Thai Ngoc Duy
@ 2011-02-12  8:26         ` Junio C Hamano
  2011-02-12  8:39           ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2011-02-12  8:26 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: Michael J Gruber, Lars Noschinski, git

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> 2011/2/12 Junio C Hamano <gitster@pobox.com>:
>>
>> The function fill_directory() already takes a pathspec, albeit in the
>> degenerate "const char **" form. Why does its output need further
>> filtering?
>
> Because it was designed so? Quotes from 9fc42d6 (Optimize directory
> listing with pathspec limiter. - 2007-03-30), which added
> simplify_away(), the function that does pathspec filtering for
> fill_directory():
>
>     NOTE! This does *not* obviate the need for the caller to do the *exact*
>     pathspec match later. It's a first-level filter on "read_directory()", but
>     it does not do the full pathspec thing. Maybe it should. But in the
>     meantime,...

I was around back then, so I know how the code came about ;-)

The pieces used in the pathspec limiting logic have been restructured well
enough that I suspect it may now be feasible for us to revisit the "Maybe
it should" part in the above quote.  Thanks to nd/struct-pathspec topic, I
think we are already half-way there.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: grep --no-index and pathspec
  2011-02-12  8:26         ` Junio C Hamano
@ 2011-02-12  8:39           ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 8+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-12  8:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael J Gruber, Lars Noschinski, git

On Sat, Feb 12, 2011 at 3:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>>     NOTE! This does *not* obviate the need for the caller to do the *exact*
>>     pathspec match later. It's a first-level filter on "read_directory()", but
>>     it does not do the full pathspec thing. Maybe it should. But in the
>>     meantime,...
>
> I was around back then, so I know how the code came about ;-)
>
> The pieces used in the pathspec limiting logic have been restructured well
> enough that I suspect it may now be feasible for us to revisit the "Maybe
> it should" part in the above quote.  Thanks to nd/struct-pathspec topic, I
> think we are already half-way there.

I was around too, just oblivious about things. I can look into that.
Need to think a bit how to save what pathspecs are "seen", so that
prune_directory() in builtin/add.c can be dropped.
-- 
Duy

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2011-02-12  8:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-11  8:59 grep --no-index and pathspec Lars Noschinski
2011-02-11 15:04 ` Michael J Gruber
2011-02-11 15:06   ` [PATCH] grep.txt: document pathspec for --no-index Michael J Gruber
2011-02-11 18:27   ` grep --no-index and pathspec Junio C Hamano
2011-02-11 21:37     ` Junio C Hamano
2011-02-12  8:14       ` Nguyen Thai Ngoc Duy
2011-02-12  8:26         ` Junio C Hamano
2011-02-12  8:39           ` Nguyen Thai Ngoc Duy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).