Git development
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Peter Baumann <waste.manager@gmx.de>
Cc: git@vger.kernel.org
Subject: Re: How to use path limiting (using a glob)?
Date: Wed, 11 Feb 2009 11:40:44 -0800 (PST)
Message-ID: <alpine.LFD.2.00.0902111129190.3590@localhost.localdomain>
In-Reply-To: <20090211191432.GC27232@m62s10.vlinux.de>



On Wed, 11 Feb 2009, Peter Baumann wrote:

> after reading Junio's nice blog today where he explained how to use git grep
> efficiently, I saw him using a glob to match for the interesting files:
> 
> 	 $ git grep -e ';;' -- '*.c'
> 
> Is it possible to have the same feature in git diff and the revision
> machinery?

Not really. Git has two different kinds of path limiters, and they are 
really, really different.

 - the "walk current index/directory recursively" kind that "git ls-files" 
   uses, which takes an fnmatch()-style path pattern (a shell-style glob, 
   not a real regexp)

   NOTE! On purpose, we don't set FNM_PATHNAME, so "*.c" here is 
   different from *.c in the shell (it's more like "**.c" in tcsh). IOW, * 
   matches '/' too, so the pattern also matches files in subdirectories.

 - the "revision limiter" pathspec. This is *not* a regexp, it's a pure 
   prefix matcher, for a very simple reason: performance.
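
To make the difference between the two kinds concrete, here is a small Python sketch. It is not git's code: Python's fnmatch module stands in for C fnmatch() called without FNM_PATHNAME (it also lets '*' match '/'), prefix_match() is a hypothetical helper that approximates the prefix-only rule, and the path list is made up.

```python
from fnmatch import fnmatchcase


def glob_match(path: str, pattern: str) -> bool:
    # Python's fnmatch does not treat '/' specially, so '*' can span
    # directory separators, like fnmatch() without FNM_PATHNAME.
    return fnmatchcase(path, pattern)


def prefix_match(path: str, prefix: str) -> bool:
    # Assumed approximation of a prefix-only limiter: the path is the
    # prefix itself, or lives somewhere below it. No wildcards at all.
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


# Hypothetical paths, for illustration only.
paths = ["main.c", "builtin/log.c", "builtin/log.h", "Documentation/git.txt"]

print([p for p in paths if glob_match(p, "*.c")])       # glob kind
print([p for p in paths if prefix_match(p, "builtin")]) # prefix kind
print([p for p in paths if prefix_match(p, "*.c")])     # '*.c' taken literally
```

The last line is the interesting one: a prefix matcher treats '*.c' as a literal name, and nothing in the list is called that, so the result is empty.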

> 	$ cd $path_to_your_git_src_dir
> 	$ git log master -p -- '*.h'
> 	.... No commit shown 
> 
> 	$ git diff --name-only v1.5.0  v1.6.0 -- '*.c'
> 
> and both don't return anything.

Yeah, in the revision matcher you can still depend on the shell 
expansion, and it will do _almost_ the right thing. So if you do

	git log master -p *.c

without the quotes, the shell expansion will work, and that in turn will 
give a set of filenames that "git log" will restrict the log to. HOWEVER, 
it's not a real wildcard - it's literally looking at what you have now in 
your current working directory, and saying "give me the logs of those 
pathnames", not "give me the logs of everything ending with .c".
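
A rough Python sketch of that caveat, under loud assumptions: working_dir and history are made-up data, shell_expand() only imitates what the shell does with an unquoted pattern, and log_limited_to() is a hypothetical stand-in for path-limited "git log" that matches literal names only.

```python
from fnmatch import filter as fnmatch_filter

# Hypothetical data: the files present in the current directory right now...
working_dir = ["main.c", "util.c", "README"]
# ...and the paths each made-up commit touched, newest first.
history = [
    ("c3", ["util.c"]),
    ("c2", ["old.c"]),            # old.c was deleted later on
    ("c1", ["main.c", "README"]),
]


def shell_expand(pattern, names):
    # The shell replaces the unquoted pattern with the names that exist now.
    return fnmatch_filter(names, pattern)


def log_limited_to(paths, commits):
    # Keep the commits that touched at least one of the literal paths.
    wanted = set(paths)
    return [cid for cid, touched in commits if wanted & set(touched)]


args = shell_expand("*.c", working_dir)
print(args)                           # what "git log" actually receives
print(log_limited_to(args, history))  # c2 is missing: old.c no longer exists
```

The commit that only touched old.c drops out, because the expansion happened against today's directory contents before the log was ever consulted.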

We _could_ make the revision limiter understand fnmatch-style patterns, 
but quite frankly, it's very very expensive - too expensive to be useful 
for big repositories. The point about only matching prefixes is that it 
allows the revision limiter to not even walk into subdirectories that 
don't match, but if you do the "*.c" kind of pattern, now the revision 
code has to look up every tree recursively. That code is also _extremely_ 
performance-critical, so we really don't want to use fnmatch() when we can 
currently use just "memcmp()".
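
The pruning argument can be sketched in Python with a toy tree. Everything here is an assumption for illustration: the tree contents, the walk() helper, and the two limiter factories are hypothetical, and plain string comparison stands in for the memcmp() that the real code uses.

```python
from fnmatch import fnmatchcase

# Hypothetical tree: dicts are directories, None marks a file.
tree = {
    "Makefile": None,
    "builtin": {"log.c": None, "diff.c": None},
    "t": {"lib": {"helper.c": None}, "t0000.sh": None},
    "Documentation": {"git.txt": None, "howto": {"a.txt": None}},
}


def walk(node, here, may_descend, matches, opened, found):
    # Record every directory we had to open, and every matching file.
    opened.append(here or ".")
    for name, child in node.items():
        path = f"{here}/{name}" if here else name
        if child is None:
            if matches(path):
                found.append(path)
        elif may_descend(path):
            walk(child, path, may_descend, matches, opened, found)


def prefix_limiter(prefix):
    def may_descend(d):
        # Only directories on the way to, or below, the prefix are opened.
        return d == prefix or prefix.startswith(d + "/") or d.startswith(prefix + "/")

    def matches(p):
        return p == prefix or p.startswith(prefix + "/")

    return may_descend, matches


def glob_limiter(pattern):
    # A '*.c'-style pattern could match at any depth, so nothing can be pruned.
    return (lambda d: True), (lambda p: fnmatchcase(p, pattern))


def run(limiter):
    may_descend, matches = limiter
    opened, found = [], []
    walk(tree, "", may_descend, matches, opened, found)
    return opened, found


print(run(prefix_limiter("builtin"))[0])  # opens 2 directories
print(run(glob_limiter("*.c"))[0])        # opens all 6 directories
```

On this toy tree the prefix limiter opens two directories and the glob limiter opens all six. On a real repository that gap is paid again for every commit the revision walk examines.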

So yes, it's kind of odd how we have two totally different concepts of 
pathname patterns, but it's probably easiest to remember that "'git grep' 
is just special". 

		Linus
