From: Linus Torvalds <torvalds@linux-foundation.org>
To: Peter Baumann <waste.manager@gmx.de>
Cc: git@vger.kernel.org
Subject: Re: How to use path limiting (using a glob)?
Date: Wed, 11 Feb 2009 11:40:44 -0800 (PST)
Message-ID: <alpine.LFD.2.00.0902111129190.3590@localhost.localdomain>
In-Reply-To: <20090211191432.GC27232@m62s10.vlinux.de>
On Wed, 11 Feb 2009, Peter Baumann wrote:
> after reading Junio's nice blog today where he explained how to use git grep
> efficiently, I saw him using a glob to match for the interesting files:
>
> $ git grep -e ';;' -- '*.c'
>
> Is it possible to have the same feature in git diff and the revision
> machinery?

Not really. Git has two different kinds of path limiters, and they are
really really different.

 - the "walk current index/directory recursively" kind that "git ls-files"
   uses, which takes a 'fnmatch()' type path regexp (not a real regexp,
   but the kind you're used to with shell)

   NOTE! On purpose, we don't set the FNM_PATHNAME, so "*.c" here is
   different from *.c in shell (it's more like "**.c" in tcsh). IOW, *
   matches '/' too, and will walk subdirectories.

 - the "revision limiter" pathspec. This is *not* a regexp, it's a pure
   prefix matcher, for a very simple reason: performance.

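
To make the difference between the two limiters concrete, here is a minimal Python sketch. It is not git's code: `glob_match` and `prefix_match` are hypothetical helpers, the path list is made up, and the prefix rule (match on whole path components) is a simplifying assumption. Python's `fnmatchcase` happens to behave like fnmatch() without FNM_PATHNAME, in that `*` also matches '/'.

```python
from fnmatch import fnmatchcase

def glob_match(path, pattern):
    # Like fnmatch() without FNM_PATHNAME: '*' is allowed to match '/'.
    return fnmatchcase(path, pattern)

def prefix_match(path, spec):
    # Assumed model of the revision limiter: the spec is taken literally
    # and matches either the exact path or a leading directory.
    spec = spec.rstrip("/")
    return path == spec or path.startswith(spec + "/")

# Hypothetical set of tracked paths.
paths = ["main.c", "lib/util.c", "lib/util.h", "Documentation/git.txt"]

glob_hits = [p for p in paths if glob_match(p, "*.c")]
literal_hits = [p for p in paths if prefix_match(p, "*.c")]
dir_hits = [p for p in paths if prefix_match(p, "lib")]

print(glob_hits)     # the glob reaches into subdirectories
print(literal_hits)  # '*.c' taken as a literal prefix matches nothing
print(dir_hits)      # a directory prefix matches everything below it
```

Under this model the quoted '*.c' finds nothing in the prefix matcher simply because no path literally starts with the characters "*.c".
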
> $ cd $path_to_your_git_src_dir
> $ git log master -p -- '*.h'
> .... No commit shown
>
> $ git diff --name-only v1.5.0 v1.6.0 -- '*.c'
>
> and both don't return anything.

Yeah, in the revision matcher you can still depend on the shell
expansion, and it will do _almost_ the right thing. So if you do

	git log master -p *.c

without the quotes, the shell expansion will work, and that in turn will
give a set of filenames that "git log" will restrict the log to. HOWEVER,
it's not a real wildcard - it's literally looking at what you have now in
your current working directory, and saying "give me the logs of those
pathnames", not "give me the logs of everything ending with .c".

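
The "almost" can be illustrated with a small Python model. Everything here is hypothetical: `shell_expand` is a rough stand-in for what the shell does before git ever sees the arguments, `log_limited` is a toy version of a path-limited log, and the history and file list are invented for the example.

```python
from fnmatch import fnmatchcase

def shell_expand(pattern, cwd_files):
    # Rough stand-in for shell globbing: only names that exist right now,
    # '*' does not cross '/', and dotfiles are skipped.
    hits = sorted(
        f for f in cwd_files
        if "/" not in f and not f.startswith(".") and fnmatchcase(f, pattern)
    )
    # A POSIX shell passes the pattern through unchanged if nothing matches.
    return hits or [pattern]

def log_limited(history, pathspecs):
    # Toy path-limited log: keep commits touching a path that equals a
    # pathspec or lives under it.
    def hit(path, spec):
        spec = spec.rstrip("/")
        return path == spec or path.startswith(spec + "/")
    return [cid for cid, touched in history
            if any(hit(t, s) for t in touched for s in pathspecs)]

# Invented history: (commit id, paths touched). old.c was deleted later.
history = [
    ("c1", ["old.c"]),
    ("c2", ["main.c"]),
    ("c3", ["lib/util.c"]),
    ("c4", ["README"]),
]
# What is in the working directory today.
cwd_files = ["main.c", "README", "lib/util.c"]

expanded = shell_expand("*.c", cwd_files)
shown = log_limited(history, expanded)
print(expanded, shown)
```

In this model the commit that touched the since-deleted old.c and the one under lib/ are both missed, because the shell only ever handed over the names it could see in the current directory.
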
We _could_ make the revision limiter understand fnmatch-style patterns,
but quite frankly, it's very very expensive - too expensive to be useful
for big repositories. The point about only matching prefixes is that it
allows the revision limiter to not even walk into subdirectories that
don't match, but if you do the "*.c" kind of pattern, now the revision
code has to look up every tree recursively. That code is also _extremely_
performance-critical, so we really don't want to use fnmatch() when we can
currently use just "memcmp()".
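
A rough way to see the cost argument is to count how many tree objects each approach has to open. The sketch below is only a model under stated assumptions: the nested dict stands in for git tree objects, and `walk_prefix` and `walk_glob` are hypothetical names, not functions from git.

```python
from fnmatch import fnmatchcase

# Invented repository layout: dicts are trees, None marks a blob.
tree = {
    "Makefile": None,
    "builtin": {"log.c": None, "grep.c": None},
    "Documentation": {"git.txt": None,
                      "howto": {"a.txt": None, "b.txt": None}},
    "t": {"t0000.sh": None, "lib": {"helper.c": None}},
}

def walk_prefix(node, parts):
    # parts: remaining components of the prefix; [] means "all below".
    opened, matches = 1, []
    for name, child in sorted(node.items()):
        if parts and name != parts[0]:
            continue  # plain string compare; this subtree is never opened
        rest = parts[1:]
        if child is None:
            if not rest:
                matches.append(name)
        else:
            sub_matches, sub_opened = walk_prefix(child, rest)
            matches += [name + "/" + m for m in sub_matches]
            opened += sub_opened
    return matches, opened

def walk_glob(node, pattern, base=""):
    # A '*.c' style pattern can match at any depth, so nothing is pruned.
    opened, matches = 1, []
    for name, child in sorted(node.items()):
        path = base + name
        if child is None:
            if fnmatchcase(path, pattern):
                matches.append(path)
        else:
            sub_matches, sub_opened = walk_glob(child, pattern, path + "/")
            matches += sub_matches
            opened += sub_opened
    return matches, opened

prefix_matches, prefix_opened = walk_prefix(tree, ["builtin"])
glob_matches, glob_opened = walk_glob(tree, "*.c")
print(prefix_opened, glob_opened)
```

Even in this tiny layout the prefix walk opens two trees while the glob walk opens all six, and that gap is paid again for every commit the revision walk has to look at.
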
So yes, it's kind of odd how we have two totally different concepts of
pathname patterns, but it's probably easiest to remember that "'git grep'
is just special".
Linus