git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brandon Williams <bmwill@google.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Git Mailing List" <git@vger.kernel.org>
Subject: Re: BUG: wildmatches like foo/**/**/bar don't match properly due to internal optimizations
Date: Wed, 26 Apr 2017 10:38:16 -0700	[thread overview]
Message-ID: <20170426173816.GA80265@google.com> (raw)
In-Reply-To: <CACsJy8Ct4VU0XAgJruFECEVxb98MS4P+9Z6D8ag35ySL6OY-0g@mail.gmail.com>

On 04/26, Duy Nguyen wrote:
> On Wed, Apr 26, 2017 at 2:13 AM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
> > Thought I'd just start another thread for this rather than tack it
> > onto the pathalogical case thread.
> >
> > In commit 4c251e5cb5 ("wildmatch: make /**/ match zero or more
> > directories", 2012-10-15) Duy added support for ** in globs.
> >
> > One test-case for this is:
> >
> >     match 1 0 'foo/baz/bar' 'foo/**/**/bar'
> >
> > I.e. foo/**/**/bar matches foo/baz/bar. However due to some
> > pre-pruning we do in pathspec/ls-tree we can't ever match it, because
> > the first thing we do is peel the first part of the path/pattern off,
> > i.e. foo/, and then match baz/bar against **/**/bar.
> 
> Yeah. I think the prefix compare trick predated wildmatch. When I
> introduced positional wildcards "**/" I failed to spot this. Good
> catch.
> 
> Ideally this sort of optimization should be contained within wildmatch
> (or whatever matching engine we'll be using). It also opens up more
> opportunity (like precompile pattern mentioned elsewhere in this
> thread).
> 
> You need to be careful though, when we do case-insensitive matching,
> sometimes we want to match the prefix case _sensitively_ instead. So
> we need to pass the "prefix" info in some cases to the matching
> engine.
> 
> I guess time is now ripe (i.e. somebody volunteers to work on this ;-)
> to improve wildmatch. "improve" can also be "rewriting to pcre" if we
> really want that route, which I have no opinion because I don't know
> pcre availability on other (some obscure) platforms.

If we do end up improving wildmatch, we may also want the functionality
to (with a flag) have a pattern match a leading directory.  This would
be useful in the submodule case where a user gives a pathspec which
could go into a submodule but we don't want to launch a child process to
operate on the submodule unless the first part of the pattern matches
the submodule.  Right now with recursive grep, if wildcards are used
then the code just punts and says "yeah this pattern may match something
in the submodule but we won't know for sure till we actually try".

-- 
Brandon Williams

      reply	other threads:[~2017-04-26 17:38 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-25 19:13 BUG: wildmatches like foo/**/**/bar don't match properly due to internal optimizations Ævar Arnfjörð Bjarmason
2017-04-26  8:56 ` Duy Nguyen
2017-04-26 17:38   ` Brandon Williams [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170426173816.GA80265@google.com \
    --to=bmwill@google.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).