* [RFC] The design of new pathspec features
@ 2013-01-29 4:35 Duy Nguyen
2013-01-29 5:05 ` Junio C Hamano
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Duy Nguyen @ 2013-01-29 4:35 UTC (permalink / raw)
To: git
For those who haven't followed closely, some coming changes allow us
to extend current pathspec syntax. We should soon be able to do
case-insenstive matching for example, or introduce "**" wildcard that
is currently used by gitignore. I just want to discuss about the new
syntax and behavior.
Many of these are already implemented in [1]. But I don't want you to
bother with buggy code yet. I'll resend it soon after 1.8.2.
--literal-pathspecs
===================
This feature is added by Jeff to disable globbing for all pathspecs. I
want to push it a bit further: disable all pathspec magic. This means
even ":/" is treated literally when --literal-pathspecs is set.
Because the intent behind this, as I understand, is for scripting, it
makes sense to keep it as literal as possible.
:(literal) magic
================
This magic is for people who want simple no-globbing pathspec (*). It
can be used in combination with other magic such as case-insensitive
matching. Incompatible with :(glob) magic below.
Global option --noglob-pathspecs is added to add :(literal) to
all. This is very similar to --literal-pathspecs. It just does not
disable pathspec magic. :(glob) magic overrides this global option.
(*) you can always disable wildcards by quoting them using backslash,
but that's inconvenient
:(glob) magic
=============
This magic is for people who want globbing. However, it does _not_ use
the same matching mechanism the non-magic pathspec does today. It uses
wildmatch(WM_PATHNAME), which basically means '*' does not match
slashes and "**" does.
Global option --glob-pathspecs is added to add :(glob) to all
pathspec. :(literal) magic overrides this global option.
fnmatch without FNM_PATHNAME is deprecated
==========================================
With the two magic above, people can switch between literal and new
globbing. There is no way to regain current matching behavior once
--[no]glob-pathspecs is used. And I think that's a good thing. New
globbing is more powerful than the current one. At some point, I'd
like to switch the matching behavior when neither literal nor magic
pathspec is specified. Either:
- make it literal by default
- make it new globbing by default
Which is more often used should be come the default. The question is
which.
Pathspec mnemonic
=================
Are :(literal) and :(glob) used often enough to deserve a short
mnemonic (like :/ is equivalent to :(top))? Which symbols should be
used? We can only use non-alphanumeric here, and '(', ')', ':' and '/'
are taken. It should be friendly to UNIX shell, no quoting is
preferred.
Another magic will come soon: case-insensitive matching. We may want
to reserve a mnemonic symbol for it as well.
We may also want to reserve option shortcuts for --noglob-pathspecs
and --glob-pathspecs. I suspect they'll be used more often.
New way to specify long pathspec magic
======================================
While testing the pathspec magic code, I grow tired of quoting :(glob)
every time because '(' is the start of a new shell. Which is one of
the reasons I introduce --[no]glob-pathspecs. Still I'd like a way to
specify long pathspec magic without quoting.
How about making ":q/xxx/" an equivalence of ":(xxx)"? Any character
(except ',') following 'q' is used as separator (similar to s/// from
sed). This violates the guidelines set in glossary-content.txt (only
use non-alphanumeric).
Another step futher is remove support for ":(xxx)" in favor of
":q(xxx)". We can do it today because I don't think anybody is using
":(top)" (the only supported magic) yet.
[1] https://github.com/pclouds/git/tree/parse-pathspec
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 4:35 [RFC] The design of new pathspec features Duy Nguyen
@ 2013-01-29 5:05 ` Junio C Hamano
2013-01-29 5:27 ` Duy Nguyen
2013-01-29 5:07 ` Junio C Hamano
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2013-01-29 5:05 UTC (permalink / raw)
To: Duy Nguyen; +Cc: git
Duy Nguyen <pclouds@gmail.com> writes:
> Pathspec mnemonic
> =================
>
> Are :(literal) and :(glob) used often enough to deserve a short
> mnemonic (like :/ is equivalent to :(top))? Which symbols should be
> used?
I do not think we should discuss this before letting people gain
experience with various forms of magic; otherwise we would at best
end up with a concensus guess among uninformed.
> New way to specify long pathspec magic
> ======================================
>
> While testing the pathspec magic code, I grow tired of quoting :(glob)
> every time because '(' is the start of a new shell. Which is one of
> the reasons I introduce --[no]glob-pathspecs. Still I'd like a way to
> specify long pathspec magic without quoting.
Is this a real issue, though? Often interesting pathspecs do have
shell globs in them and we have to quote them anyway.
> How about making ":q/xxx/" an equivalence of ":(xxx)"?
A moderately strong no from here.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 5:05 ` Junio C Hamano
@ 2013-01-29 5:27 ` Duy Nguyen
0 siblings, 0 replies; 10+ messages in thread
From: Duy Nguyen @ 2013-01-29 5:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, Jan 29, 2013 at 12:05 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> New way to specify long pathspec magic
>> ======================================
>>
>> While testing the pathspec magic code, I grow tired of quoting :(glob)
>> every time because '(' is the start of a new shell. Which is one of
>> the reasons I introduce --[no]glob-pathspecs. Still I'd like a way to
>> specify long pathspec magic without quoting.
>
> Is this a real issue, though? Often interesting pathspecs do have
> shell globs in them and we have to quote them anyway.
:(icase) often won't (unless you combine with :(glob)). If we turn
grep's --max-depth feature into pathspec magic (feasible, just not
sure if it's actually useful), it won't need quoting either.
Even with :(glob), because of the complexity of the pathspec, it's
less likely to match anything and be expanded by bash, so no quoting
required.
>> How about making ":q/xxx/" an equivalence of ":(xxx)"?
>
> A moderately strong no from here.
--
Duy
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 4:35 [RFC] The design of new pathspec features Duy Nguyen
2013-01-29 5:05 ` Junio C Hamano
@ 2013-01-29 5:07 ` Junio C Hamano
2013-01-29 5:33 ` Junio C Hamano
2013-01-29 12:17 ` Duy Nguyen
3 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2013-01-29 5:07 UTC (permalink / raw)
To: Duy Nguyen; +Cc: git
Duy Nguyen <pclouds@gmail.com> writes:
> For those who haven't followed closely, some coming changes allow us
> to extend current pathspec syntax. We should soon be able to do
> case-insenstive matching for example, or introduce "**" wildcard that
> is currently used by gitignore. I just want to discuss about the new
> syntax and behavior.
>
> Many of these are already implemented in [1]. But I don't want you to
> bother with buggy code yet. I'll resend it soon after 1.8.2.
I think I agree with everything up to the "fnmatch(~FNM_PATHNAME)"
part, from a quick read.
Thanks for writing this up.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 4:35 [RFC] The design of new pathspec features Duy Nguyen
2013-01-29 5:05 ` Junio C Hamano
2013-01-29 5:07 ` Junio C Hamano
@ 2013-01-29 5:33 ` Junio C Hamano
2013-01-29 6:13 ` Duy Nguyen
2013-01-29 12:17 ` Duy Nguyen
3 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2013-01-29 5:33 UTC (permalink / raw)
To: Duy Nguyen; +Cc: git
Duy Nguyen <pclouds@gmail.com> writes:
> :(literal) magic
> ================
>
> This magic is for people who want simple no-globbing pathspec (*). It
> can be used in combination with other magic such as case-insensitive
> matching. Incompatible with :(glob) magic below.
>
> Global option --noglob-pathspecs is added to add :(literal) to
> all. This is very similar to --literal-pathspecs. It just does not
> disable pathspec magic. :(glob) magic overrides this global option.
>
> (*) you can always disable wildcards by quoting them using backslash,
> but that's inconvenient
Have you considered if it may be helpful to have a :(literal) magic
(or any magic in general) that applies only to the first N
characters of the pathspec pattern?
When you are in subdirectory and do a pathspec limited operation,
e.g.
cd Documentation && git ls-files "*.txt"
we internaly do an equivalent of this:
(1) first find out the "prefix", e.g. "Documentation/" in this
case;
(2) prepend the prefix to user-supplied pathspecs, e.g. yielding
"Documentation/*.txt" in this case; and
(3) use the resulting pathspecs to match against full pathnames
relative to the root of the working tree.
If the prefix had globbing character in it (e.g. we started in a
directory "D*cumentati*n" instead), we still should make sure that
that part matches literally, while allowing the globbing in
user-supplied part of the pathspec (e.g. "*.txt"). In the built in
code, you can work with the struct pathspec directly and mark the
entire prefix part with nowildcard_len field to match literally, but
if the above three-step logic needs to be implemented by a Porcelain
script like old days, they would need to quote glob specials in the
prefix part before appending user-supplied part to form the full
pathspec string.
I personally think we do not need to support something like this:
prefix=$(git rev-parse --show-prefix)
n=${#prefix}
pathspec=":(literal-$n)$prefix$1"
but other aspiring Porcelain script writers may disagree and would
want to have it. We can always solve it by giving them an easy and
uniform way to get the glob-quoted version of prefix to solve this
particular issue, i.e.
prefixq=$(git rev-parse --show-prefix-glob-quoted)
pathspec="$prefixq$1"
but magic that applies only to a substring may have other uses.
If you do not immediately think of any, let's not overengineer this.
Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 5:33 ` Junio C Hamano
@ 2013-01-29 6:13 ` Duy Nguyen
2013-01-29 6:31 ` Junio C Hamano
0 siblings, 1 reply; 10+ messages in thread
From: Duy Nguyen @ 2013-01-29 6:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Tue, Jan 29, 2013 at 12:33 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Have you considered if it may be helpful to have a :(literal) magic
> (or any magic in general) that applies only to the first N
> characters of the pathspec pattern?
Not user-driven. But the prefix part is :(literal)-applied. :(glob) is
currently implemented this way, using nowildcard_len as you mentioned.
:(icase) is more complicated and does not follow yet.
> I personally think we do not need to support something like this:
>
> prefix=$(git rev-parse --show-prefix)
> n=${#prefix}
> pathspec=":(literal-$n)$prefix$1"
>
> but other aspiring Porcelain script writers may disagree and would
> want to have it. We can always solve it by giving them an easy and
> uniform way to get the glob-quoted version of prefix to solve this
> particular issue, i.e.
>
> prefixq=$(git rev-parse --show-prefix-glob-quoted)
> pathspec="$prefixq$1"
>
> but magic that applies only to a substring may have other uses.
Yeah, that simplifies things. Supporting applying magic over just
parts of the pathspec pattern sounds complex. Just a small
modification. That rev-parse needs to look at "$1" as well. If
:(literal) is already specified, glob quoting will backfire. The user
script can deal with that, but it's harder (e.g. parsing magic from
scripts and deal with magic combination) than letting rev-parse does
it.
I've done some form of this already, for supporting add--interactive.
git-add prefixes the pathspec but keeps all the magic in place, before
passing pathspec to add--interactive. But I missed the quoting point
you mentioned above. I probably need de-quoting the prefix as well.
Many optimizations stop short at the sign of any glob symbols,
including backslash. This could be a new task for wildmatch.
--
Duy
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 6:13 ` Duy Nguyen
@ 2013-01-29 6:31 ` Junio C Hamano
0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2013-01-29 6:31 UTC (permalink / raw)
To: Duy Nguyen; +Cc: git
Duy Nguyen <pclouds@gmail.com> writes:
>> prefixq=$(git rev-parse --show-prefix-glob-quoted)
>> pathspec="$prefixq$1"
>>
>> but magic that applies only to a substring may have other uses.
>
> Yeah, that simplifies things. Supporting applying magic over just
> parts of the pathspec pattern sounds complex. Just a small
> modification. That rev-parse needs to look at "$1" as well.
Makes sense. More like
prefix=$(git rev-parse --show-prefix)
test -z "$prefix" || cd $(git rev-parse --show-cdup)
...
pathspec=$(git rev-parse --prefix-pathspec "$prefix" "$1")
git ls-files "$pathspec"
which may take prefix="D*cumentati*n" (from the previous example),
and the user may have given ":(icase)hell*.txt" to "$1". Ideally
the scriptors shouldn't have to worry about how to add the prefix to
the user supplied pathspec.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 4:35 [RFC] The design of new pathspec features Duy Nguyen
` (2 preceding siblings ...)
2013-01-29 5:33 ` Junio C Hamano
@ 2013-01-29 12:17 ` Duy Nguyen
2013-01-29 16:20 ` Junio C Hamano
3 siblings, 1 reply; 10+ messages in thread
From: Duy Nguyen @ 2013-01-29 12:17 UTC (permalink / raw)
To: git
On Tue, Jan 29, 2013 at 11:35 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> :(glob) magic
> =============
>
> This magic is for people who want globbing. However, it does _not_ use
> the same matching mechanism the non-magic pathspec does today. It uses
> wildmatch(WM_PATHNAME), which basically means '*' does not match
> slashes and "**" does.
>
> Global option --glob-pathspecs is added to add :(glob) to all
> pathspec. :(literal) magic overrides this global option.
I forgot one thing. The current pathspec behavior, the pattern "[a-z]"
would match a file named "[a-z]" (iow, wildcards are also considered
literal characters). Do we want to keep this behavior in :(glob)? To
me it's a surprise factor and therefore not good. But common sense may
be different.
--
Duy
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 12:17 ` Duy Nguyen
@ 2013-01-29 16:20 ` Junio C Hamano
2013-01-29 16:27 ` Junio C Hamano
0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2013-01-29 16:20 UTC (permalink / raw)
To: Duy Nguyen; +Cc: git
Duy Nguyen <pclouds@gmail.com> writes:
> On Tue, Jan 29, 2013 at 11:35 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>> :(glob) magic
>> =============
>>
>> This magic is for people who want globbing. However, it does _not_ use
>> the same matching mechanism the non-magic pathspec does today. It uses
>> wildmatch(WM_PATHNAME), which basically means '*' does not match
>> slashes and "**" does.
>>
>> Global option --glob-pathspecs is added to add :(glob) to all
>> pathspec. :(literal) magic overrides this global option.
>
> I forgot one thing. The current pathspec behavior, the pattern "[a-z]"
> would match a file named "[a-z]" (iow, wildcards are also considered
> literal characters).
That sounds like a blatant bug to me (unless you are talking about
"literal" case). We should fix it before we include it in any
released version, I think.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] The design of new pathspec features
2013-01-29 16:20 ` Junio C Hamano
@ 2013-01-29 16:27 ` Junio C Hamano
0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2013-01-29 16:27 UTC (permalink / raw)
To: Duy Nguyen; +Cc: git
Junio C Hamano <gitster@pobox.com> writes:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> On Tue, Jan 29, 2013 at 11:35 AM, Duy Nguyen <pclouds@gmail.com> wrote:
>>> :(glob) magic
>>> =============
>>>
>>> This magic is for people who want globbing. However, it does _not_ use
>>> the same matching mechanism the non-magic pathspec does today. It uses
>>> wildmatch(WM_PATHNAME), which basically means '*' does not match
>>> slashes and "**" does.
>>>
>>> Global option --glob-pathspecs is added to add :(glob) to all
>>> pathspec. :(literal) magic overrides this global option.
>>
>> I forgot one thing. The current pathspec behavior, the pattern "[a-z]"
>> would match a file named "[a-z]" (iow, wildcards are also considered
>> literal characters).
>
> That sounds like a blatant bug to me (unless you are talking about
> "literal" case). We should fix it before we include it in any
> released version, I think.
Ah, no, I misread what you meant.
Yes, historically when matching a path with a pattern "a/b[c-d]/e",
we tried to first literally match the path with it (this allows the
path "a/b[c-d]/e" to match with the pattern), and if it did not
match, used fnmatch(3) to allow "a/bc/e" to also match. This was an
ugly hack to cope with the possiblity that such a funny path is
actually used in projects.
With :(literal) magic and its friends you are working into the
system, the hack should disappear at the same time when you stop
running fnmatch(3) without FNM_PATHNAME. That should happen no
later than Git 2.0.
But until then, we should keep matching both paths "a/bc/e" and
"a/b[c-d]/e" with pattern "a/b[c-d]/e", for backward compatibility.
Sorry about the earlier confusion.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-01-29 16:27 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-01-29 4:35 [RFC] The design of new pathspec features Duy Nguyen
2013-01-29 5:05 ` Junio C Hamano
2013-01-29 5:27 ` Duy Nguyen
2013-01-29 5:07 ` Junio C Hamano
2013-01-29 5:33 ` Junio C Hamano
2013-01-29 6:13 ` Duy Nguyen
2013-01-29 6:31 ` Junio C Hamano
2013-01-29 12:17 ` Duy Nguyen
2013-01-29 16:20 ` Junio C Hamano
2013-01-29 16:27 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).