From: Stefan Beller <sbeller@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Duy Nguyen <pclouds@gmail.com>,
"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: [PATCHv7 5/5] pathspec: allow querying for attributes
Date: Wed, 18 May 2016 15:31:45 -0700 [thread overview]
Message-ID: <CAGZ79ka0Of6S3AGx24jrvso=AwMsxQOnWVFZA-XWy3590JbVgA@mail.gmail.com> (raw)
In-Reply-To: <xmqq7ferrvvd.fsf@gitster.mtv.corp.google.com>
On Wed, May 18, 2016 at 12:47 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> + - "`+`" the attribute must be set
>> + - "`-`" the attribute must be unset
>> + - "`~`" the attribute must be unspecified
Well the '-' is in line, at least.
So you're proposing empty string for set?
gitattributes documentation:
> Sometimes you would need to override an setting of an attribute for a
> path to Unspecified state. This can be done by listing the name of the
> attribute prefixed with an exclamation point !.
So the unspecified should be '!' to match Git consistency.
[I dislike the exclamation mark because of bash.
git ls-files ":(attr:!text)"
bash: !text: event not found
git ls-files ":(attr:\!text)"
fatal: attr spec '\!text': attrs must not start with '-' and be
composed of [-A-Za-z0-9_.].
# we error out because of the \ in there
git ls-files :\(attr:\!text\)
...
# that actually works.
]
>
> I think using these three, not the way how .gitattributes specify
> them, is highly confusing to the users.
Ok I'll align those.
>
>> + - "`?`" the attribute must not be unspecified, i.e. set, unset or value matches
>
> The choice of '?' is OK, as there is no reason .gitattributes wants
> to say something fuzzy like "this is set or unset or has value".
> The last part "set, unset or value matches" does not make sense,
> though. Did you mean "set, unset or set to value"?
Actually I meant "this is set or unset or has [any] value".
>
>> + - an empty "`[mode]`" matches if the attribute is set or has a value
>
> The same comment as +/-/~ above applies.
So consistency would ask for
- an empty "`[mode]`" matches if the attribute is set
- "`-`" the attribute must be unset
- "`!`" the attribute must be unspecified
- an empty "`[mode]`" combined with "`[=value]`" matches if the attribute has
the given value.
and those are not yet codified, but for discussion then:
- "`?`" the attribute must not be unspecified, i.e. set, unset or has any value
- "`+`" the attribute must be set or has any value
>> +
>> + match_mode = item->attr_match[i].match_mode;
>
> Mental note: there is a one-to-one correspondence between
> attr_check->check[] and attr_match[].
Right. I guess I should add a comment for that. Or better yet a test
with some weird pathspec making use of that
git ls-files :(attr:labelA !labelB +labelA)
That way we would see if there will be confusion between the modes
and attr names in the future.
>
>> + if (!matched)
>> + return 0;
>
> So this ANDs together. OK.
Sure, (it's clear to me, maybe I should document that as well),
because multiple pathspecs are OR.
git ls-files :(attr:labelA) :(attr:labelB)
>
> Mental note after skiming the caller:
>
> The "value" here is like "VAR1=VAL1 VAR2 !VAR3" in a pathspec
> ":(attr:VAR1=VAL1 VAR2 !VAR3)path/to/limit", i.e. the source string
> to be compiled into a list of conditionals that will be evaluated by
> match_attr() in dir.c
Right.
>> + string_list_split(&list, value, ' ', -1);
>> + string_list_remove_empty_items(&list, 0);
Right.
>
> At this point, we got "VAR1=VAL1", "VAR2", "!VAR3" as elements in
> this list.
Right.
>
>> + if (!item->attr_check)
>> + item->attr_check = git_attr_check_alloc();
>
> Given more than one "attr" magic, e.g. ":(attr:A=a,attr:B)/path",
> the check may not be empty when we process the second one; we just
> extend it without losing the existing contents.
That is why I am not super happy with it though.
":(attr:A=a,attr:B)/path",
":(attr:A=a B)/path",
are the same for the user as well as in the internal data structures.
This "wastes" the white space as a possible convenient separator
character, e.g. for multiple values. On the other hand it will be easier
to type, specially for many attrs (read submodule groups).
>
>> + ALLOC_GROW(item->attr_match, item->attr_match_nr + list.nr, item->attr_match_alloc);
>
> Likewise, we extend attr_match[] array.
>> +
>> + val_len = strcspn(val, "=,)");
>
> I understand "=", but can "," and ")" appear here?
This was overly caution from some intermediate state, where the caller
handed in more than required.
if (skip_prefix(copyfrom, "attr:", &body)) {
char *pass = xmemdupz(body, len - strlen("attr:"));
parse_pathspec_attr_match(item, pass);
free(pass);
continue;
}
When given ":(attr:VAR1=VAL1 VAR2 !VAR3)path/to/limit"
this only passes in "VAR1=VAL1 VAR2 !VAR3".
I contemplated write the caller as
if (skip_prefix(copyfrom, "attr:", &body)) {
parse_pathspec_attr_match(item, body);
continue;
}
but that would include further characters, i.e. ',' or ')'.
(but not a lot more)
I wanted to pass in an uncounted string though, as
string_list_separate doesn't offer a variant for counted
strings.
>
>> + if (val[val_len] == '=') {
>> + am->match_mode = MATCH_VALUE;
>> + am->value = xstrdup(&val[val_len + 1]);
>> + /*
>> + * NEEDSWORK:
>> + * Do we want to allow escaped commas to search
>> + * for comma separated values?
>> + */
>
> No, we don't ;-).
ok will drop comment, ...
>
>> + if (strchr(am->value, '\\'))
>> + die(_("attr spec values must not contain backslashes"));
>
> But this is a good escape hatch to reserve for our own use, until we
> decide what the quoting mechanism will be (or if it is necessary).
but keep the code here
>
>> + } else
>> + am->value = NULL;
>> +
>> + if (invalid_attr_name(val, val_len)) {
>> + am->match_mode = INVALID_ATTR;
>> + goto err;
>> + }
>> +
>> + am->attr = git_attr(xmemdupz(val, val_len));
>> + git_attr_check_append(item->attr_check, am->attr);
>
> GOOD!
except for the memory leak of xmemdupz. git_attr makes another
internal copy with FLEX_ALLOC_MEM. That ill be fixed once we use
git_attr_counted though.
>
> I think val and val_len should be renamed to attr and attr_len (in
> the VARIABLE=VALUE context, these two identifiers are about parsing
> the variable side, not the value side).
ok
>
>> + }
>> +
>> + string_list_clear(&list, 0);
>> + return;
>> +err:
>> + die(_("attr spec '%s': attrs must not start with '-' and "
>> + "be composed of [-A-Za-z0-9_.]."), value);
>
> What is "value" at this point? If you failed to parse the second
> element in "VAR1=VAL1 FR*TZ=VAL2 !VAR3", I think you would want to
> say "'FR*TZ' is malformed".
>
> Existing users of the function seems to say this:
>
> if (invalid_attr_name(cp, len)) {
> fprintf(stderr,
> "%.*s is not a valid attribute name: %s:%d\n",
> len, cp, src, lineno);
> return NULL;
> }
>
> when parsing .gitattribute file. The source file:line reference
> does not apply to this context, but it would be better to unify the
> message. I do not mind spelling the rules out explicitly like you
> did, but I do not want to see it spread across many places (which
> forces us to update them when we have to change the rule later).
>
> Perhaps we want a helper function in attr.c side that takes the
> attribute name string (val, val_len in your code above, which I
> suggest to be renamed to attr, attr_len) and formats the error
> message into a strbuf, or something?
That's why you added the reporting function to attrs, I see.
>
> Makes me wonder what "pass" stands for. From the look of xmemdupz(),
> given ":(attr:VAR1=VAL1 VAR2 !VAR3)path/to/limit" to parse, len
> points at ")" and xmemdupz() gives "VAR1=VAL1 VAR2 !VAR3" to it.
"what we _pass_ to the parse_pathspec_attr_match"... Maybe attr_body?
Thanks,
Stefan
next prev parent reply other threads:[~2016-05-18 22:31 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-18 19:02 [PATCHv7 0/5] pathspec magic extension to search for attributes Stefan Beller
2016-05-18 19:02 ` [PATCHv7 1/5] string list: improve comment Stefan Beller
2016-05-18 19:02 ` [PATCHv7 2/5] Documentation: fix a typo Stefan Beller
2016-05-18 19:02 ` [PATCHv7 3/5] pathspec: move long magic parsing out of prefix_pathspec Stefan Beller
2016-05-18 19:02 ` [PATCHv7 4/5] pathspec: move prefix check out of the inner loop Stefan Beller
2016-05-18 19:02 ` [PATCHv7 5/5] pathspec: allow querying for attributes Stefan Beller
2016-05-18 19:42 ` Stefan Beller
2016-05-18 19:47 ` Junio C Hamano
2016-05-18 20:00 ` Junio C Hamano
2016-05-18 20:21 ` Junio C Hamano
2016-05-18 21:01 ` [PATCH] attr: add counted string version of git_check_attr() Junio C Hamano
2016-05-18 21:03 ` [PATCH] attr: add counted string version of git_attr() Junio C Hamano
2016-05-18 21:36 ` Stefan Beller
2016-05-18 21:05 ` [PATCH] attr: expose validity check for attribute names Junio C Hamano
2016-05-18 21:42 ` Stefan Beller
2016-05-18 22:31 ` Stefan Beller [this message]
2016-05-18 22:59 ` [PATCHv7 5/5] pathspec: allow querying for attributes Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAGZ79ka0Of6S3AGx24jrvso=AwMsxQOnWVFZA-XWy3590JbVgA@mail.gmail.com' \
--to=sbeller@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).