Git development
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Stefan Beller <sbeller@google.com>
Cc: pclouds@gmail.com, git@vger.kernel.org
Subject: Re: [RFC-PATCHv6 4/4] pathspec: allow querying for attributes
Date: Mon, 16 May 2016 22:03:11 -0700	[thread overview]
Message-ID: <xmqqvb2dxomo.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20160517031353.23707-5-sbeller@google.com> (Stefan Beller's message of "Mon, 16 May 2016 20:13:53 -0700")

Stefan Beller <sbeller@google.com> writes:

> +static struct git_attr_check *check;
> +static int match_attrs(const char *name, int namelen,
> +		       const struct pathspec_item *item)
> +{
> +	char *path;
> +	int i;
> +
> +	if (!check) {
> +		check = git_attr_check_alloc();
> +		for (i = 0; i < item->attr_nr; i++)
> +			git_attr_check_append(check, item->attrs[i].attr);

This is simply wrong; you may have two pathspec elements with
attribute match magic, the first one may ask for one attribute while
the second one may ask for seven.  The first time around you
allocate and append one attribute.  The second time around you don't
do anything useful, and send a git_attr_check with one element to
deal with 7 attributes.

> +	}
> +	path = xmemdupz(name, namelen);
> +	git_all_attrs(path, check);

However, the above "This is simply wrong" bogosity is covered
because git_all_attrs() is used here, ignoring what is in check.

The loop we see above is an expensive no-op, as the first thing
all_attrs() does is to empty check() and instead stuff it with every
attribute under the sun, not necessarily limited to attributes in
item->attrs[].

By the way, do not call an array as plural.  item->attr[i] is a good
name to call a single ith element in an array.  item->attrs[i] isn't.

> +	for (i = 0; i < item->attr_nr; i++) {
> +		int matched;
> +		const char *value = check->check[i].value;

check[i] has no relevance to item->attrs[i] here.  I do not think
the code after this point is computing anything sensible.

> diff --git a/pathspec.h b/pathspec.h
> index 0c11262..89d73db 100644
> --- a/pathspec.h
> +++ b/pathspec.h
> @@ -32,6 +32,21 @@ struct pathspec {
>  		int len, prefix;
>  		int nowildcard_len;
>  		int flags;
> +		int attr_nr;
> +		int attr_alloc;
> +		struct attr_item {
> +			char *attr;
> +			char *value;
> +			enum attr_match_mode {
> +				NOT_INIT,
> +				MATCH_SET,
> +				MATCH_UNSET,
> +				MATCH_VALUE,
> +				MATCH_UNSPECIFIED,
> +				MATCH_NOT_UNSPECIFIED,
> +				INVALID_ATTR
> +			} mode;
> +		} *attrs;

I'd think the above addition that is in line with the updated API
would look more like this [*1*]:

	int attr_match_nr;
        int attr_match_alloc;
        struct attr_match {
        	struct git_attr *attr;
                char *value;
                enum attr_match_mode {
                	...
		} match_mode;
	} *attr_match;
	struct git_attr_check *attr_check;

Then while parsing ":(attr:VAR1=VAL1 -VAR2 VAR3...)path/to/dir/",
you would first do:

	p->attr_check = git_attr_check_alloc();

once, and then send each of VAR1=VAL2, -VAR2, VAR3... to your
parse_one_item() helper function which would:

 * parse the match-mode like your code does;

 * parse out the attribute name (i.e. VAR1, VAR2 and VAR3), and
   instead of keeping it as a "(const) char *", call git_attr()
   to intern it (and keep it in local variable "attr"), and save
   it in p->attr_match[p->attr_nr].attr;

 * call git_attr_check_append(p->attr_check, git_attr_name(attr))

After the above finishes, you would end up with something like:

        .attr_match = {
            { .attr = git_attr("VAR1"), .value = "VAL2",
              .match_mode = MATCH_VALUE },
            { .attr = git_attr("VAR2"), .value = <does not matter>,
              .match_mode = MATCH_UNSET },
            ...
	},
        .attr_check = {
	    .check = {
            	{ .attr = git_attr("VAR1"), .value = <does not matter> },
            	{ .attr = git_attr("VAR2"), .value = <does not matter> },
            	{ .attr = git_attr("VAR3"), .value = <does not matter> },
            }
	    
When matching (i.e. the match_attrs() function), you would instead
do

	path = xmemdupz(name, namelen);
	git_check_attr(path, item->attr_check);

to grab values for only attributes that matter to you, instead of
calling git_all_attrs() [*2*].

After git_check_attr() returns, item->attr_check.check[0].attr would
be git_attr("VAR1") and item->attr_check.check[0].value would be
whatever setting the path has for the VAR1 attribute.  You can use
your match_mode logic to compare it with the values .attr_match
expects.

You do not necessarily have to have the same number of elements in
.attr_match and .attr_check.check by the way.  .attr_match might say

	VAR1=VAL2 !VAR1 -VAR1

which may be always false if these are ANDed together, but in order
to evaluate it, you need only one git_attr_check_elem for VAR1.


[Footnote]

*1* With the old API, things would not be that much different.
    Instead of single structure .attr_check, you would make an
    array of git_attr_check structure, exactly like the array
    at .attr_check.check[] in the new API by hand.  The new API
    makes this preparation simpler by managing the array on the API
    implementation side.

*2* Please do not use that silly function especially in performance
    sensitive codepath.

  parent reply	other threads:[~2016-05-17  5:03 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-17  3:13 [RFC-PATCHv6 0/4] pathspec attrs [WAS pathspec labels [WAS submodule groups]] Stefan Beller
2016-05-17  3:13 ` [RFC-PATCHv6 1/4] Documentation: fix a typo Stefan Beller
2016-05-17  3:13 ` [RFC-PATCHv6 2/4] pathspec: move long magic parsing out of prefix_pathspec Stefan Beller
2016-05-17  3:13 ` [RFC-PATCHv6 3/4] pathspec: move prefix check out of the inner loop Stefan Beller
2016-05-17  3:13 ` [RFC-PATCHv6 4/4] pathspec: allow querying for attributes Stefan Beller
2016-05-17  4:23   ` Junio C Hamano
2016-05-17 16:45     ` Stefan Beller
2016-05-17  5:03   ` Junio C Hamano [this message]
2016-05-17 17:03     ` Stefan Beller
2016-05-17 17:34       ` Junio C Hamano
2016-05-17 17:45         ` Stefan Beller
2016-05-17 18:05           ` Junio C Hamano
2016-05-17 18:10             ` Stefan Beller
2016-05-17 18:29               ` Junio C Hamano
2016-05-17 19:23     ` Stefan Beller
2016-05-17 20:25       ` Junio C Hamano
2016-05-18 15:39         ` Junio C Hamano
2016-05-17  4:14 ` [RFC-PATCHv6 0/4] pathspec attrs [WAS pathspec labels [WAS submodule groups]] Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqvb2dxomo.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox