git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-10-11 20:20 [PATCH 1/1] add: enable attr pathspec magic Joanna Wang
@ 2023-11-02  1:55 ` Joanna Wang
  2023-11-02  5:00   ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Joanna Wang @ 2023-11-02  1:55 UTC (permalink / raw)
  To: jojwang; +Cc: git, gitster

This lets users limit files or exclude files based on file
attributes during git-add and git-stash.
For example, the chromium project would like to use this like
"git add --all ':(exclude,attr:submodule)'", as submodules are managed in a
unique way and often results in submodule changes that users do not want in
their commits.

This does not change any attr magic implementation. It is only adding
attr as an allowed pathspec in git-add and git-stash, which was previously
blocked by GUARD_PATHSPEC and a pathspec mask in parse_pathspec()).
However, this does fix a bug in prefix_magic() where attr values were unintentionally removed.
This was hit whenever parse_pathspec() is called with PATHSPEC_PREFIX_ORIGIN as a flag,
which was the case for git-stash. More details here:
https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/

It is possible that when the attr pathspec feature was first added in
b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few GUARD_PATHSPEC() invocations.

Later, to get a more user-friendly error message when attr was used with git-add,
PATHSPEC_ATTR was added as a mask to git-add's invocation of parse_pathspec()
84d938b732 (add: do not accept pathspec magic 'attr', 2018-09-18).
However, this user-friendly error message was never added for git-stash.

Signed-off-by: Joanna Wang <jojwang@google.com>

---

PTAL.
I've updated this to include the fix for git-stash. I was initially going to fix this bug [1]
separately, but I thought it would make more sense to put everything in one patch so attr
could be enabled for both git-add and git-stash and tested.

[1] https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/



 builtin/add.c                  |   7 ++-
 dir.c                          |   3 +-
 pathspec.c                     |  11 +++-
 t/t6135-pathspec-with-attrs.sh | 108 +++++++++++++++++++++++++++++++--
 4 files changed, 118 insertions(+), 11 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5126d2ede3..d46e4d10e9 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 	 * Check the "pathspec '%s' did not match any files" block
 	 * below before enabling new magic.
 	 */
-	parse_pathspec(&pathspec, PATHSPEC_ATTR,
+	parse_pathspec(&pathspec, 0,
 		       PATHSPEC_PREFER_FULL |
 		       PATHSPEC_SYMLINK_LEADING_PATH,
 		       prefix, argv);
@@ -433,7 +433,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
 
-		parse_pathspec_file(&pathspec, PATHSPEC_ATTR,
+		parse_pathspec_file(&pathspec, 0,
 				    PATHSPEC_PREFER_FULL |
 				    PATHSPEC_SYMLINK_LEADING_PATH,
 				    prefix, pathspec_from_file, pathspec_file_nul);
@@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			       PATHSPEC_LITERAL |
 			       PATHSPEC_GLOB |
 			       PATHSPEC_ICASE |
-			       PATHSPEC_EXCLUDE);
+			       PATHSPEC_EXCLUDE |
+			       PATHSPEC_ATTR);
 
 		for (i = 0; i < pathspec.nr; i++) {
 			const char *path = pathspec.items[i].match;
diff --git a/dir.c b/dir.c
index 16fdb03f2a..4d1cd039be 100644
--- a/dir.c
+++ b/dir.c
@@ -2179,7 +2179,8 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
 		       PATHSPEC_LITERAL |
 		       PATHSPEC_GLOB |
 		       PATHSPEC_ICASE |
-		       PATHSPEC_EXCLUDE);
+		       PATHSPEC_EXCLUDE |
+		       PATHSPEC_ATTR);
 
 	for (i = 0; i < pathspec->nr; i++) {
 		const struct pathspec_item *item = &pathspec->items[i];
diff --git a/pathspec.c b/pathspec.c
index bb1efe1f39..40bd8a8819 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -109,7 +109,7 @@ static struct pathspec_magic {
 	{ PATHSPEC_ATTR,    '\0', "attr" },
 };
 
-static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
+static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic, const char *element)
 {
 	int i;
 	strbuf_addstr(sb, ":(");
@@ -118,6 +118,13 @@ static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
 			if (sb->buf[sb->len - 1] != '(')
 				strbuf_addch(sb, ',');
 			strbuf_addstr(sb, pathspec_magic[i].name);
+			if (pathspec_magic[i].bit & PATHSPEC_ATTR) {
+				/* extract and insert the attr magic value */
+				char* p = strstr(element, "attr:");
+				char buff[128];
+				sscanf(p, "attr%[^)|,]", buff);
+				strbuf_addstr(sb, buff);
+			}
 		}
 	strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
@@ -493,7 +500,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
 		struct strbuf sb = STRBUF_INIT;
 
 		/* Preserve the actual prefix length of each pattern */
-		prefix_magic(&sb, prefixlen, element_magic);
+		prefix_magic(&sb, prefixlen, element_magic, elt);
 
 		strbuf_addstr(&sb, match);
 		item->original = strbuf_detach(&sb, NULL);
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index f70c395e75..e46fa176ed 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,12 +64,24 @@ test_expect_success 'setup .gitattributes' '
 	fileSetLabel label
 	fileValue label=foo
 	fileWrongLabel label☺
+	newFileA* labelA
+	newFileB* labelB
 	EOF
 	echo fileSetLabel label1 >sub/.gitattributes &&
 	git add .gitattributes sub/.gitattributes &&
 	git commit -m "add attributes"
 '
 
+test_expect_success 'setup .gitignore' '
+	cat <<-\EOF >.gitignore &&
+	actual
+	expect
+	pathspec_file
+	EOF
+	git add .gitignore &&
+	git commit -m "add gitignore"
+'
+
 test_expect_success 'check specific set attr' '
 	cat <<-\EOF >expect &&
 	fileSetLabel
@@ -150,6 +162,7 @@ test_expect_success 'check specific value attr (2)' '
 test_expect_success 'check unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileA
 	fileAB
 	fileAC
@@ -175,6 +188,7 @@ test_expect_success 'check unspecified attr' '
 test_expect_success 'check unspecified attr (2)' '
 	cat <<-\EOF >expect &&
 	HEAD:.gitattributes
+	HEAD:.gitignore
 	HEAD:fileA
 	HEAD:fileAB
 	HEAD:fileAC
@@ -200,6 +214,7 @@ test_expect_success 'check unspecified attr (2)' '
 test_expect_success 'check multiple unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileC
 	fileNoLabel
 	fileWrongLabel
@@ -239,16 +254,99 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
 	test_i18ngrep "Only one" actual
 '
 
-test_expect_success 'fail if attr magic is used places not implemented' '
+test_expect_success 'fail if attr magic is used in places not implemented' '
 	# The main purpose of this test is to check that we actually fail
 	# when you attempt to use attr magic in commands that do not implement
-	# attr magic. This test does not advocate git-add to stay that way,
-	# though, but git-add is convenient as it has its own internal pathspec
-	# parsing.
-	test_must_fail git add ":(attr:labelB)" 2>actual &&
+	# attr magic. This test does not advocate check-ignore to stay that way.
+	# When you teach the command to grok the pathspec, you need to find
+	# another command to replace it for the test.
+	test_must_fail git check-ignore ":(attr:labelB)" 2>actual &&
 	test_i18ngrep "magic not supported" actual
 '
 
+test_expect_success 'check that attr magic works for git stash push' '
+	cat <<-\EOF >expect &&
+	A	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git stash push --include-untracked -- ":(exclude,attr:labelB)" &&
+	git stash show --include-untracked --name-status >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --all' '
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached >actual &&
+	git restore -W -S . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add -u' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add -u ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached  >actual &&
+	git restore -S -W . && rm sub/new* &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add <path>' '
+	cat <<-\EOF >expect &&
+	fileA
+	fileB
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add ":(exclude,attr:labelB)sub/*" &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git -add .' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	cd sub &&
+	git add . ":(exclude,attr:labelB)" &&
+	cd .. &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --pathspec-from-file' '
+	cat <<-\EOF >pathspec_file &&
+	:(exclude,attr:labelB)
+	EOF
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all --pathspec-from-file=pathspec_file &&
+	git diff --name-only --cached >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'abort on giving invalid label on the command line' '
 	test_must_fail git ls-files . ":(attr:☺)"
 '
-- 
2.42.0.820.g83a721a137-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-02  1:55 ` [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash Joanna Wang
@ 2023-11-02  5:00   ` Junio C Hamano
  2023-11-02 17:53     ` Joanna Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2023-11-02  5:00 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git

Joanna Wang <jojwang@google.com> writes:

> I've updated this to include the fix for git-stash. I was
> initially going to fix this bug [1] separately, but I thought it
> would make more sense to put everything in one patch so attr could
> be enabled for both git-add and git-stash and tested.

I think that is perfectly fine, as long as it is described well in
the proposed log message what is done in the patch, and how two
seemingly different things done in the patch are so closely linked
that it makes sense to do so in a single patch.

Will queue.

Thanks.

> diff --git a/pathspec.c b/pathspec.c
> index bb1efe1f39..40bd8a8819 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -109,7 +109,7 @@ static struct pathspec_magic {
>  	{ PATHSPEC_ATTR,    '\0', "attr" },
>  };
>  
> -static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
> +static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic, const char *element)
>  {
>  	int i;
>  	strbuf_addstr(sb, ":(");
> @@ -118,6 +118,13 @@ static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
>  			if (sb->buf[sb->len - 1] != '(')
>  				strbuf_addch(sb, ',');
>  			strbuf_addstr(sb, pathspec_magic[i].name);
> +			if (pathspec_magic[i].bit & PATHSPEC_ATTR) {
> +				/* extract and insert the attr magic value */
> +				char* p = strstr(element, "attr:");

In our codebase, the asterisk sticks to the variable, not to the
type, i.e.

				char *p = strstr(element, "attr:");

> +				char buff[128];

Will this fixed-size buffer make an inviting target for script
kiddies, as pathspec often come from the command line and under
control by whoever is making a request?

Aren't we going to copy what is in the elt literally from where
attr: appears up to the next delimiter like ',', ')'?  I am not sure
if we need a separate buffer.  Would something along this line work?

				strbuf_add(sb, p, strcspn(p, ",)");

I am unsure if this "unparsing" is the way we want to go.  For one,
just like %(attr:foo) can take an arbitrary end-user supplied string
in the "foo" part, we can have new pathspec magic in the future that
will take an arbitrary end-user supplied string as its value.  And the
above unparsing code will be utterly confused when that value
happens to contain "attr:" as its substring, e.g., 

    $ git add . ":(exclude,frotz:diattr:0,attr:submodule)"

Also, do we (and if not right now, then do we want to) support
giving more than one attribute?

    $ git add ":(attr:text,attr:small)*.py"

Not supporting multiple attributes would be OK for now, but when it
becomes needed, the "unparse using the bits in the element_magic"
does not look like the right approach.

Indeed, if you are going to pass the original "elt" string *anyway*,
you have all the info in there that you need.  I wonder if it makes
sense to get rid of the "unsigned magic" bitset from the parameter,
and discard the loop over the pathspec_magic[] array and do
something like:

    - if the original "elt" already has some magic (i.e., begins
      with ":(" and not in the literal mode), then copy them
      literally and textually from ":(" up to the closing ")", but
      also insert the necessary "prefix:<num>" magic;

    - otherwise, give ":(prefix:<num>)" magic.

without even attempting to unparse the bits in "element_magic"?

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-02  5:00   ` Junio C Hamano
@ 2023-11-02 17:53     ` Joanna Wang
  2023-11-02 21:32       ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Joanna Wang @ 2023-11-02 17:53 UTC (permalink / raw)
  To: gitster; +Cc: git, jojwang

This lets users limit files or exclude files based on file
attributes during git-add and git-stash.
For example, the chromium project would like to use this like
"git add --all ':(exclude,attr:submodule)'", as submodules are managed in a
unique way and often results in submodule changes that users do not want in
their commits.

This does not change any attr magic implementation. It is only adding
attr as an allowed pathspec in git-add and git-stash, which was previously
blocked by GUARD_PATHSPEC and a pathspec mask in parse_pathspec()).
However, this does fix a bug in prefix_magic() where attr values were unintentionally removed.
This was hit whenever parse_pathspec() is called with PATHSPEC_PREFIX_ORIGIN as a flag,
which was the case for git-stash. More details here:
https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/

It is possible that when the attr pathspec feature was first added in
b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few GUARD_PATHSPEC() invocations.

Later, to get a more user-friendly error message when attr was used with git-add,
PATHSPEC_ATTR was added as a mask to git-add's invocation of parse_pathspec()
84d938b732 (add: do not accept pathspec magic 'attr', 2018-09-18).
However, this user-friendly error message was never added for git-stash.

Signed-off-by: Joanna Wang <jojwang@google.com>

---

> Indeed, if you are going to pass the original "elt" string *anyway*,
> you have all the info in there that you need.  I wonder if it makes
> sense to get rid of the "unsigned magic" bitset from the parameter,
This was my initial strategy but ran into trouble when the magic was
in shorthand form. Upon closer look at how the shorthand works
(e.g. shorthand and longhand can never mix so
':!/(attr:chicken)file' would make <(attr:chicken)file> the match string)
I tried this again by processing the forms separately.
It would still need both the element and element_magic, but I think it
addresses the concerns with future changes where multiple magic match
values could be allowed and the match values could be any string. These
changes would be fine as long as there is no overlap between magic that
takes a user-supplied value and magic that can be expressed in shorthand.


 builtin/add.c                  |   7 ++-
 dir.c                          |   3 +-
 pathspec.c                     |  25 +++++---
 t/t6135-pathspec-with-attrs.sh | 108 +++++++++++++++++++++++++++++++--
 4 files changed, 125 insertions(+), 18 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5126d2ede3..d46e4d10e9 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 	 * Check the "pathspec '%s' did not match any files" block
 	 * below before enabling new magic.
 	 */
-	parse_pathspec(&pathspec, PATHSPEC_ATTR,
+	parse_pathspec(&pathspec, 0,
 		       PATHSPEC_PREFER_FULL |
 		       PATHSPEC_SYMLINK_LEADING_PATH,
 		       prefix, argv);
@@ -433,7 +433,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
 
-		parse_pathspec_file(&pathspec, PATHSPEC_ATTR,
+		parse_pathspec_file(&pathspec, 0,
 				    PATHSPEC_PREFER_FULL |
 				    PATHSPEC_SYMLINK_LEADING_PATH,
 				    prefix, pathspec_from_file, pathspec_file_nul);
@@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			       PATHSPEC_LITERAL |
 			       PATHSPEC_GLOB |
 			       PATHSPEC_ICASE |
-			       PATHSPEC_EXCLUDE);
+			       PATHSPEC_EXCLUDE |
+			       PATHSPEC_ATTR);
 
 		for (i = 0; i < pathspec.nr; i++) {
 			const char *path = pathspec.items[i].match;
diff --git a/dir.c b/dir.c
index 16fdb03f2a..4d1cd039be 100644
--- a/dir.c
+++ b/dir.c
@@ -2179,7 +2179,8 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
 		       PATHSPEC_LITERAL |
 		       PATHSPEC_GLOB |
 		       PATHSPEC_ICASE |
-		       PATHSPEC_EXCLUDE);
+		       PATHSPEC_EXCLUDE |
+		       PATHSPEC_ATTR);
 
 	for (i = 0; i < pathspec->nr; i++) {
 		const struct pathspec_item *item = &pathspec->items[i];
diff --git a/pathspec.c b/pathspec.c
index bb1efe1f39..588a2cde4d 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -109,16 +109,23 @@ static struct pathspec_magic {
 	{ PATHSPEC_ATTR,    '\0', "attr" },
 };
 
-static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
+static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic, const char *element)
 {
-	int i;
-	strbuf_addstr(sb, ":(");
-	for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
-		if (magic & pathspec_magic[i].bit) {
-			if (sb->buf[sb->len - 1] != '(')
-				strbuf_addch(sb, ',');
-			strbuf_addstr(sb, pathspec_magic[i].name);
+	if (element[1] != '(') {
+		/* Process an element in shorthand form (e.g. ":!/<match>") */
+		strbuf_addstr(sb, ":(");
+		for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+			if ((magic & pathspec_magic[i].bit) && (pathspec_magic[i].mnemonic != '\0')) {
+				if (sb->buf[sb->len - 1] != '(')
+					strbuf_addch(sb, ',');
+				strbuf_addstr(sb, pathspec_magic[i].name);
+			}
 		}
+	} else {
+		/* For an element in longhand form, we simply copy everything up to the final ')' */
+		int len = strchr(element, ')') - element;
+		strbuf_add(sb, element, len);
+	}
 	strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
@@ -493,7 +500,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
 		struct strbuf sb = STRBUF_INIT;
 
 		/* Preserve the actual prefix length of each pattern */
-		prefix_magic(&sb, prefixlen, element_magic);
+		prefix_magic(&sb, prefixlen, element_magic, elt);
 
 		strbuf_addstr(&sb, match);
 		item->original = strbuf_detach(&sb, NULL);
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index f70c395e75..e46fa176ed 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,12 +64,24 @@ test_expect_success 'setup .gitattributes' '
 	fileSetLabel label
 	fileValue label=foo
 	fileWrongLabel label☺
+	newFileA* labelA
+	newFileB* labelB
 	EOF
 	echo fileSetLabel label1 >sub/.gitattributes &&
 	git add .gitattributes sub/.gitattributes &&
 	git commit -m "add attributes"
 '
 
+test_expect_success 'setup .gitignore' '
+	cat <<-\EOF >.gitignore &&
+	actual
+	expect
+	pathspec_file
+	EOF
+	git add .gitignore &&
+	git commit -m "add gitignore"
+'
+
 test_expect_success 'check specific set attr' '
 	cat <<-\EOF >expect &&
 	fileSetLabel
@@ -150,6 +162,7 @@ test_expect_success 'check specific value attr (2)' '
 test_expect_success 'check unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileA
 	fileAB
 	fileAC
@@ -175,6 +188,7 @@ test_expect_success 'check unspecified attr' '
 test_expect_success 'check unspecified attr (2)' '
 	cat <<-\EOF >expect &&
 	HEAD:.gitattributes
+	HEAD:.gitignore
 	HEAD:fileA
 	HEAD:fileAB
 	HEAD:fileAC
@@ -200,6 +214,7 @@ test_expect_success 'check unspecified attr (2)' '
 test_expect_success 'check multiple unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileC
 	fileNoLabel
 	fileWrongLabel
@@ -239,16 +254,99 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
 	test_i18ngrep "Only one" actual
 '
 
-test_expect_success 'fail if attr magic is used places not implemented' '
+test_expect_success 'fail if attr magic is used in places not implemented' '
 	# The main purpose of this test is to check that we actually fail
 	# when you attempt to use attr magic in commands that do not implement
-	# attr magic. This test does not advocate git-add to stay that way,
-	# though, but git-add is convenient as it has its own internal pathspec
-	# parsing.
-	test_must_fail git add ":(attr:labelB)" 2>actual &&
+	# attr magic. This test does not advocate check-ignore to stay that way.
+	# When you teach the command to grok the pathspec, you need to find
+	# another command to replace it for the test.
+	test_must_fail git check-ignore ":(attr:labelB)" 2>actual &&
 	test_i18ngrep "magic not supported" actual
 '
 
+test_expect_success 'check that attr magic works for git stash push' '
+	cat <<-\EOF >expect &&
+	A	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git stash push --include-untracked -- ":(exclude,attr:labelB)" &&
+	git stash show --include-untracked --name-status >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --all' '
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached >actual &&
+	git restore -W -S . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add -u' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add -u ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached  >actual &&
+	git restore -S -W . && rm sub/new* &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add <path>' '
+	cat <<-\EOF >expect &&
+	fileA
+	fileB
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add ":(exclude,attr:labelB)sub/*" &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git -add .' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	cd sub &&
+	git add . ":(exclude,attr:labelB)" &&
+	cd .. &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --pathspec-from-file' '
+	cat <<-\EOF >pathspec_file &&
+	:(exclude,attr:labelB)
+	EOF
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all --pathspec-from-file=pathspec_file &&
+	git diff --name-only --cached >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'abort on giving invalid label on the command line' '
 	test_must_fail git ls-files . ":(attr:☺)"
 '
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-02 17:53     ` Joanna Wang
@ 2023-11-02 21:32       ` Junio C Hamano
  2023-11-02 23:45         ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2023-11-02 21:32 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git

Joanna Wang <jojwang@google.com> writes:

>> Indeed, if you are going to pass the original "elt" string *anyway*,
>> you have all the info in there that you need.  I wonder if it makes
>> sense to get rid of the "unsigned magic" bitset from the parameter,
> This was my initial strategy but ran into trouble when the magic was
> in shorthand form. Upon closer look at how the shorthand works
> (e.g. shorthand and longhand can never mix so
> ':!/(attr:chicken)file' would make <(attr:chicken)file> the match string)
> I tried this again by processing the forms separately.
> It would still need both the element and element_magic, but I think it
> addresses the concerns with future changes where multiple magic match
> values could be allowed and the match values could be any string.

The "bits" were acceptable for things like "exclude" and "icase"
because it does not matter how many times you gave them and they do
not take any additional parameters, but attr is different in that it
takes a value, and multiple instances with different values can be
given.  It is lucky that we did not allow mixing the short and long
forms ;-)

> These changes would be fine as long as there is no overlap between
> magic that takes a user-supplied value and magic that can be
> expressed in shorthand.

Indeed.  Thanks for thinking this through.

> -static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
> +static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic, const char *element)
>  {
> -	int i;
> -	strbuf_addstr(sb, ":(");
> -	for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
> -		if (magic & pathspec_magic[i].bit) {
> -			if (sb->buf[sb->len - 1] != '(')
> -				strbuf_addch(sb, ',');
> -			strbuf_addstr(sb, pathspec_magic[i].name);

At this point in the code, is it guaranteed that element[0] is ':'
and never a NUL?  Also is it guaranteed that element has ')'
somewhere later if element[1] is '('?

"Otherwise the caller would have failed to parse the pathspec magic
into the magic bits, and this helper function would not have been
called" is the answer I am expecting, but I didn't check if the
caller refrains from calling us.  It would be better to have a brief
comment explaining why a seemingly loose parsing of element[] string
is OK to save future readers from wondering the same thing as I did
here.

> +	if (element[1] != '(') {
> +		/* Process an element in shorthand form (e.g. ":!/<match>") */
> +		strbuf_addstr(sb, ":(");
> +		for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
> +			if ((magic & pathspec_magic[i].bit) && (pathspec_magic[i].mnemonic != '\0')) {
> +				if (sb->buf[sb->len - 1] != '(')
> +					strbuf_addch(sb, ',');
> +				strbuf_addstr(sb, pathspec_magic[i].name);
> +			}
>  		}
> +	} else {
> +		/* For an element in longhand form, we simply copy everything up to the final ')' */

A comment that is a bit on the overly-long side.

> +		int len = strchr(element, ')') - element;
> +		strbuf_add(sb, element, len);
> +	}
>  	strbuf_addf(sb, ",prefix:%d)", prefixlen);
>  }

Come to think of it, this part of the change could stand on its own
as an independent bugfix.  I wonder if this existing bug caused by
failing to copy the value of "attr:" is triggerable from a codepath
that already allows PATHSPEC_ATTR magic.  Not absolutely required
when the rest of the patch is in reasonably close to the finish
line, but it would narrow the scope of the new feature proper to
treat this as a separate and independent fix, on which the new
fature depends on.

Thanks for working on fixing this rather old bug.  I think we should
have noticed when we added the support for the "attr" magic to the
pathspec API.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-02 21:32       ` Junio C Hamano
@ 2023-11-02 23:45         ` Junio C Hamano
  2023-11-03 14:35           ` Joanna Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2023-11-02 23:45 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

>> +	} else {
>> +		/* For an element in longhand form, we simply copy everything up to the final ')' */
>
> A comment that is a bit on the overly-long side.
>
>> +		int len = strchr(element, ')') - element;
>> +		strbuf_add(sb, element, len);

In practice, nobody sane would write a pathspec magic that is over
2GB in size, so this would not matter unless we are facing a
potential attacker, but as the third parameter strbuf_add() takes is
of type size_t, it would not hurt to define "len" as the same type
as well.

> Thanks for working on fixing this rather old bug.  I think we should
> have noticed when we added the support for the "attr" magic to the
> pathspec API.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-02 23:45         ` Junio C Hamano
@ 2023-11-03 14:35           ` Joanna Wang
  2023-11-03 15:31             ` Ruben Safir
  2023-11-03 16:34             ` Joanna Wang
  0 siblings, 2 replies; 13+ messages in thread
From: Joanna Wang @ 2023-11-03 14:35 UTC (permalink / raw)
  To: gitster; +Cc: git, jojwang

This lets users limit files or exclude files based on file
attributes during git-add and git-stash.
For example, the chromium project would like to use this like
"git add --all ':(exclude,attr:submodule)'", as submodules are managed in a
unique way and often results in submodule changes that users do not want in
their commits.

This does not change any attr magic implementation. It is only adding
attr as an allowed pathspec in git-add and git-stash, which was previously
blocked by GUARD_PATHSPEC and a pathspec mask in parse_pathspec()).
However, this does fix a bug in prefix_magic() where attr values were unintentionally removed.
This was hit whenever parse_pathspec() is called with PATHSPEC_PREFIX_ORIGIN as a flag,
which was the case for git-stash.
(But originally filed here:
https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/)

Furthermore, while other commands hit this code path it did not result in unexpected
behavior because this bug only impacts the pathspec->items->original field which is
NOT used to filter paths. However, git-stash does use pathspec->items->original when
building args used to call other git commands.
(See add_pathspecs() usage and implementation in stash.c)

It is possible that when the attr pathspec feature was first added in
b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few GUARD_PATHSPEC() invocations.

Later, to get a more user-friendly error message when attr was used with git-add,
PATHSPEC_ATTR was added as a mask to git-add's invocation of parse_pathspec()
84d938b732 (add: do not accept pathspec magic 'attr', 2018-09-18).
However, this user-friendly error message was never added for git-stash.

Signed-off-by: Joanna Wang <jojwang@google.com>

---

> At this point in the code, is it guaranteed that element[0] is ':'
> and never a NUL?  Also is it guaranteed that element has ')'
> somewhere later if element[1] is '('?
No sorry, we can only assume this if there was element magic detected
by parse_element_magic(). So if element magic is not 0, we know the body
is in the expected form (either long or short).
I have added comments and a check for magic to guard against this.

>  I wonder if this existing bug caused by
> failing to copy the value of "attr:" is triggerable from a codepath
> that already allows PATHSPEC_ATTR magic.
While there are other commands that go through the prefix_magic path (like `git-add -i`),
AFAICT they are not actually impacted by the bug. The bug only impacts the value of
pathspec->items->original which is not used during pathspec matching. But git-stash
uses `original` to pass in as args to other commands it calls. I've included this
detail in the description above.

> but as the third parameter strbuf_add() takes is of type size_t, it would not
> hurt to define "len" as the same type as well.
Thanks for spotting. fixed.

 builtin/add.c                  |   7 ++-
 dir.c                          |   3 +-
 pathspec.c                     |  36 ++++++++---
 t/t6135-pathspec-with-attrs.sh | 108 +++++++++++++++++++++++++++++++--
 4 files changed, 136 insertions(+), 18 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5126d2ede3..d46e4d10e9 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 	 * Check the "pathspec '%s' did not match any files" block
 	 * below before enabling new magic.
 	 */
-	parse_pathspec(&pathspec, PATHSPEC_ATTR,
+	parse_pathspec(&pathspec, 0,
 		       PATHSPEC_PREFER_FULL |
 		       PATHSPEC_SYMLINK_LEADING_PATH,
 		       prefix, argv);
@@ -433,7 +433,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
 
-		parse_pathspec_file(&pathspec, PATHSPEC_ATTR,
+		parse_pathspec_file(&pathspec, 0,
 				    PATHSPEC_PREFER_FULL |
 				    PATHSPEC_SYMLINK_LEADING_PATH,
 				    prefix, pathspec_from_file, pathspec_file_nul);
@@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			       PATHSPEC_LITERAL |
 			       PATHSPEC_GLOB |
 			       PATHSPEC_ICASE |
-			       PATHSPEC_EXCLUDE);
+			       PATHSPEC_EXCLUDE |
+			       PATHSPEC_ATTR);
 
 		for (i = 0; i < pathspec.nr; i++) {
 			const char *path = pathspec.items[i].match;
diff --git a/dir.c b/dir.c
index 16fdb03f2a..4d1cd039be 100644
--- a/dir.c
+++ b/dir.c
@@ -2179,7 +2179,8 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
 		       PATHSPEC_LITERAL |
 		       PATHSPEC_GLOB |
 		       PATHSPEC_ICASE |
-		       PATHSPEC_EXCLUDE);
+		       PATHSPEC_EXCLUDE |
+		       PATHSPEC_ATTR);
 
 	for (i = 0; i < pathspec->nr; i++) {
 		const struct pathspec_item *item = &pathspec->items[i];
diff --git a/pathspec.c b/pathspec.c
index bb1efe1f39..2f8721cc15 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -109,16 +109,34 @@ static struct pathspec_magic {
 	{ PATHSPEC_ATTR,    '\0', "attr" },
 };
 
-static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
+static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic, const char *element)
 {
-	int i;
-	strbuf_addstr(sb, ":(");
-	for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
-		if (magic & pathspec_magic[i].bit) {
-			if (sb->buf[sb->len - 1] != '(')
-				strbuf_addch(sb, ',');
-			strbuf_addstr(sb, pathspec_magic[i].name);
+	/* No magic was found in element, just add prefix magic */
+	if (magic == 0) {
+		strbuf_addf(sb, ":(prefix:%d)", prefixlen);
+		return;
+	}
+
+	/*
+	 * At this point we known that parse_element_magic() was able to extract some pathspec
+	 * magic from element. So we know element is correctly formatted in either shorthand
+	 * or longhand form
+	 */
+	if (element[1] != '(') {
+		/* Process an element in shorthand form (e.g. ":!/<match>") */
+		strbuf_addstr(sb, ":(");
+		for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+			if ((magic & pathspec_magic[i].bit) && (pathspec_magic[i].mnemonic != '\0')) {
+				if (sb->buf[sb->len - 1] != '(')
+					strbuf_addch(sb, ',');
+				strbuf_addstr(sb, pathspec_magic[i].name);
+			}
 		}
+	} else {
+		/* For the longhand form, we copy everything up to the final ')' */
+		size_t len = strchr(element, ')') - element;
+		strbuf_add(sb, element, len);
+	}
 	strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
@@ -493,7 +511,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
 		struct strbuf sb = STRBUF_INIT;
 
 		/* Preserve the actual prefix length of each pattern */
-		prefix_magic(&sb, prefixlen, element_magic);
+		prefix_magic(&sb, prefixlen, element_magic, elt);
 
 		strbuf_addstr(&sb, match);
 		item->original = strbuf_detach(&sb, NULL);
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index f70c395e75..e46fa176ed 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,12 +64,24 @@ test_expect_success 'setup .gitattributes' '
 	fileSetLabel label
 	fileValue label=foo
 	fileWrongLabel label☺
+	newFileA* labelA
+	newFileB* labelB
 	EOF
 	echo fileSetLabel label1 >sub/.gitattributes &&
 	git add .gitattributes sub/.gitattributes &&
 	git commit -m "add attributes"
 '
 
+test_expect_success 'setup .gitignore' '
+	cat <<-\EOF >.gitignore &&
+	actual
+	expect
+	pathspec_file
+	EOF
+	git add .gitignore &&
+	git commit -m "add gitignore"
+'
+
 test_expect_success 'check specific set attr' '
 	cat <<-\EOF >expect &&
 	fileSetLabel
@@ -150,6 +162,7 @@ test_expect_success 'check specific value attr (2)' '
 test_expect_success 'check unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileA
 	fileAB
 	fileAC
@@ -175,6 +188,7 @@ test_expect_success 'check unspecified attr' '
 test_expect_success 'check unspecified attr (2)' '
 	cat <<-\EOF >expect &&
 	HEAD:.gitattributes
+	HEAD:.gitignore
 	HEAD:fileA
 	HEAD:fileAB
 	HEAD:fileAC
@@ -200,6 +214,7 @@ test_expect_success 'check unspecified attr (2)' '
 test_expect_success 'check multiple unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileC
 	fileNoLabel
 	fileWrongLabel
@@ -239,16 +254,99 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
 	test_i18ngrep "Only one" actual
 '
 
-test_expect_success 'fail if attr magic is used places not implemented' '
+test_expect_success 'fail if attr magic is used in places not implemented' '
 	# The main purpose of this test is to check that we actually fail
 	# when you attempt to use attr magic in commands that do not implement
-	# attr magic. This test does not advocate git-add to stay that way,
-	# though, but git-add is convenient as it has its own internal pathspec
-	# parsing.
-	test_must_fail git add ":(attr:labelB)" 2>actual &&
+	# attr magic. This test does not advocate check-ignore to stay that way.
+	# When you teach the command to grok the pathspec, you need to find
+	# another command to replace it for the test.
+	test_must_fail git check-ignore ":(attr:labelB)" 2>actual &&
 	test_i18ngrep "magic not supported" actual
 '
 
+test_expect_success 'check that attr magic works for git stash push' '
+	cat <<-\EOF >expect &&
+	A	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git stash push --include-untracked -- ":(exclude,attr:labelB)" &&
+	git stash show --include-untracked --name-status >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --all' '
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached >actual &&
+	git restore -W -S . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add -u' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add -u ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached  >actual &&
+	git restore -S -W . && rm sub/new* &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add <path>' '
+	cat <<-\EOF >expect &&
+	fileA
+	fileB
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add ":(exclude,attr:labelB)sub/*" &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git -add .' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	cd sub &&
+	git add . ":(exclude,attr:labelB)" &&
+	cd .. &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --pathspec-from-file' '
+	cat <<-\EOF >pathspec_file &&
+	:(exclude,attr:labelB)
+	EOF
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all --pathspec-from-file=pathspec_file &&
+	git diff --name-only --cached >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'abort on giving invalid label on the command line' '
 	test_must_fail git ls-files . ":(attr:☺)"
 '
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-03 14:35           ` Joanna Wang
@ 2023-11-03 15:31             ` Ruben Safir
  2023-11-03 16:34             ` Joanna Wang
  1 sibling, 0 replies; 13+ messages in thread
From: Ruben Safir @ 2023-11-03 15:31 UTC (permalink / raw)
  To: Joanna Wang; +Cc: gitster, git

Is that really you name or is it a bad joke?


On Fri, Nov 03, 2023 at 02:35:07PM +0000, Joanna Wang wrote:
> This lets users limit files or exclude files based on file
> attributes during git-add and git-stash.
> For example, the chromium project would like to use this like
> "git add --all ':(exclude,attr:submodule)'", as submodules are managed in a
> unique way and often results in submodule changes that users do not want in
> their commits.
> 
> This does not change any attr magic implementation. It is only adding
> attr as an allowed pathspec in git-add and git-stash, which was previously
> blocked by GUARD_PATHSPEC and a pathspec mask in parse_pathspec()).
> However, this does fix a bug in prefix_magic() where attr values were unintentionally removed.
> This was hit whenever parse_pathspec() is called with PATHSPEC_PREFIX_ORIGIN as a flag,
> which was the case for git-stash.
> (But originally filed here:
> https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/)
> 
> Furthermore, while other commands hit this code path it did not result in unexpected
> behavior because this bug only impacts the pathspec->items->original field which is
> NOT used to filter paths. However, git-stash does use pathspec->items->original when
> building args used to call other git commands.
> (See add_pathspecs() usage and implementation in stash.c)
> 
> It is possible that when the attr pathspec feature was first added in
> b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
> "PATHSPEC_ATTR" was just unintentionally left out of a few GUARD_PATHSPEC() invocations.
> 
> Later, to get a more user-friendly error message when attr was used with git-add,
> PATHSPEC_ATTR was added as a mask to git-add's invocation of parse_pathspec()
> 84d938b732 (add: do not accept pathspec magic 'attr', 2018-09-18).
> However, this user-friendly error message was never added for git-stash.
> 
> Signed-off-by: Joanna Wang <jojwang@google.com>
> 
> ---
> 
> > At this point in the code, is it guaranteed that element[0] is ':'
> > and never a NUL?  Also is it guaranteed that element has ')'
> > somewhere later if element[1] is '('?
> No sorry, we can only assume this if there was element magic detected
> by parse_element_magic(). So if element magic is not 0, we know the body
> is in the expected form (either long or short).
> I have added comments and a check for magic to guard against this.
> 
> >  I wonder if this existing bug caused by
> > failing to copy the value of "attr:" is triggerable from a codepath
> > that already allows PATHSPEC_ATTR magic.
> While there are other commands that go through the prefix_magic path (like `git-add -i`),
> AFAICT they are not actually impacted by the bug. The bug only impacts the value of
> pathspec->items->original which is not used during pathspec matching. But git-stash
> uses `original` to pass in as args to other commands it calls. I've included this
> detail in the description above.
> 
> > but as the third parameter strbuf_add() takes is of type size_t, it would not
> > hurt to define "len" as the same type as well.
> Thanks for spotting. fixed.
> 
>  builtin/add.c                  |   7 ++-
>  dir.c                          |   3 +-
>  pathspec.c                     |  36 ++++++++---
>  t/t6135-pathspec-with-attrs.sh | 108 +++++++++++++++++++++++++++++++--
>  4 files changed, 136 insertions(+), 18 deletions(-)
> 
> diff --git a/builtin/add.c b/builtin/add.c
> index 5126d2ede3..d46e4d10e9 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  	 * Check the "pathspec '%s' did not match any files" block
>  	 * below before enabling new magic.
>  	 */
> -	parse_pathspec(&pathspec, PATHSPEC_ATTR,
> +	parse_pathspec(&pathspec, 0,
>  		       PATHSPEC_PREFER_FULL |
>  		       PATHSPEC_SYMLINK_LEADING_PATH,
>  		       prefix, argv);
> @@ -433,7 +433,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  		if (pathspec.nr)
>  			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
>  
> -		parse_pathspec_file(&pathspec, PATHSPEC_ATTR,
> +		parse_pathspec_file(&pathspec, 0,
>  				    PATHSPEC_PREFER_FULL |
>  				    PATHSPEC_SYMLINK_LEADING_PATH,
>  				    prefix, pathspec_from_file, pathspec_file_nul);
> @@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  			       PATHSPEC_LITERAL |
>  			       PATHSPEC_GLOB |
>  			       PATHSPEC_ICASE |
> -			       PATHSPEC_EXCLUDE);
> +			       PATHSPEC_EXCLUDE |
> +			       PATHSPEC_ATTR);
>  
>  		for (i = 0; i < pathspec.nr; i++) {
>  			const char *path = pathspec.items[i].match;
> diff --git a/dir.c b/dir.c
> index 16fdb03f2a..4d1cd039be 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -2179,7 +2179,8 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
>  		       PATHSPEC_LITERAL |
>  		       PATHSPEC_GLOB |
>  		       PATHSPEC_ICASE |
> -		       PATHSPEC_EXCLUDE);
> +		       PATHSPEC_EXCLUDE |
> +		       PATHSPEC_ATTR);
>  
>  	for (i = 0; i < pathspec->nr; i++) {
>  		const struct pathspec_item *item = &pathspec->items[i];
> diff --git a/pathspec.c b/pathspec.c
> index bb1efe1f39..2f8721cc15 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -109,16 +109,34 @@ static struct pathspec_magic {
>  	{ PATHSPEC_ATTR,    '\0', "attr" },
>  };
>  
> -static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
> +static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic, const char *element)
>  {
> -	int i;
> -	strbuf_addstr(sb, ":(");
> -	for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
> -		if (magic & pathspec_magic[i].bit) {
> -			if (sb->buf[sb->len - 1] != '(')
> -				strbuf_addch(sb, ',');
> -			strbuf_addstr(sb, pathspec_magic[i].name);
> +	/* No magic was found in element, just add prefix magic */
> +	if (magic == 0) {
> +		strbuf_addf(sb, ":(prefix:%d)", prefixlen);
> +		return;
> +	}
> +
> +	/*
> +	 * At this point we known that parse_element_magic() was able to extract some pathspec
> +	 * magic from element. So we know element is correctly formatted in either shorthand
> +	 * or longhand form
> +	 */
> +	if (element[1] != '(') {
> +		/* Process an element in shorthand form (e.g. ":!/<match>") */
> +		strbuf_addstr(sb, ":(");
> +		for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
> +			if ((magic & pathspec_magic[i].bit) && (pathspec_magic[i].mnemonic != '\0')) {
> +				if (sb->buf[sb->len - 1] != '(')
> +					strbuf_addch(sb, ',');
> +				strbuf_addstr(sb, pathspec_magic[i].name);
> +			}
>  		}
> +	} else {
> +		/* For the longhand form, we copy everything up to the final ')' */
> +		size_t len = strchr(element, ')') - element;
> +		strbuf_add(sb, element, len);
> +	}
>  	strbuf_addf(sb, ",prefix:%d)", prefixlen);
>  }
>  
> @@ -493,7 +511,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
>  		struct strbuf sb = STRBUF_INIT;
>  
>  		/* Preserve the actual prefix length of each pattern */
> -		prefix_magic(&sb, prefixlen, element_magic);
> +		prefix_magic(&sb, prefixlen, element_magic, elt);
>  
>  		strbuf_addstr(&sb, match);
>  		item->original = strbuf_detach(&sb, NULL);
> diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
> index f70c395e75..e46fa176ed 100755
> --- a/t/t6135-pathspec-with-attrs.sh
> +++ b/t/t6135-pathspec-with-attrs.sh
> @@ -64,12 +64,24 @@ test_expect_success 'setup .gitattributes' '
>  	fileSetLabel label
>  	fileValue label=foo
>  	fileWrongLabel label☺
> +	newFileA* labelA
> +	newFileB* labelB
>  	EOF
>  	echo fileSetLabel label1 >sub/.gitattributes &&
>  	git add .gitattributes sub/.gitattributes &&
>  	git commit -m "add attributes"
>  '
>  
> +test_expect_success 'setup .gitignore' '
> +	cat <<-\EOF >.gitignore &&
> +	actual
> +	expect
> +	pathspec_file
> +	EOF
> +	git add .gitignore &&
> +	git commit -m "add gitignore"
> +'
> +
>  test_expect_success 'check specific set attr' '
>  	cat <<-\EOF >expect &&
>  	fileSetLabel
> @@ -150,6 +162,7 @@ test_expect_success 'check specific value attr (2)' '
>  test_expect_success 'check unspecified attr' '
>  	cat <<-\EOF >expect &&
>  	.gitattributes
> +	.gitignore
>  	fileA
>  	fileAB
>  	fileAC
> @@ -175,6 +188,7 @@ test_expect_success 'check unspecified attr' '
>  test_expect_success 'check unspecified attr (2)' '
>  	cat <<-\EOF >expect &&
>  	HEAD:.gitattributes
> +	HEAD:.gitignore
>  	HEAD:fileA
>  	HEAD:fileAB
>  	HEAD:fileAC
> @@ -200,6 +214,7 @@ test_expect_success 'check unspecified attr (2)' '
>  test_expect_success 'check multiple unspecified attr' '
>  	cat <<-\EOF >expect &&
>  	.gitattributes
> +	.gitignore
>  	fileC
>  	fileNoLabel
>  	fileWrongLabel
> @@ -239,16 +254,99 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
>  	test_i18ngrep "Only one" actual
>  '
>  
> -test_expect_success 'fail if attr magic is used places not implemented' '
> +test_expect_success 'fail if attr magic is used in places not implemented' '
>  	# The main purpose of this test is to check that we actually fail
>  	# when you attempt to use attr magic in commands that do not implement
> -	# attr magic. This test does not advocate git-add to stay that way,
> -	# though, but git-add is convenient as it has its own internal pathspec
> -	# parsing.
> -	test_must_fail git add ":(attr:labelB)" 2>actual &&
> +	# attr magic. This test does not advocate check-ignore to stay that way.
> +	# When you teach the command to grok the pathspec, you need to find
> +	# another command to replace it for the test.
> +	test_must_fail git check-ignore ":(attr:labelB)" 2>actual &&
>  	test_i18ngrep "magic not supported" actual
>  '
>  
> +test_expect_success 'check that attr magic works for git stash push' '
> +	cat <<-\EOF >expect &&
> +	A	sub/newFileA-foo
> +	EOF
> +	>sub/newFileA-foo &&
> +	>sub/newFileB-foo &&
> +	git stash push --include-untracked -- ":(exclude,attr:labelB)" &&
> +	git stash show --include-untracked --name-status >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check that attr magic works for git add --all' '
> +	cat <<-\EOF >expect &&
> +	sub/newFileA-foo
> +	EOF
> +	>sub/newFileA-foo &&
> +	>sub/newFileB-foo &&
> +	git add --all ":(exclude,attr:labelB)" &&
> +	git diff --name-only --cached >actual &&
> +	git restore -W -S . &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check that attr magic works for git add -u' '
> +	cat <<-\EOF >expect &&
> +	sub/fileA
> +	EOF
> +	>sub/newFileA-foo &&
> +	>sub/newFileB-foo &&
> +	>sub/fileA &&
> +	>sub/fileB &&
> +	git add -u ":(exclude,attr:labelB)" &&
> +	git diff --name-only --cached  >actual &&
> +	git restore -S -W . && rm sub/new* &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check that attr magic works for git add <path>' '
> +	cat <<-\EOF >expect &&
> +	fileA
> +	fileB
> +	sub/fileA
> +	EOF
> +	>fileA &&
> +	>fileB &&
> +	>sub/fileA &&
> +	>sub/fileB &&
> +	git add ":(exclude,attr:labelB)sub/*" &&
> +	git diff --name-only --cached >actual &&
> +	git restore -S -W . &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check that attr magic works for git -add .' '
> +	cat <<-\EOF >expect &&
> +	sub/fileA
> +	EOF
> +	>fileA &&
> +	>fileB &&
> +	>sub/fileA &&
> +	>sub/fileB &&
> +	cd sub &&
> +	git add . ":(exclude,attr:labelB)" &&
> +	cd .. &&
> +	git diff --name-only --cached >actual &&
> +	git restore -S -W . &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check that attr magic works for git add --pathspec-from-file' '
> +	cat <<-\EOF >pathspec_file &&
> +	:(exclude,attr:labelB)
> +	EOF
> +	cat <<-\EOF >expect &&
> +	sub/newFileA-foo
> +	EOF
> +	>sub/newFileA-foo &&
> +	>sub/newFileB-foo &&
> +	git add --all --pathspec-from-file=pathspec_file &&
> +	git diff --name-only --cached >actual &&
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'abort on giving invalid label on the command line' '
>  	test_must_fail git ls-files . ":(attr:☺)"
>  '
> -- 
> 2.42.0.869.gea05f2083d-goog
> 

-- 
So many immigrant groups have swept through our town
that Brooklyn, like Atlantis, reaches mythological
proportions in the mind of the world - RI Safir 1998
http://www.mrbrklyn.com 

DRM is THEFT - We are the STAKEHOLDERS - RI Safir 2002
http://www.nylxs.com - Leadership Development in Free Software
http://www2.mrbrklyn.com/resources - Unpublished Archive 
http://www.coinhangout.com - coins!
http://www.brooklyn-living.com 

Being so tracked is for FARM ANIMALS and extermination camps, 
but incompatible with living as a free human being. -RI Safir 2013


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-03 14:35           ` Joanna Wang
  2023-11-03 15:31             ` Ruben Safir
@ 2023-11-03 16:34             ` Joanna Wang
  2023-11-04  5:11               ` Junio C Hamano
  1 sibling, 1 reply; 13+ messages in thread
From: Joanna Wang @ 2023-11-03 16:34 UTC (permalink / raw)
  To: jojwang; +Cc: git, gitster

This lets users limit files or exclude files based on file
attributes during git-add and git-stash.
For example, the chromium project would like to use this like
"git add --all ':(exclude,attr:submodule)'", as submodules are managed in a
unique way and often results in submodule changes that users do not want in
their commits.

This does not change any attr magic implementation. It is only adding
attr as an allowed pathspec in git-add and git-stash, which was previously
blocked by GUARD_PATHSPEC and a pathspec mask in parse_pathspec()).
However, this does fix a bug in prefix_magic() where attr values were unintentionally removed.
This was hit whenever parse_pathspec() is called with PATHSPEC_PREFIX_ORIGIN as a flag,
which was the case for git-stash.
(Bug originally filed here:
https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/)

Furthermore, while other commands hit this code path it did not result in unexpected
behavior because this bug only impacts the pathspec->items->original field which is
NOT used to filter paths. However, git-stash does use pathspec->items->original when
building args used to call other git commands.
(See add_pathspecs() usage and implementation in stash.c)

It is possible that when the attr pathspec feature was first added in
b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few GUARD_PATHSPEC() invocations.

Later, to get a more user-friendly error message when attr was used with git-add,
PATHSPEC_ATTR was added as a mask to git-add's invocation of parse_pathspec()
84d938b732 (add: do not accept pathspec magic 'attr', 2018-09-18).
However, this user-friendly error message was never added for git-stash.

Signed-off-by: Joanna Wang <jojwang@google.com>

---

I fixed a typo above. "(But originally" -> "(Bug originally"

> So if element magic is not 0, we know the body is in the expected
> form (either long or short).
Also while this is still true AFAICT, I found an existing bug in parse_short_magic()
where it does not handle paths that start with ":" where ":" is part of the
file name. e.g. "git stash push -- ':chicken'" is not handled correcly. But I feel
this is a seperate topic from this patch. So I'll file a bug to address it seperately.

 builtin/add.c                  |   7 ++-
 dir.c                          |   3 +-
 pathspec.c                     |  36 ++++++++---
 t/t6135-pathspec-with-attrs.sh | 108 +++++++++++++++++++++++++++++++--
 4 files changed, 136 insertions(+), 18 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5126d2ede3..d46e4d10e9 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 	 * Check the "pathspec '%s' did not match any files" block
 	 * below before enabling new magic.
 	 */
-	parse_pathspec(&pathspec, PATHSPEC_ATTR,
+	parse_pathspec(&pathspec, 0,
 		       PATHSPEC_PREFER_FULL |
 		       PATHSPEC_SYMLINK_LEADING_PATH,
 		       prefix, argv);
@@ -433,7 +433,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
 
-		parse_pathspec_file(&pathspec, PATHSPEC_ATTR,
+		parse_pathspec_file(&pathspec, 0,
 				    PATHSPEC_PREFER_FULL |
 				    PATHSPEC_SYMLINK_LEADING_PATH,
 				    prefix, pathspec_from_file, pathspec_file_nul);
@@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			       PATHSPEC_LITERAL |
 			       PATHSPEC_GLOB |
 			       PATHSPEC_ICASE |
-			       PATHSPEC_EXCLUDE);
+			       PATHSPEC_EXCLUDE |
+			       PATHSPEC_ATTR);
 
 		for (i = 0; i < pathspec.nr; i++) {
 			const char *path = pathspec.items[i].match;
diff --git a/dir.c b/dir.c
index 16fdb03f2a..4d1cd039be 100644
--- a/dir.c
+++ b/dir.c
@@ -2179,7 +2179,8 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
 		       PATHSPEC_LITERAL |
 		       PATHSPEC_GLOB |
 		       PATHSPEC_ICASE |
-		       PATHSPEC_EXCLUDE);
+		       PATHSPEC_EXCLUDE |
+		       PATHSPEC_ATTR);
 
 	for (i = 0; i < pathspec->nr; i++) {
 		const struct pathspec_item *item = &pathspec->items[i];
diff --git a/pathspec.c b/pathspec.c
index bb1efe1f39..2f8721cc15 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -109,16 +109,34 @@ static struct pathspec_magic {
 	{ PATHSPEC_ATTR,    '\0', "attr" },
 };
 
-static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
+static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic, const char *element)
 {
-	int i;
-	strbuf_addstr(sb, ":(");
-	for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
-		if (magic & pathspec_magic[i].bit) {
-			if (sb->buf[sb->len - 1] != '(')
-				strbuf_addch(sb, ',');
-			strbuf_addstr(sb, pathspec_magic[i].name);
+	/* No magic was found in element, just add prefix magic */
+	if (magic == 0) {
+		strbuf_addf(sb, ":(prefix:%d)", prefixlen);
+		return;
+	}
+
+	/*
+	 * At this point we known that parse_element_magic() was able to extract some pathspec
+	 * magic from element. So we know element is correctly formatted in either shorthand
+	 * or longhand form
+	 */
+	if (element[1] != '(') {
+		/* Process an element in shorthand form (e.g. ":!/<match>") */
+		strbuf_addstr(sb, ":(");
+		for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+			if ((magic & pathspec_magic[i].bit) && (pathspec_magic[i].mnemonic != '\0')) {
+				if (sb->buf[sb->len - 1] != '(')
+					strbuf_addch(sb, ',');
+				strbuf_addstr(sb, pathspec_magic[i].name);
+			}
 		}
+	} else {
+		/* For the longhand form, we copy everything up to the final ')' */
+		size_t len = strchr(element, ')') - element;
+		strbuf_add(sb, element, len);
+	}
 	strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
@@ -493,7 +511,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
 		struct strbuf sb = STRBUF_INIT;
 
 		/* Preserve the actual prefix length of each pattern */
-		prefix_magic(&sb, prefixlen, element_magic);
+		prefix_magic(&sb, prefixlen, element_magic, elt);
 
 		strbuf_addstr(&sb, match);
 		item->original = strbuf_detach(&sb, NULL);
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index f70c395e75..e46fa176ed 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,12 +64,24 @@ test_expect_success 'setup .gitattributes' '
 	fileSetLabel label
 	fileValue label=foo
 	fileWrongLabel label☺
+	newFileA* labelA
+	newFileB* labelB
 	EOF
 	echo fileSetLabel label1 >sub/.gitattributes &&
 	git add .gitattributes sub/.gitattributes &&
 	git commit -m "add attributes"
 '
 
+test_expect_success 'setup .gitignore' '
+	cat <<-\EOF >.gitignore &&
+	actual
+	expect
+	pathspec_file
+	EOF
+	git add .gitignore &&
+	git commit -m "add gitignore"
+'
+
 test_expect_success 'check specific set attr' '
 	cat <<-\EOF >expect &&
 	fileSetLabel
@@ -150,6 +162,7 @@ test_expect_success 'check specific value attr (2)' '
 test_expect_success 'check unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileA
 	fileAB
 	fileAC
@@ -175,6 +188,7 @@ test_expect_success 'check unspecified attr' '
 test_expect_success 'check unspecified attr (2)' '
 	cat <<-\EOF >expect &&
 	HEAD:.gitattributes
+	HEAD:.gitignore
 	HEAD:fileA
 	HEAD:fileAB
 	HEAD:fileAC
@@ -200,6 +214,7 @@ test_expect_success 'check unspecified attr (2)' '
 test_expect_success 'check multiple unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileC
 	fileNoLabel
 	fileWrongLabel
@@ -239,16 +254,99 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
 	test_i18ngrep "Only one" actual
 '
 
-test_expect_success 'fail if attr magic is used places not implemented' '
+test_expect_success 'fail if attr magic is used in places not implemented' '
 	# The main purpose of this test is to check that we actually fail
 	# when you attempt to use attr magic in commands that do not implement
-	# attr magic. This test does not advocate git-add to stay that way,
-	# though, but git-add is convenient as it has its own internal pathspec
-	# parsing.
-	test_must_fail git add ":(attr:labelB)" 2>actual &&
+	# attr magic. This test does not advocate check-ignore to stay that way.
+	# When you teach the command to grok the pathspec, you need to find
+	# another command to replace it for the test.
+	test_must_fail git check-ignore ":(attr:labelB)" 2>actual &&
 	test_i18ngrep "magic not supported" actual
 '
 
+test_expect_success 'check that attr magic works for git stash push' '
+	cat <<-\EOF >expect &&
+	A	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git stash push --include-untracked -- ":(exclude,attr:labelB)" &&
+	git stash show --include-untracked --name-status >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --all' '
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached >actual &&
+	git restore -W -S . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add -u' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add -u ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached  >actual &&
+	git restore -S -W . && rm sub/new* &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add <path>' '
+	cat <<-\EOF >expect &&
+	fileA
+	fileB
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add ":(exclude,attr:labelB)sub/*" &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git -add .' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	cd sub &&
+	git add . ":(exclude,attr:labelB)" &&
+	cd .. &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --pathspec-from-file' '
+	cat <<-\EOF >pathspec_file &&
+	:(exclude,attr:labelB)
+	EOF
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all --pathspec-from-file=pathspec_file &&
+	git diff --name-only --cached >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'abort on giving invalid label on the command line' '
 	test_must_fail git ls-files . ":(attr:☺)"
 '
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-03 16:34             ` Joanna Wang
@ 2023-11-04  5:11               ` Junio C Hamano
  2023-11-04  6:28                 ` Eric Sunshine
  2023-11-09 23:27                 ` Junio C Hamano
  0 siblings, 2 replies; 13+ messages in thread
From: Junio C Hamano @ 2023-11-04  5:11 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git

Thanks.  The compiled result from this version looks quite good.  As
you started a new round with typofix, let me start the final
nitpicking.

Joanna Wang <jojwang@google.com> writes:

> This lets users limit files or exclude files based on file
> attributes during git-add and git-stash.
> For example, the chromium project would like to use this like
> "git add --all ':(exclude,attr:submodule)'", as submodules are managed in a
> unique way and often results in submodule changes that users do not want in
> their commits.

I would say

    Allow users to limit or exclude files based on file attributes
    during git-add and git-stash.

    For example, the chromium project would like to use 

        $ git add . ':(exclude,attr:submodule)'

    as submodules are managed in a unique way and often results in
    submodule changes that users do not want in their commits.

except that I would prefer to see "in a unique way" rephrased.
Would "are managed outside Git using external tools" be a more
accurate description?

It may be true that the chromium project handles submodules unlike
all the other projects, but you guys being alone is not an important
part of the reason why we might want this new feature to help users.
When another project adopts the way how chromium manages the
submodules, forbidding its end-users from running "git add" to
record the changes to their submodules, you guys will no longer be
"unique", but the need for this feature will still exist (and the
demand may even be stronger).  The reason why this feature helps
should be stated here.

> This does not change any attr magic implementation. It is only adding
> attr as an allowed pathspec in git-add and git-stash, which was previously
> blocked by GUARD_PATHSPEC and a pathspec mask in parse_pathspec()).

Add blank line here between paragraphs.

> However, this does fix a bug in prefix_magic() where attr values were unintentionally removed.
> This was hit whenever parse_pathspec() is called with PATHSPEC_PREFIX_ORIGIN as a flag,
> which was the case for git-stash.
> (Bug originally filed here:

Wrap overlong lines here and in the rest of the proposed log message.

> diff --git a/pathspec.c b/pathspec.c
> index bb1efe1f39..2f8721cc15 100644
> --- a/pathspec.c
> +++ b/pathspec.c
> @@ -109,16 +109,34 @@ static struct pathspec_magic {
>  	{ PATHSPEC_ATTR,    '\0', "attr" },
>  };
>  
> -static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
> +static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic, const char *element)
>  {

This is overly long.

> +	/* No magic was found in element, just add prefix magic */
> +	if (magic == 0) {

Style:

	if (!magic) {

> +		strbuf_addf(sb, ":(prefix:%d)", prefixlen);
> +		return;
> +	}
> +

> +	/*
> +	 * At this point we known that parse_element_magic() was able to extract some pathspec
> +	 * magic from element. So we know element is correctly formatted in either shorthand
> +	 * or longhand form
> +	 */

Overly long, with a typo.  "At this point, we know that ...".

> +	if (element[1] != '(') {
> +		/* Process an element in shorthand form (e.g. ":!/<match>") */
> +		strbuf_addstr(sb, ":(");
> +		for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
> +			if ((magic & pathspec_magic[i].bit) && (pathspec_magic[i].mnemonic != '\0')) {

Overly long, with a stylo.

> +				if (sb->buf[sb->len - 1] != '(')
> +					strbuf_addch(sb, ',');
> +				strbuf_addstr(sb, pathspec_magic[i].name);
> +			}
>  		}
> +	} else {
> +		/* For the longhand form, we copy everything up to the final ')' */
> +		size_t len = strchr(element, ')') - element;
> +		strbuf_add(sb, element, len);
> +	}
>  	strbuf_addf(sb, ",prefix:%d)", prefixlen);
>  }

Here is what I ended up queuing.

Thanks.

---- >8 -------- >8 -------- >8 -------- >8 ----
From: Joanna Wang <jojwang@google.com>
Date: Fri, 3 Nov 2023 16:34:48 +0000
Subject: [PATCH] attr: enable attr pathspec magic for git-add and git-stash

Allow users to limit or exclude files based on file attributes
during git-add and git-stash.

For example, the chromium project would like to use

    $ git add . ':(exclude,attr:submodule)'

as submodules are managed by an external tool, forbidding end users
to record changes with "git add".  Allowing "git add" to often
records changes that users do not want in their commits.

This commit does not change any attr magic implementation. It is
only adding attr as an allowed pathspec in git-add and git-stash,
which was previously blocked by GUARD_PATHSPEC and a pathspec mask
in parse_pathspec()).

However, we fix a bug in prefix_magic() where attr values were
unintentionally removed.  This was triggerable when parse_pathspec()
is called with PATHSPEC_PREFIX_ORIGIN as a flag, which was the case
for git-stash (Bug originally filed here [*])

Furthermore, while other commands hit this code path it did not
result in unexpected behavior because this bug only impacts the
pathspec->items->original field which is NOT used to filter
paths. However, git-stash does use pathspec->items->original when
building args used to call other git commands.  (See add_pathspecs()
usage and implementation in stash.c)

It is possible that when the attr pathspec feature was first added
in b0db704652 (pathspec: allow querying for attributes, 2017-03-13),
"PATHSPEC_ATTR" was just unintentionally left out of a few
GUARD_PATHSPEC() invocations.

Later, to get a more user-friendly error message when attr was used
with git-add, PATHSPEC_ATTR was added as a mask to git-add's
invocation of parse_pathspec() 84d938b732 (add: do not accept
pathspec magic 'attr', 2018-09-18).  However, this user-friendly
error message was never added for git-stash.

[Reference]
 * https://lore.kernel.org/git/CAMmZTi-0QKtj7Q=sbC5qhipGsQxJFOY-Qkk1jfkRYwfF5FcUVg@mail.gmail.com/)

Signed-off-by: Joanna Wang <jojwang@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/add.c                  |   7 ++-
 dir.c                          |   3 +-
 pathspec.c                     |  39 +++++++++---
 t/t6135-pathspec-with-attrs.sh | 108 +++++++++++++++++++++++++++++++--
 4 files changed, 139 insertions(+), 18 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5126d2ede3..d46e4d10e9 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -424,7 +424,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 	 * Check the "pathspec '%s' did not match any files" block
 	 * below before enabling new magic.
 	 */
-	parse_pathspec(&pathspec, PATHSPEC_ATTR,
+	parse_pathspec(&pathspec, 0,
 		       PATHSPEC_PREFER_FULL |
 		       PATHSPEC_SYMLINK_LEADING_PATH,
 		       prefix, argv);
@@ -433,7 +433,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
 
-		parse_pathspec_file(&pathspec, PATHSPEC_ATTR,
+		parse_pathspec_file(&pathspec, 0,
 				    PATHSPEC_PREFER_FULL |
 				    PATHSPEC_SYMLINK_LEADING_PATH,
 				    prefix, pathspec_from_file, pathspec_file_nul);
@@ -504,7 +504,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			       PATHSPEC_LITERAL |
 			       PATHSPEC_GLOB |
 			       PATHSPEC_ICASE |
-			       PATHSPEC_EXCLUDE);
+			       PATHSPEC_EXCLUDE |
+			       PATHSPEC_ATTR);
 
 		for (i = 0; i < pathspec.nr; i++) {
 			const char *path = pathspec.items[i].match;
diff --git a/dir.c b/dir.c
index 16fdb03f2a..4d1cd039be 100644
--- a/dir.c
+++ b/dir.c
@@ -2179,7 +2179,8 @@ static int exclude_matches_pathspec(const char *path, int pathlen,
 		       PATHSPEC_LITERAL |
 		       PATHSPEC_GLOB |
 		       PATHSPEC_ICASE |
-		       PATHSPEC_EXCLUDE);
+		       PATHSPEC_EXCLUDE |
+		       PATHSPEC_ATTR);
 
 	for (i = 0; i < pathspec->nr; i++) {
 		const struct pathspec_item *item = &pathspec->items[i];
diff --git a/pathspec.c b/pathspec.c
index bb1efe1f39..d132eefac2 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -109,16 +109,37 @@ static struct pathspec_magic {
 	{ PATHSPEC_ATTR,    '\0', "attr" },
 };
 
-static void prefix_magic(struct strbuf *sb, int prefixlen, unsigned magic)
+static void prefix_magic(struct strbuf *sb, int prefixlen,
+			 unsigned magic, const char *element)
 {
-	int i;
-	strbuf_addstr(sb, ":(");
-	for (i = 0; i < ARRAY_SIZE(pathspec_magic); i++)
-		if (magic & pathspec_magic[i].bit) {
-			if (sb->buf[sb->len - 1] != '(')
-				strbuf_addch(sb, ',');
-			strbuf_addstr(sb, pathspec_magic[i].name);
+	/* No magic was found in element, just add prefix magic */
+	if (!magic) {
+		strbuf_addf(sb, ":(prefix:%d)", prefixlen);
+		return;
+	}
+
+	/*
+	 * At this point, we know that parse_element_magic() was able
+	 * to extract some pathspec magic from element. So we know
+	 * element is correctly formatted in either shorthand or
+	 * longhand form
+	 */
+	if (element[1] != '(') {
+		/* Process an element in shorthand form (e.g. ":!/<match>") */
+		strbuf_addstr(sb, ":(");
+		for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
+			if ((magic & pathspec_magic[i].bit) &&
+			    (pathspec_magic[i].mnemonic)) {
+				if (sb->buf[sb->len - 1] != '(')
+					strbuf_addch(sb, ',');
+				strbuf_addstr(sb, pathspec_magic[i].name);
+			}
 		}
+	} else {
+		/* For the longhand form, we copy everything up to the final ')' */
+		size_t len = strchr(element, ')') - element;
+		strbuf_add(sb, element, len);
+	}
 	strbuf_addf(sb, ",prefix:%d)", prefixlen);
 }
 
@@ -493,7 +514,7 @@ static void init_pathspec_item(struct pathspec_item *item, unsigned flags,
 		struct strbuf sb = STRBUF_INIT;
 
 		/* Preserve the actual prefix length of each pattern */
-		prefix_magic(&sb, prefixlen, element_magic);
+		prefix_magic(&sb, prefixlen, element_magic, elt);
 
 		strbuf_addstr(&sb, match);
 		item->original = strbuf_detach(&sb, NULL);
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index f70c395e75..e46fa176ed 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,12 +64,24 @@ test_expect_success 'setup .gitattributes' '
 	fileSetLabel label
 	fileValue label=foo
 	fileWrongLabel label☺
+	newFileA* labelA
+	newFileB* labelB
 	EOF
 	echo fileSetLabel label1 >sub/.gitattributes &&
 	git add .gitattributes sub/.gitattributes &&
 	git commit -m "add attributes"
 '
 
+test_expect_success 'setup .gitignore' '
+	cat <<-\EOF >.gitignore &&
+	actual
+	expect
+	pathspec_file
+	EOF
+	git add .gitignore &&
+	git commit -m "add gitignore"
+'
+
 test_expect_success 'check specific set attr' '
 	cat <<-\EOF >expect &&
 	fileSetLabel
@@ -150,6 +162,7 @@ test_expect_success 'check specific value attr (2)' '
 test_expect_success 'check unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileA
 	fileAB
 	fileAC
@@ -175,6 +188,7 @@ test_expect_success 'check unspecified attr' '
 test_expect_success 'check unspecified attr (2)' '
 	cat <<-\EOF >expect &&
 	HEAD:.gitattributes
+	HEAD:.gitignore
 	HEAD:fileA
 	HEAD:fileAB
 	HEAD:fileAC
@@ -200,6 +214,7 @@ test_expect_success 'check unspecified attr (2)' '
 test_expect_success 'check multiple unspecified attr' '
 	cat <<-\EOF >expect &&
 	.gitattributes
+	.gitignore
 	fileC
 	fileNoLabel
 	fileWrongLabel
@@ -239,16 +254,99 @@ test_expect_success 'fail on multiple attr specifiers in one pathspec item' '
 	test_i18ngrep "Only one" actual
 '
 
-test_expect_success 'fail if attr magic is used places not implemented' '
+test_expect_success 'fail if attr magic is used in places not implemented' '
 	# The main purpose of this test is to check that we actually fail
 	# when you attempt to use attr magic in commands that do not implement
-	# attr magic. This test does not advocate git-add to stay that way,
-	# though, but git-add is convenient as it has its own internal pathspec
-	# parsing.
-	test_must_fail git add ":(attr:labelB)" 2>actual &&
+	# attr magic. This test does not advocate check-ignore to stay that way.
+	# When you teach the command to grok the pathspec, you need to find
+	# another command to replace it for the test.
+	test_must_fail git check-ignore ":(attr:labelB)" 2>actual &&
 	test_i18ngrep "magic not supported" actual
 '
 
+test_expect_success 'check that attr magic works for git stash push' '
+	cat <<-\EOF >expect &&
+	A	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git stash push --include-untracked -- ":(exclude,attr:labelB)" &&
+	git stash show --include-untracked --name-status >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --all' '
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached >actual &&
+	git restore -W -S . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add -u' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add -u ":(exclude,attr:labelB)" &&
+	git diff --name-only --cached  >actual &&
+	git restore -S -W . && rm sub/new* &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add <path>' '
+	cat <<-\EOF >expect &&
+	fileA
+	fileB
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	git add ":(exclude,attr:labelB)sub/*" &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git -add .' '
+	cat <<-\EOF >expect &&
+	sub/fileA
+	EOF
+	>fileA &&
+	>fileB &&
+	>sub/fileA &&
+	>sub/fileB &&
+	cd sub &&
+	git add . ":(exclude,attr:labelB)" &&
+	cd .. &&
+	git diff --name-only --cached >actual &&
+	git restore -S -W . &&
+	test_cmp expect actual
+'
+
+test_expect_success 'check that attr magic works for git add --pathspec-from-file' '
+	cat <<-\EOF >pathspec_file &&
+	:(exclude,attr:labelB)
+	EOF
+	cat <<-\EOF >expect &&
+	sub/newFileA-foo
+	EOF
+	>sub/newFileA-foo &&
+	>sub/newFileB-foo &&
+	git add --all --pathspec-from-file=pathspec_file &&
+	git diff --name-only --cached >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'abort on giving invalid label on the command line' '
 	test_must_fail git ls-files . ":(attr:☺)"
 '
-- 
2.43.0-rc0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-04  5:11               ` Junio C Hamano
@ 2023-11-04  6:28                 ` Eric Sunshine
  2023-11-04  7:14                   ` Junio C Hamano
  2023-11-09 23:27                 ` Junio C Hamano
  1 sibling, 1 reply; 13+ messages in thread
From: Eric Sunshine @ 2023-11-04  6:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Joanna Wang, git

On Sat, Nov 4, 2023 at 1:12 AM Junio C Hamano <gitster@pobox.com> wrote:
> > +                     if ((magic & pathspec_magic[i].bit) && (pathspec_magic[i].mnemonic != '\0')) {
>
> Overly long, with a stylo.
>
> Here is what I ended up queuing.
>
> +               for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
> +                       if ((magic & pathspec_magic[i].bit) &&
> +                           (pathspec_magic[i].mnemonic)) {

Nit: The no-value-added parentheses around
`pathspec_magic[i].mnemonic` can also be dropped.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-04  6:28                 ` Eric Sunshine
@ 2023-11-04  7:14                   ` Junio C Hamano
  0 siblings, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2023-11-04  7:14 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Joanna Wang, git

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Sat, Nov 4, 2023 at 1:12 AM Junio C Hamano <gitster@pobox.com> wrote:
>> > +                     if ((magic & pathspec_magic[i].bit) && (pathspec_magic[i].mnemonic != '\0')) {
>>
>> Overly long, with a stylo.
>>
>> Here is what I ended up queuing.
>>
>> +               for (int i = 0; i < ARRAY_SIZE(pathspec_magic); i++) {
>> +                       if ((magic & pathspec_magic[i].bit) &&
>> +                           (pathspec_magic[i].mnemonic)) {
>
> Nit: The no-value-added parentheses around
> `pathspec_magic[i].mnemonic` can also be dropped.

Yeah, thanks for spotting.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
  2023-11-04  5:11               ` Junio C Hamano
  2023-11-04  6:28                 ` Eric Sunshine
@ 2023-11-09 23:27                 ` Junio C Hamano
  1 sibling, 0 replies; 13+ messages in thread
From: Junio C Hamano @ 2023-11-09 23:27 UTC (permalink / raw)
  To: Joanna Wang; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Thanks.  The compiled result from this version looks quite good.  As
> you started a new round with typofix, let me start the final
> nitpicking.
> ...
> Here is what I ended up queuing.
>
> Thanks.

Any comments on the "counterproposal to fix nitpicks"?  I'd like to
make sure the original author is either OK with the update (in which
case I can just move what I have in my tree forward) or unhappy with
it and plan to send in an update on their own (in which case I can
wait for the next iteration), in a case like this.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash
@ 2023-11-10  2:09 Joanna Wang
  0 siblings, 0 replies; 13+ messages in thread
From: Joanna Wang @ 2023-11-10  2:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Joanna Wang

whoops no comments. thank you for the review and making the fixes!

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-11-10  2:09 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-10  2:09 [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash Joanna Wang
  -- strict thread matches above, loose matches on Subject: below --
2023-10-11 20:20 [PATCH 1/1] add: enable attr pathspec magic Joanna Wang
2023-11-02  1:55 ` [PATCH 1/1] attr: enable attr pathspec magic for git-add and git-stash Joanna Wang
2023-11-02  5:00   ` Junio C Hamano
2023-11-02 17:53     ` Joanna Wang
2023-11-02 21:32       ` Junio C Hamano
2023-11-02 23:45         ` Junio C Hamano
2023-11-03 14:35           ` Joanna Wang
2023-11-03 15:31             ` Ruben Safir
2023-11-03 16:34             ` Joanna Wang
2023-11-04  5:11               ` Junio C Hamano
2023-11-04  6:28                 ` Eric Sunshine
2023-11-04  7:14                   ` Junio C Hamano
2023-11-09 23:27                 ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).