git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] attr: add native file mode values support
@ 2023-11-11  4:03 Joanna Wang
  2023-11-11  4:48 ` Junio C Hamano
  2023-11-13 16:50 ` Torsten Bögershausen
  0 siblings, 2 replies; 6+ messages in thread
From: Joanna Wang @ 2023-11-11  4:03 UTC (permalink / raw)
  To: git; +Cc: Joanna Wang

Gives all paths inherent 'mode' attribute values based on the paths'
modes (one of 100644, 100755, 120000, 040000, 160000). Users may use
this feature to filter by file types. For example a pathspec such as
':(attr:mode=160000)' could filter for submodules without needing
`mode=160000` to actually be specified for each subdmoule path in
.gitattributes. The native mode values are also reflected in
`git check-attr` results.

If there is any existing mode attribute for a path (e.g. there is
!mode, -mode, mode, mode=<value> in .gitattributes) that setting will
take precedence over the native mode value.

---

I went with 'mode' because that is the word used in documentation
(e.g. https://git-scm.com/book/sv/v2/Git-Internals-Git-Objects)
and in the code (e.g. `ce_mode`). Please let me know what you think
of this UX.

The implementation happens within git_check_attr() method which is
the common mathod called for getting a pathspec attribute value.

The previous discussed idea did not work with `git check-attr`.
(https://lore.kernel.org/all/CAMmZTi8swsSMcLUcW+YwUDg8GcrY_ks2+i35-nsHE3o9MNpsUQ@mail.gmail.com/).

There are no tests for excluding based on pathspec attrs in subdirectories
due to an existing bug. Since it is not specific to native mode, I thought
it would be better to fix separately.
https://lore.kernel.org/all/CAMmZTi89Un+bsLXdEdYs44oT8eLNp8y=Pm8ywaurcQ7ccRKGdQ@mail.gmail.com/

 Documentation/gitattributes.txt | 10 +++++++
 attr.c                          | 42 ++++++++++++++++++++++++--
 t/t0003-attributes.sh           | 40 +++++++++++++++++++++++++
 t/t6135-pathspec-with-attrs.sh  | 53 +++++++++++++++++++++++++++++++++
 4 files changed, 143 insertions(+), 2 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8c1793c148..bb3c11f151 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -156,6 +156,16 @@ Unspecified::
 Any other value causes Git to act as if `text` has been left
 unspecified.
 
+`mode`
+^^^^^
+
+This attribute is for filtering files by their file bit modes
+(40000, 120000, 160000, 100755, 100644) and has native support in git,
+meaning values for this attribute are natively set (e.g. mode=100644) by
+git without having to specify them in .gitattributes. However, if
+'mode' is set in .gitattributest for some path, that value takes precendence,
+whether it is 'set', 'unset', 'unspecified', or some other value.
+
 `eol`
 ^^^^^
 
diff --git a/attr.c b/attr.c
index e62876dfd3..95dc1cf695 100644
--- a/attr.c
+++ b/attr.c
@@ -1240,20 +1240,58 @@ static struct object_id *default_attr_source(void)
 	return &attr_source;
 }
 
+
+/*
+ * This implements native file mode attr support.
+ * We always return the current mode of the path, not the mode stored
+ * in the index, unless the path is a directory where we check the index
+ * to see if it is a GITLINK. It is ok to rely on the index for GITLINK
+ * modes because a path cannot become a GITLINK (which is a git concept only)
+ * without having it indexed with a GITLINK mode in git.
+ */
+static unsigned int native_mode_attr(struct index_state *istate, const char *path)
+{
+	struct stat st;
+	unsigned int mode;
+	if (lstat(path, &st))
+		die_errno(_("unable to stat '%s'"), path);
+	mode = canon_mode(st.st_mode);
+	if (S_ISDIR(mode)) {
+		int pos = index_name_pos(istate, path, strlen(path));
+		if (pos >= 0)
+			if (S_ISGITLINK(istate->cache[pos]->ce_mode))
+				return istate->cache[pos]->ce_mode;
+	}
+	return mode;
+}
+
+
 void git_check_attr(struct index_state *istate,
 		    const char *path,
 		    struct attr_check *check)
 {
 	int i;
 	const struct object_id *tree_oid = default_attr_source();
+	struct strbuf sb = STRBUF_INIT;
 
 	collect_some_attrs(istate, tree_oid, path, check);
 
 	for (i = 0; i < check->nr; i++) {
 		unsigned int n = check->items[i].attr->attr_nr;
 		const char *value = check->all_attrs[n].value;
-		if (value == ATTR__UNKNOWN)
-			value = ATTR__UNSET;
+		if (value == ATTR__UNKNOWN){
+			if (strcmp(check->all_attrs[n].attr->name, "mode") == 0) {
+				/*
+				 * If value is ATTR_UNKNOWN then it has not been set
+				 * anywhere with -mode (ATTR_FALSE), !mode (ATTR_UNSET),
+				 * mode (ATTR_TRUE), or an explicit value. We fill
+				 * value with the native mode value.
+				 */
+				strbuf_addf(&sb, "%06o", native_mode_attr(istate, path));
+				value = sb.buf;
+			} else
+				value = ATTR__UNSET;
+		}
 		check->items[i].value = value;
 	}
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index aee2298f01..9c2603d8e2 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -19,6 +19,15 @@ attr_check () {
 	test_must_be_empty err
 }
 
+attr_check_mode () {
+	path="$1" expect="$2" git_opts="$3" &&
+
+	git $git_opts check-attr mode -- "$path" >actual 2>err &&
+	echo "$path: mode: $expect" >expect &&
+	test_cmp expect actual &&
+	test_must_be_empty err
+}
+
 attr_check_quote () {
 	path="$1" quoted_path="$2" expect="$3" &&
 
@@ -558,4 +567,35 @@ test_expect_success EXPENSIVE 'large attributes file ignored in index' '
 	test_cmp expect err
 '
 
+test_expect_success 'submodule with .git file' '
+	mkdir sub &&
+	(cd sub &&
+	git init &&
+	mv .git .real &&
+	echo "gitdir: .real" >.git &&
+	test_commit first) &&
+	git update-index --add -- sub
+'
+
+test_expect_success 'native mode attributes work' '
+	>exec && chmod +x exec && attr_check_mode exec 100755 &&
+	>normal && attr_check_mode normal 100644 &&
+	mkdir dir && attr_check_mode dir 040000 &&
+	ln -s normal normal_sym && attr_check_mode normal_sym 120000 &&
+	attr_check_mode sub 160000
+'
+
+test_expect_success '.gitattributes mode values take precedence' '
+	(
+		echo "mode_value* mode=myownmode" &&
+		echo "mode_set* mode" &&
+		echo "mode_unset* -mode" &&
+		echo "mode_unspecified* !mode"
+	) >.gitattributes &&
+	>mode_value && attr_check_mode mode_value myownmode &&
+	>mode_unset && attr_check_mode mode_unset unset &&
+	>mode_unspecified && attr_check_mode mode_unspecified unspecified &&
+	>mode_set && attr_check_mode mode_set set
+'
+
 test_done
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index a9c1e4e0ec..fd9569d39b 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,6 +64,9 @@ test_expect_success 'setup .gitattributes' '
 	fileSetLabel label
 	fileValue label=foo
 	fileWrongLabel label☺
+	mode_set* mode=1234
+	mode_unset* -mode
+	mode_unspecified* !mode
 	EOF
 	echo fileSetLabel label1 >sub/.gitattributes &&
 	git add .gitattributes sub/.gitattributes &&
@@ -295,4 +298,54 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
 	test_cmp expect actual
 '
 
+test_expect_success 'mode attr is handled correctly for overriden values' '
+	>mode_set_1 &&
+	>mode_unset_1 &&
+	>mode_unspecified_1 &&
+	>mode_regular_file_1 &&
+
+	git status -s ":(attr:mode=1234)mode*" >actual &&
+	cat <<-\EOF >expect &&
+	?? mode_set_1
+	EOF
+	test_cmp expect actual &&
+
+	git status -s ":(attr:-mode)mode*" >actual &&
+	echo ?? mode_unset_1 >expect &&
+	test_cmp expect actual &&
+
+	git status -s ":(attr:!mode)mode*" >actual &&
+	echo ?? mode_unspecified_1 >expect &&
+	test_cmp expect actual &&
+
+	git status -s ":(attr:mode=100644)mode*" >actual &&
+	echo ?? mode_regular_file_1 >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'mode attr values are current file modes, not indexed modes' '
+	>mode_exec_file_1 &&
+
+	git status -s ":(attr:mode=100644)mode_exec_*" >actual &&
+	echo ?? mode_exec_file_1 >expect &&
+	test_cmp expect actual &&
+
+	git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
+	git status -s ":(attr:mode=100755)mode_exec_*" >actual &&
+	echo AM mode_exec_file_1 >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'mode attr can be excluded' '
+	>mode_1_regular &&
+	>mode_1_exec  && chmod +x mode_1_exec &&
+	git status -s ":(exclude,attr:mode=100644)" "mode_1_*" >actual &&
+	echo ?? mode_1_exec >expect &&
+	test_cmp expect actual &&
+
+	git status -s ":(exclude,attr:mode=100755)" "mode_1_*" >actual &&
+	echo ?? mode_1_regular >expect &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] attr: add native file mode values support
@ 2023-11-14  2:28 Joanna Wang
  2023-11-14  2:52 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Joanna Wang @ 2023-11-14  2:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Joanna Wang, tboegi

>> Even if we assume that this code is currently meant to work only
>> with GIT_ATTR_CHECKIN, I do not think this is what you want.  When
>> asked to perform "git add . ':(exclude,mode=160000)'", you not only
>> want to exclude the submodules that are already known to this
>> superproject, but also a repository that _can_ become a submodule of
>> this superproject when added, no?
Sorry, I was totally ignorant of two key concepts that you mention here.
I somehow missed the concept of git_attr_direction altogether and I also
did not know submodules could be added with git-add (I was only
familiar with how
they are currently added in chromium via git update-index).
This makes sense now. I'll fix in the next patch.

>> On the other hand, the GIT_ATTR_CHECKOUT direction is hopefully much
>> simpler.  You'd see what the path in the index is, among a gitlink,
>> a regular non-executable file, an executable file, or a symlink.
I noticed for both the GIT_ATTR_CHECKOUT and GIT_ATTR_CHECKIN directions,
in read_attr(), the indexed .gitattributes file is checked with the
actual file as fallback or vice versa.
I would think that we'd only want to use attributes from one state
(e.g. what's actually in the file)
or the other (e.g. what's indexed).
So I guess I'm still not sure what the "direction" concept is.
For GIT_ATTR_CHECKOUT, would we want to fallback to lstat?

>> "ls-tree" documentation seems to call it %(objectmode).
I can change it to 'objectmode' in a followup patch, if there are no
objections to this.

>> I think the idea is that "mode" being a too generic word can be used
>> for totally different purposes
My thinking was, no matter how generic or rare a name we choose, there
is always a chance
no matter how tiny, that the name will be in use in someone's .gitattributes.
But if people are ok with choosing something less generic
and have that become a 'reserved' attribute name and not have any
existing values in .gitattributes
take precedence I can do that.
I just don't know how git has historically balanced breaking existing
workflows vs easier/more reasonable
implementation/behavior.

>> But stepping back a bit,
>> such an application is likely marking selected few paths with the
>> attribute, and paths for which "mode" was "unset" are now given
>> these natural "mode"; it is inevitable to crash with such uses.
I'm confused. This does not match what I think is the current behavior
of my patch.
If "mode" was unset or removed for a path (meaning '<path> !mode' was
added to .gitattributes),
the code in my patch would respect that and not return the native mode.
It would return 'unspecified' or 'unset'. I have tests for these in
the patch, but if I've missed some
test cases please let me know.


>> Again, this has one hole, I think.  Paths that are not mentioned
>> (not even with "!mode") would come to the function as ATTR__UNKNOWN
>> and trigger the fallback behaviour, even when other paths are given
>> end-user specified "mode" attribute values.
What you are describing sounds correct/what I intended.
So are you saying that the expected behavior is actually:
If the user sets 'mode' for 1+ paths in the repo, then the native mode
fallback should
NOT be used for all paths in the repo?

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-11-14  2:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-11  4:03 [PATCH 1/1] attr: add native file mode values support Joanna Wang
2023-11-11  4:48 ` Junio C Hamano
2023-11-13 16:50 ` Torsten Bögershausen
2023-11-13 22:22   ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2023-11-14  2:28 Joanna Wang
2023-11-14  2:52 ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).