* [PATCH 1/1] attr: add native file mode values support
@ 2023-11-11 4:03 Joanna Wang
2023-11-11 4:48 ` Junio C Hamano
2023-11-13 16:50 ` Torsten Bögershausen
0 siblings, 2 replies; 6+ messages in thread
From: Joanna Wang @ 2023-11-11 4:03 UTC (permalink / raw)
To: git; +Cc: Joanna Wang
Gives all paths inherent 'mode' attribute values based on the paths'
modes (one of 100644, 100755, 120000, 040000, 160000). Users may use
this feature to filter by file types. For example a pathspec such as
':(attr:mode=160000)' could filter for submodules without needing
`mode=160000` to actually be specified for each subdmoule path in
.gitattributes. The native mode values are also reflected in
`git check-attr` results.
If there is any existing mode attribute for a path (e.g. there is
!mode, -mode, mode, mode=<value> in .gitattributes) that setting will
take precedence over the native mode value.
---
I went with 'mode' because that is the word used in documentation
(e.g. https://git-scm.com/book/sv/v2/Git-Internals-Git-Objects)
and in the code (e.g. `ce_mode`). Please let me know what you think
of this UX.
The implementation happens within git_check_attr() method which is
the common mathod called for getting a pathspec attribute value.
The previous discussed idea did not work with `git check-attr`.
(https://lore.kernel.org/all/CAMmZTi8swsSMcLUcW+YwUDg8GcrY_ks2+i35-nsHE3o9MNpsUQ@mail.gmail.com/).
There are no tests for excluding based on pathspec attrs in subdirectories
due to an existing bug. Since it is not specific to native mode, I thought
it would be better to fix separately.
https://lore.kernel.org/all/CAMmZTi89Un+bsLXdEdYs44oT8eLNp8y=Pm8ywaurcQ7ccRKGdQ@mail.gmail.com/
Documentation/gitattributes.txt | 10 +++++++
attr.c | 42 ++++++++++++++++++++++++--
t/t0003-attributes.sh | 40 +++++++++++++++++++++++++
t/t6135-pathspec-with-attrs.sh | 53 +++++++++++++++++++++++++++++++++
4 files changed, 143 insertions(+), 2 deletions(-)
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8c1793c148..bb3c11f151 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -156,6 +156,16 @@ Unspecified::
Any other value causes Git to act as if `text` has been left
unspecified.
+`mode`
+^^^^^
+
+This attribute is for filtering files by their file bit modes
+(40000, 120000, 160000, 100755, 100644) and has native support in git,
+meaning values for this attribute are natively set (e.g. mode=100644) by
+git without having to specify them in .gitattributes. However, if
+'mode' is set in .gitattributest for some path, that value takes precendence,
+whether it is 'set', 'unset', 'unspecified', or some other value.
+
`eol`
^^^^^
diff --git a/attr.c b/attr.c
index e62876dfd3..95dc1cf695 100644
--- a/attr.c
+++ b/attr.c
@@ -1240,20 +1240,58 @@ static struct object_id *default_attr_source(void)
return &attr_source;
}
+
+/*
+ * This implements native file mode attr support.
+ * We always return the current mode of the path, not the mode stored
+ * in the index, unless the path is a directory where we check the index
+ * to see if it is a GITLINK. It is ok to rely on the index for GITLINK
+ * modes because a path cannot become a GITLINK (which is a git concept only)
+ * without having it indexed with a GITLINK mode in git.
+ */
+static unsigned int native_mode_attr(struct index_state *istate, const char *path)
+{
+ struct stat st;
+ unsigned int mode;
+ if (lstat(path, &st))
+ die_errno(_("unable to stat '%s'"), path);
+ mode = canon_mode(st.st_mode);
+ if (S_ISDIR(mode)) {
+ int pos = index_name_pos(istate, path, strlen(path));
+ if (pos >= 0)
+ if (S_ISGITLINK(istate->cache[pos]->ce_mode))
+ return istate->cache[pos]->ce_mode;
+ }
+ return mode;
+}
+
+
void git_check_attr(struct index_state *istate,
const char *path,
struct attr_check *check)
{
int i;
const struct object_id *tree_oid = default_attr_source();
+ struct strbuf sb = STRBUF_INIT;
collect_some_attrs(istate, tree_oid, path, check);
for (i = 0; i < check->nr; i++) {
unsigned int n = check->items[i].attr->attr_nr;
const char *value = check->all_attrs[n].value;
- if (value == ATTR__UNKNOWN)
- value = ATTR__UNSET;
+ if (value == ATTR__UNKNOWN){
+ if (strcmp(check->all_attrs[n].attr->name, "mode") == 0) {
+ /*
+ * If value is ATTR_UNKNOWN then it has not been set
+ * anywhere with -mode (ATTR_FALSE), !mode (ATTR_UNSET),
+ * mode (ATTR_TRUE), or an explicit value. We fill
+ * value with the native mode value.
+ */
+ strbuf_addf(&sb, "%06o", native_mode_attr(istate, path));
+ value = sb.buf;
+ } else
+ value = ATTR__UNSET;
+ }
check->items[i].value = value;
}
}
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index aee2298f01..9c2603d8e2 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -19,6 +19,15 @@ attr_check () {
test_must_be_empty err
}
+attr_check_mode () {
+ path="$1" expect="$2" git_opts="$3" &&
+
+ git $git_opts check-attr mode -- "$path" >actual 2>err &&
+ echo "$path: mode: $expect" >expect &&
+ test_cmp expect actual &&
+ test_must_be_empty err
+}
+
attr_check_quote () {
path="$1" quoted_path="$2" expect="$3" &&
@@ -558,4 +567,35 @@ test_expect_success EXPENSIVE 'large attributes file ignored in index' '
test_cmp expect err
'
+test_expect_success 'submodule with .git file' '
+ mkdir sub &&
+ (cd sub &&
+ git init &&
+ mv .git .real &&
+ echo "gitdir: .real" >.git &&
+ test_commit first) &&
+ git update-index --add -- sub
+'
+
+test_expect_success 'native mode attributes work' '
+ >exec && chmod +x exec && attr_check_mode exec 100755 &&
+ >normal && attr_check_mode normal 100644 &&
+ mkdir dir && attr_check_mode dir 040000 &&
+ ln -s normal normal_sym && attr_check_mode normal_sym 120000 &&
+ attr_check_mode sub 160000
+'
+
+test_expect_success '.gitattributes mode values take precedence' '
+ (
+ echo "mode_value* mode=myownmode" &&
+ echo "mode_set* mode" &&
+ echo "mode_unset* -mode" &&
+ echo "mode_unspecified* !mode"
+ ) >.gitattributes &&
+ >mode_value && attr_check_mode mode_value myownmode &&
+ >mode_unset && attr_check_mode mode_unset unset &&
+ >mode_unspecified && attr_check_mode mode_unspecified unspecified &&
+ >mode_set && attr_check_mode mode_set set
+'
+
test_done
diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
index a9c1e4e0ec..fd9569d39b 100755
--- a/t/t6135-pathspec-with-attrs.sh
+++ b/t/t6135-pathspec-with-attrs.sh
@@ -64,6 +64,9 @@ test_expect_success 'setup .gitattributes' '
fileSetLabel label
fileValue label=foo
fileWrongLabel label☺
+ mode_set* mode=1234
+ mode_unset* -mode
+ mode_unspecified* !mode
EOF
echo fileSetLabel label1 >sub/.gitattributes &&
git add .gitattributes sub/.gitattributes &&
@@ -295,4 +298,54 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
test_cmp expect actual
'
+test_expect_success 'mode attr is handled correctly for overriden values' '
+ >mode_set_1 &&
+ >mode_unset_1 &&
+ >mode_unspecified_1 &&
+ >mode_regular_file_1 &&
+
+ git status -s ":(attr:mode=1234)mode*" >actual &&
+ cat <<-\EOF >expect &&
+ ?? mode_set_1
+ EOF
+ test_cmp expect actual &&
+
+ git status -s ":(attr:-mode)mode*" >actual &&
+ echo ?? mode_unset_1 >expect &&
+ test_cmp expect actual &&
+
+ git status -s ":(attr:!mode)mode*" >actual &&
+ echo ?? mode_unspecified_1 >expect &&
+ test_cmp expect actual &&
+
+ git status -s ":(attr:mode=100644)mode*" >actual &&
+ echo ?? mode_regular_file_1 >expect &&
+ test_cmp expect actual
+'
+
+test_expect_success 'mode attr values are current file modes, not indexed modes' '
+ >mode_exec_file_1 &&
+
+ git status -s ":(attr:mode=100644)mode_exec_*" >actual &&
+ echo ?? mode_exec_file_1 >expect &&
+ test_cmp expect actual &&
+
+ git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
+ git status -s ":(attr:mode=100755)mode_exec_*" >actual &&
+ echo AM mode_exec_file_1 >expect &&
+ test_cmp expect actual
+'
+
+test_expect_success 'mode attr can be excluded' '
+ >mode_1_regular &&
+ >mode_1_exec && chmod +x mode_1_exec &&
+ git status -s ":(exclude,attr:mode=100644)" "mode_1_*" >actual &&
+ echo ?? mode_1_exec >expect &&
+ test_cmp expect actual &&
+
+ git status -s ":(exclude,attr:mode=100755)" "mode_1_*" >actual &&
+ echo ?? mode_1_regular >expect &&
+ test_cmp expect actual
+'
+
test_done
--
2.42.0.869.gea05f2083d-goog
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] attr: add native file mode values support
2023-11-11 4:03 [PATCH 1/1] attr: add native file mode values support Joanna Wang
@ 2023-11-11 4:48 ` Junio C Hamano
2023-11-13 16:50 ` Torsten Bögershausen
1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2023-11-11 4:48 UTC (permalink / raw)
To: Joanna Wang; +Cc: git
Joanna Wang <jojwang@google.com> writes:
> +/*
> + * This implements native file mode attr support.
> + * We always return the current mode of the path, not the mode stored
> + * in the index, unless the path is a directory where we check the index
> + * to see if it is a GITLINK. It is ok to rely on the index for GITLINK
> + * modes because a path cannot become a GITLINK (which is a git concept only)
> + * without having it indexed with a GITLINK mode in git.
> + */
> +static unsigned int native_mode_attr(struct index_state *istate, const char *path)
> +{
> + struct stat st;
> + unsigned int mode;
Please have a blank line here between decls and the first statement.
> + if (lstat(path, &st))
> + die_errno(_("unable to stat '%s'"), path);
For checking in, this is probably OK, but don't we need to switch
between lstat() on the filesystem entity and inspecting ce_mode()
for the in-index cache_entry, depending on the git_attr_direction?
> + mode = canon_mode(st.st_mode);
> + if (S_ISDIR(mode)) {
> + int pos = index_name_pos(istate, path, strlen(path));
> + if (pos >= 0)
> + if (S_ISGITLINK(istate->cache[pos]->ce_mode))
> + return istate->cache[pos]->ce_mode;
Even if we assume that this code is currently meant to work only
with GIT_ATTR_CHECKIN, I do not think this is what you want. When
asked to perform "git add . ':(exclude,mode=160000)'", you not only
want to exclude the submodules that are already known to this
superproject, but also a repository that _can_ become a submodule of
this superproject when added, no? You are missing "if (pos < 0)"
case where you'd probably want to see if the directory at the path
looks like the top of a working tree with ".git" subdirectory that
is a valid repository, or something like that to detect such a case.
On the other hand, the GIT_ATTR_CHECKOUT direction is hopefully much
simpler. You'd see what the path in the index is, among a gitlink,
a regular non-executable file, an executable file, or a symlink.
> + }
> + return mode;
> +}
> +
> +
> void git_check_attr(struct index_state *istate,
> const char *path,
> struct attr_check *check)
> {
> int i;
> const struct object_id *tree_oid = default_attr_source();
> + struct strbuf sb = STRBUF_INIT;
>
> collect_some_attrs(istate, tree_oid, path, check);
>
> for (i = 0; i < check->nr; i++) {
> unsigned int n = check->items[i].attr->attr_nr;
> const char *value = check->all_attrs[n].value;
> - if (value == ATTR__UNKNOWN)
> - value = ATTR__UNSET;
> + if (value == ATTR__UNKNOWN){
Style. Missing SP between "){".
> + if (strcmp(check->all_attrs[n].attr->name, "mode") == 0) {
Style. We avoid comparison with 0; so say
if (!strcmp(check->all_attrs[n].attr->name, "mode") == 0) {
instead.
It is disturbing that we always need to perform a string comparison
in this loop (and the function is called repeatedly while traversing
the tree looking for paths that match pathspec). The attribute objects
are designed to be interned, so at least you should be able to do
mode_attr = git_attr("mode");
before the loop and then compare check->all_attrs[n].attr and
mode_attr as two addresses.
> + /*
> + * If value is ATTR_UNKNOWN then it has not been set
> + * anywhere with -mode (ATTR_FALSE), !mode (ATTR_UNSET),
> + * mode (ATTR_TRUE), or an explicit value. We fill
> + * value with the native mode value.
> + */
> + strbuf_addf(&sb, "%06o", native_mode_attr(istate, path));
> + value = sb.buf;
Or even better yet, as this is not a place to write too many lines
for a single oddball attribute, it might make even more sense to
introduce a helper function and make a call to it like so:
if (value == ATTR__UNKNOWN)
- value = ATTR__UNSET;
+ value = compute_attr_from_path(istate, path, check_items[i].attr);
with the new helper function do all the heavy lifting, e.g.,
static const char *compute_attr_from_path(struct index_state *istate,
const char *path,
const struct git_attr *attr) {
static struct git_attr mode_attr;
if (!mode_attr)
mode_attr = git_attr("mode");
if (attr == mode_attr) {
call native_mode_attr(), stringify, and
return it;
}
return ATTR__UNSET;
}
which will allow us to easily add in the future special attribute
similar to "mode" whose fallback value is computed.
Otherwise, looking pretty good. Thanks for working on this.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] attr: add native file mode values support
2023-11-11 4:03 [PATCH 1/1] attr: add native file mode values support Joanna Wang
2023-11-11 4:48 ` Junio C Hamano
@ 2023-11-13 16:50 ` Torsten Bögershausen
2023-11-13 22:22 ` Junio C Hamano
1 sibling, 1 reply; 6+ messages in thread
From: Torsten Bögershausen @ 2023-11-13 16:50 UTC (permalink / raw)
To: Joanna Wang; +Cc: git
On Sat, Nov 11, 2023 at 04:03:08AM +0000, Joanna Wang wrote:
Some thoughts and comments inline...
> Gives all paths inherent 'mode' attribute values based on the paths'
> modes (one of 100644, 100755, 120000, 040000, 160000). Users may use
> this feature to filter by file types. For example a pathspec such as
> ':(attr:mode=160000)' could filter for submodules without needing
My spontanous feeling is that filetype may be another choice:
> ':(attr:filetype=160000)' could filter for submodules without needing
And having written this, we can think using something borrowed from
`find . -type f`
:(attr:filetype=f)' or :(attr:filetype=x)' (for executable)
> `mode=160000` to actually be specified for each subdmoule path in
> .gitattributes. The native mode values are also reflected in
> `git check-attr` results.
But then I missed the point why we need an attribute here?
The mode is already defined by the the file system (and Git),
is there a special reason that the user can define or re-define the
value here ?
May be there is, when working with pathspec.
But then "pathspec=" could be a better construction.
Since "mode" could make a reader think that Git does somewhat with the file
when checking out.
My personal hope reading "mode=100755" in .gitattributes would
be that Git makes it executable when checking out, if if it is
recorded in Git as 100644, probably coming from a file-system that
doesn't support the executable bit in a Unix way.
> If there is any existing mode attribute for a path (e.g. there is
> !mode, -mode, mode, mode=<value> in .gitattributes) that setting will
> take precedence over the native mode value.
>
> ---
>
> I went with 'mode' because that is the word used in documentation
> (e.g. https://git-scm.com/book/sv/v2/Git-Internals-Git-Objects)
> and in the code (e.g. `ce_mode`). Please let me know what you think
> of this UX.
>
> The implementation happens within git_check_attr() method which is
> the common mathod called for getting a pathspec attribute value.
>
> The previous discussed idea did not work with `git check-attr`.
> (https://lore.kernel.org/all/CAMmZTi8swsSMcLUcW+YwUDg8GcrY_ks2+i35-nsHE3o9MNpsUQ@mail.gmail.com/).
>
> There are no tests for excluding based on pathspec attrs in subdirectories
> due to an existing bug. Since it is not specific to native mode, I thought
> it would be better to fix separately.
> https://lore.kernel.org/all/CAMmZTi89Un+bsLXdEdYs44oT8eLNp8y=Pm8ywaurcQ7ccRKGdQ@mail.gmail.com/
Reading "pathspec attrs" above it feels better to call the new attribute pathspec.
But why should a user be allowed to overwrite what we have in the index ?
That is somewhat missing in the motivation for this change,
it would be good to have this explained in the commit message.
>
> Documentation/gitattributes.txt | 10 +++++++
> attr.c | 42 ++++++++++++++++++++++++--
> t/t0003-attributes.sh | 40 +++++++++++++++++++++++++
> t/t6135-pathspec-with-attrs.sh | 53 +++++++++++++++++++++++++++++++++
> 4 files changed, 143 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index 8c1793c148..bb3c11f151 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -156,6 +156,16 @@ Unspecified::
> Any other value causes Git to act as if `text` has been left
> unspecified.
>
> +`mode`
> +^^^^^
> +
> +This attribute is for filtering files by their file bit modes
> +(40000, 120000, 160000, 100755, 100644) and has native support in git,
> +meaning values for this attribute are natively set (e.g. mode=100644) by
> +git without having to specify them in .gitattributes. However, if
> +'mode' is set in .gitattributest for some path, that value takes precendence,
> +whether it is 'set', 'unset', 'unspecified', or some other value.
> +
> `eol`
> ^^^^^
>
> diff --git a/attr.c b/attr.c
> index e62876dfd3..95dc1cf695 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -1240,20 +1240,58 @@ static struct object_id *default_attr_source(void)
> return &attr_source;
> }
>
> +
> +/*
> + * This implements native file mode attr support.
> + * We always return the current mode of the path, not the mode stored
> + * in the index, unless the path is a directory where we check the index
> + * to see if it is a GITLINK. It is ok to rely on the index for GITLINK
> + * modes because a path cannot become a GITLINK (which is a git concept only)
> + * without having it indexed with a GITLINK mode in git.
> + */
> +static unsigned int native_mode_attr(struct index_state *istate, const char *path)
> +{
> + struct stat st;
> + unsigned int mode;
> + if (lstat(path, &st))
> + die_errno(_("unable to stat '%s'"), path);
> + mode = canon_mode(st.st_mode);
> + if (S_ISDIR(mode)) {
> + int pos = index_name_pos(istate, path, strlen(path));
> + if (pos >= 0)
> + if (S_ISGITLINK(istate->cache[pos]->ce_mode))
> + return istate->cache[pos]->ce_mode;
> + }
> + return mode;
> +}
> +
> +
> void git_check_attr(struct index_state *istate,
> const char *path,
> struct attr_check *check)
> {
> int i;
> const struct object_id *tree_oid = default_attr_source();
> + struct strbuf sb = STRBUF_INIT;
>
> collect_some_attrs(istate, tree_oid, path, check);
>
> for (i = 0; i < check->nr; i++) {
> unsigned int n = check->items[i].attr->attr_nr;
> const char *value = check->all_attrs[n].value;
> - if (value == ATTR__UNKNOWN)
> - value = ATTR__UNSET;
> + if (value == ATTR__UNKNOWN){
> + if (strcmp(check->all_attrs[n].attr->name, "mode") == 0) {
> + /*
> + * If value is ATTR_UNKNOWN then it has not been set
> + * anywhere with -mode (ATTR_FALSE), !mode (ATTR_UNSET),
> + * mode (ATTR_TRUE), or an explicit value. We fill
> + * value with the native mode value.
> + */
> + strbuf_addf(&sb, "%06o", native_mode_attr(istate, path));
> + value = sb.buf;
> + } else
> + value = ATTR__UNSET;
> + }
> check->items[i].value = value;
> }
> }
> diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
> index aee2298f01..9c2603d8e2 100755
> --- a/t/t0003-attributes.sh
> +++ b/t/t0003-attributes.sh
> @@ -19,6 +19,15 @@ attr_check () {
> test_must_be_empty err
> }
>
> +attr_check_mode () {
> + path="$1" expect="$2" git_opts="$3" &&
It would be easier to read, if each assignment is on it's on line
> +
> + git $git_opts check-attr mode -- "$path" >actual 2>err &&
> + echo "$path: mode: $expect" >expect &&
> + test_cmp expect actual &&
> + test_must_be_empty err
> +}
> +
> attr_check_quote () {
> path="$1" quoted_path="$2" expect="$3" &&
>
> @@ -558,4 +567,35 @@ test_expect_success EXPENSIVE 'large attributes file ignored in index' '
> test_cmp expect err
> '
>
> +test_expect_success 'submodule with .git file' '
> + mkdir sub &&
> + (cd sub &&
> + git init &&
> + mv .git .real &&
> + echo "gitdir: .real" >.git &&
> + test_commit first) &&
> + git update-index --add -- sub
Style and indentation (Please use TAB for indenting)
Using sub-shells is good. Somewhat easier to read would be this:
mkdir sub &&
(
cd sub &&
git init &&
mv .git .real &&
echo "gitdir: .real" >.git &&
test_commit first
) &&
> +'
> +
> +test_expect_success 'native mode attributes work' '
> + >exec && chmod +x exec && attr_check_mode exec 100755 &&
> + >normal && attr_check_mode normal 100644 &&
> + mkdir dir && attr_check_mode dir 040000 &&
> + ln -s normal normal_sym && attr_check_mode normal_sym 120000 &&
> + attr_check_mode sub 160000
> +'
We need a precondition here:
test_expect_success SYMLINKS
> +
> +test_expect_success '.gitattributes mode values take precedence' '
> + (
> + echo "mode_value* mode=myownmode" &&
> + echo "mode_set* mode" &&
> + echo "mode_unset* -mode" &&
> + echo "mode_unspecified* !mode"
> + ) >.gitattributes &&
Can this be written using a
cat <<-\EOF >expect &&
EOF
expression ?
> + >mode_value && attr_check_mode mode_value myownmode &&
> + >mode_unset && attr_check_mode mode_unset unset &&
> + >mode_unspecified && attr_check_mode mode_unspecified unspecified &&
> + >mode_set && attr_check_mode mode_set set
> +'
> +
> test_done
> diff --git a/t/t6135-pathspec-with-attrs.sh b/t/t6135-pathspec-with-attrs.sh
> index a9c1e4e0ec..fd9569d39b 100755
> --- a/t/t6135-pathspec-with-attrs.sh
> +++ b/t/t6135-pathspec-with-attrs.sh
> @@ -64,6 +64,9 @@ test_expect_success 'setup .gitattributes' '
> fileSetLabel label
> fileValue label=foo
> fileWrongLabel label☺
> + mode_set* mode=1234
> + mode_unset* -mode
> + mode_unspecified* !mode
> EOF
> echo fileSetLabel label1 >sub/.gitattributes &&
> git add .gitattributes sub/.gitattributes &&
> @@ -295,4 +298,54 @@ test_expect_success 'reading from .gitattributes in a subdirectory (3)' '
> test_cmp expect actual
> '
>
> +test_expect_success 'mode attr is handled correctly for overriden values' '
> + >mode_set_1 &&
> + >mode_unset_1 &&
> + >mode_unspecified_1 &&
> + >mode_regular_file_1 &&
> +
> + git status -s ":(attr:mode=1234)mode*" >actual &&
> + cat <<-\EOF >expect &&
> + ?? mode_set_1
> + EOF
> + test_cmp expect actual &&
> +
> + git status -s ":(attr:-mode)mode*" >actual &&
> + echo ?? mode_unset_1 >expect &&
> + test_cmp expect actual &&
> +
> + git status -s ":(attr:!mode)mode*" >actual &&
> + echo ?? mode_unspecified_1 >expect &&
> + test_cmp expect actual &&
> +
> + git status -s ":(attr:mode=100644)mode*" >actual &&
> + echo ?? mode_regular_file_1 >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'mode attr values are current file modes, not indexed modes' '
> + >mode_exec_file_1 &&
> +
> + git status -s ":(attr:mode=100644)mode_exec_*" >actual &&
> + echo ?? mode_exec_file_1 >expect &&
> + test_cmp expect actual &&
> +
> + git add mode_exec_file_1 && chmod +x mode_exec_file_1 &&
> + git status -s ":(attr:mode=100755)mode_exec_*" >actual &&
> + echo AM mode_exec_file_1 >expect &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success 'mode attr can be excluded' '
> + >mode_1_regular &&
> + >mode_1_exec && chmod +x mode_1_exec &&
> + git status -s ":(exclude,attr:mode=100644)" "mode_1_*" >actual &&
> + echo ?? mode_1_exec >expect &&
> + test_cmp expect actual &&
> +
> + git status -s ":(exclude,attr:mode=100755)" "mode_1_*" >actual &&
> + echo ?? mode_1_regular >expect &&
> + test_cmp expect actual
> +'
> +
> test_done
> --
> 2.42.0.869.gea05f2083d-goog
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] attr: add native file mode values support
2023-11-13 16:50 ` Torsten Bögershausen
@ 2023-11-13 22:22 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2023-11-13 22:22 UTC (permalink / raw)
To: Torsten Bögershausen; +Cc: Joanna Wang, git
Torsten Bögershausen <tboegi@web.de> writes:
> On Sat, Nov 11, 2023 at 04:03:08AM +0000, Joanna Wang wrote:
>
> Some thoughts and comments inline...
>
>> Gives all paths inherent 'mode' attribute values based on the paths'
>> modes (one of 100644, 100755, 120000, 040000, 160000). Users may use
>> this feature to filter by file types. For example a pathspec such as
>> ':(attr:mode=160000)' could filter for submodules without needing
>
> My spontanous feeling is that filetype may be another choice:
>> ':(attr:filetype=160000)' could filter for submodules without needing
I do agree that "mode" invites "mode of what???" reaction, and that
a term that narrows the scope would be preferrable. "Filemode" is a
bit questionable, though, as we give this permbits to non-files like
submodules. "ls-tree" documentation seems to call it %(objectmode).
> And having written this, we can think using something borrowed from
> `find . -type f`
>
> :(attr:filetype=f)' or :(attr:filetype=x)' (for executable)
This would not work for submodules, though. Naively one might want
to abuse 'd' but I suspect we would eventually want to be able to
give the mode bits to an out-of-cone directory storeed in the index
as a tree in a cone-mode sparse checkout, which would be 040000,
which deserves 'd' more than submodules.
> But then I missed the point why we need an attribute here?
> The mode is already defined by the the file system (and Git),
> is there a special reason that the user can define or re-define the
> value here ?
I think the idea is that "mode" being a too generic word can be used
for totally different purposes in existing projects and the addition
did not want to disturb their own use. But stepping back a bit,
such an application is likely marking selected few paths with the
attribute, and paths for which "mode" was "unset" are now given
these natural "mode"; it is inevitable to crash with such uses. If
we want to introduce "native" attributes of this kind, we would
probably need to carve out namespaces a bit more clearaly.
> May be there is, when working with pathspec.
> But then "pathspec=" could be a better construction.
> Since "mode" could make a reader think that Git does somewhat with the file
> when checking out.
> My personal hope reading "mode=100755" in .gitattributes would
> be that Git makes it executable when checking out, if if it is
> recorded in Git as 100644, probably coming from a file-system that
> doesn't support the executable bit in a Unix way.
That is not the intended way this attribute is to be used. Perhaps
we should make it an error (or ignored) when certain built-in/native
attributes are seen in the attribute file, but again that takes some
namespace carved out to avoid crashing with end-user names.
>> If there is any existing mode attribute for a path (e.g. there is
>> !mode, -mode, mode, mode=<value> in .gitattributes) that setting will
>> take precedence over the native mode value.
Again, this has one hole, I think. Paths that are not mentioned
(not even with "!mode") would come to the function as ATTR__UNKNOWN
and trigger the fallback behaviour, even when other paths are given
end-user specified "mode" attribute values.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] attr: add native file mode values support
@ 2023-11-14 2:28 Joanna Wang
2023-11-14 2:52 ` Junio C Hamano
0 siblings, 1 reply; 6+ messages in thread
From: Joanna Wang @ 2023-11-14 2:28 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Joanna Wang, tboegi
>> Even if we assume that this code is currently meant to work only
>> with GIT_ATTR_CHECKIN, I do not think this is what you want. When
>> asked to perform "git add . ':(exclude,mode=160000)'", you not only
>> want to exclude the submodules that are already known to this
>> superproject, but also a repository that _can_ become a submodule of
>> this superproject when added, no?
Sorry, I was totally ignorant of two key concepts that you mention here.
I somehow missed the concept of git_attr_direction altogether and I also
did not know submodules could be added with git-add (I was only
familiar with how
they are currently added in chromium via git update-index).
This makes sense now. I'll fix in the next patch.
>> On the other hand, the GIT_ATTR_CHECKOUT direction is hopefully much
>> simpler. You'd see what the path in the index is, among a gitlink,
>> a regular non-executable file, an executable file, or a symlink.
I noticed for both the GIT_ATTR_CHECKOUT and GIT_ATTR_CHECKIN directions,
in read_attr(), the indexed .gitattributes file is checked with the
actual file as fallback or vice versa.
I would think that we'd only want to use attributes from one state
(e.g. what's actually in the file)
or the other (e.g. what's indexed).
So I guess I'm still not sure what the "direction" concept is.
For GIT_ATTR_CHECKOUT, would we want to fallback to lstat?
>> "ls-tree" documentation seems to call it %(objectmode).
I can change it to 'objectmode' in a followup patch, if there are no
objections to this.
>> I think the idea is that "mode" being a too generic word can be used
>> for totally different purposes
My thinking was, no matter how generic or rare a name we choose, there
is always a chance
no matter how tiny, that the name will be in use in someone's .gitattributes.
But if people are ok with choosing something less generic
and have that become a 'reserved' attribute name and not have any
existing values in .gitattributes
take precedence I can do that.
I just don't know how git has historically balanced breaking existing
workflows vs easier/more reasonable
implementation/behavior.
>> But stepping back a bit,
>> such an application is likely marking selected few paths with the
>> attribute, and paths for which "mode" was "unset" are now given
>> these natural "mode"; it is inevitable to crash with such uses.
I'm confused. This does not match what I think is the current behavior
of my patch.
If "mode" was unset or removed for a path (meaning '<path> !mode' was
added to .gitattributes),
the code in my patch would respect that and not return the native mode.
It would return 'unspecified' or 'unset'. I have tests for these in
the patch, but if I've missed some
test cases please let me know.
>> Again, this has one hole, I think. Paths that are not mentioned
>> (not even with "!mode") would come to the function as ATTR__UNKNOWN
>> and trigger the fallback behaviour, even when other paths are given
>> end-user specified "mode" attribute values.
What you are describing sounds correct/what I intended.
So are you saying that the expected behavior is actually:
If the user sets 'mode' for 1+ paths in the repo, then the native mode
fallback should
NOT be used for all paths in the repo?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/1] attr: add native file mode values support
2023-11-14 2:28 Joanna Wang
@ 2023-11-14 2:52 ` Junio C Hamano
0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2023-11-14 2:52 UTC (permalink / raw)
To: Joanna Wang; +Cc: git, tboegi
Joanna Wang <jojwang@google.com> writes:
>>> Even if we assume that this code is currently meant to work only
>>> with GIT_ATTR_CHECKIN, I do not think this is what you want. When
>>> asked to perform "git add . ':(exclude,mode=160000)'", you not only
>>> want to exclude the submodules that are already known to this
>>> superproject, but also a repository that _can_ become a submodule of
>>> this superproject when added, no?
> Sorry, I was totally ignorant of ...
Nothing to be sorry about, and if I sounded like I was upset or
something, that wasn't my intention. Reviewers and more experienced
developers read posted patches to give pieces of advice exactly
because no single human is perfect and knows everything. We cover
holes in each others' knowledge and attention.
>>> On the other hand, the GIT_ATTR_CHECKOUT direction is hopefully much
>>> simpler. You'd see what the path in the index is, among a gitlink,
>>> a regular non-executable file, an executable file, or a symlink.
> I noticed for both the GIT_ATTR_CHECKOUT and GIT_ATTR_CHECKIN directions,
> in read_attr(), the indexed .gitattributes file is checked with the
> actual file as fallback or vice versa.
> I would think that we'd only want to use attributes from one state
> (e.g. what's actually in the file)
> or the other (e.g. what's indexed).
> So I guess I'm still not sure what the "direction" concept is.
> For GIT_ATTR_CHECKOUT, would we want to fallback to lstat?
When checking things out of the index, the index should be the
source of the truth. If something is not in the index, is there a
point in falling back to the workint tree state to decide if the
thing should be checked out of the index?
>>> But stepping back a bit,
>>> such an application is likely marking selected few paths with the
>>> attribute, and paths for which "mode" was "unset" are now given
>>> these natural "mode"; it is inevitable to crash with such uses.
> I'm confused. This does not match what I think is the current behavior
> of my patch.
> If "mode" was unset or removed for a path (meaning '<path> !mode' was
> added to .gitattributes),
> the code in my patch would respect that and not return the native mode.
> It would return 'unspecified' or 'unset'.
But the usual practice is *not* to cover all paths with explicit
attribute definition, isn't it? If somebody used the "foo"
attribute in their project to decide certain paths are worth giving
special treatments, there are paths with that attribute, perhaps
(totally made up example):
*.c foo=yes
Now, if you add a new "built-in" attribute next to 'mode' that
assigns "foo" in attr.c:git_check_attr() the same way to those paths
whose value is still ATTR__UNKNOWN after collect_some_attrs returns,
wouldn't a "hello.c" file get attribute 'foo' with value 'yes',
while a "hello.h" file (not mentioned by .gitattributes) will get
whatever value the built-in logic computed for it? If that existing
project were using "mode" (instead of "foo"), then doesn't this patch
change the behaviour for them?
>>> Again, this has one hole, I think. Paths that are not mentioned
>>> (not even with "!mode") would come to the function as ATTR__UNKNOWN
>>> and trigger the fallback behaviour, even when other paths are given
>>> end-user specified "mode" attribute values.
> What you are describing sounds correct/what I intended.
> So are you saying that the expected behavior is actually:
> If the user sets 'mode' for 1+ paths in the repo, then the native mode
> fallback should
> NOT be used for all paths in the repo?
That might be workable, but it breaks a well working system suddenly
when somebody uses "mode" in their .gitattibutes and we do not even
warn about the breakage, which is not ideal.
I think, when we see such an attribute is defined while reading from
.gitattributes and friends, we would probably want to warn and
ignore their definition altogether and use *only* the computed
value. This cannot be done in a backward compatible way, so it
would be better to carve out a namespace (and avoid overly generic
word like "mode") for ourselves to define built-in attributes that
do not overlap with end-user defined attribute names.
Perhaps call this 'builtin-object-mode' or something and document
that any attribute with a name that begins with 'builtin-' will
always get a computed value (possibly "unset"), it is an error
to define such an attribute in .gitattributes system, or something?
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-11-14 2:53 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-11 4:03 [PATCH 1/1] attr: add native file mode values support Joanna Wang
2023-11-11 4:48 ` Junio C Hamano
2023-11-13 16:50 ` Torsten Bögershausen
2023-11-13 22:22 ` Junio C Hamano
-- strict thread matches above, loose matches on Subject: below --
2023-11-14 2:28 Joanna Wang
2023-11-14 2:52 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).