git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Non-blob .gitmodules and .gitattributes
@ 2024-06-14 12:03 Alexey Pelykh
  2024-06-14 15:35 ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Alexey Pelykh @ 2024-06-14 12:03 UTC (permalink / raw)
  To: git

Greetings to all!

I've stumbled upon a check that ensures that .gitmodules and .gitattributes must be blobs, yet while I get why they should not be e.g. a symlink, what's the downside of permitting (and ignoring for the actual purpose) e.g. .gitattributes folder?

To better understand my use-case:

$ ls -lA dev/shared
total 8
drwxr-xr-x   3 alexey-pelykh  staff   96 Jun 14 11:53 .editorconfig
drwxr-xr-x   3 alexey-pelykh  staff   96 Jun 14 13:32 .gitattributes
drwxr-xr-x  14 alexey-pelykh  staff  448 Jun 14 11:53 .gitignore
-rwxr-xr-x   1 alexey-pelykh  staff  613 Jun 14 13:33 share

and

$ ls -lA dev/shared/.*
dev/shared/.editorconfig:
total 8
-rw-r--r--  1 alexey-pelykh  staff  377 Jun 14 11:53 shared.editorconfig

dev/shared/.gitattributes:
total 8
-rw-r--r--  1 alexey-pelykh  staff  143 Jun 14 13:32 shared.gitattributes

dev/shared/.gitignore:
total 8
-rw-r--r--  1 alexey-pelykh  staff  4450 Jun 14 11:53 shared.gitignore

Surely, the immediate workaround is use a "_gitattributes" or ".gitattributes_" name for the folder, yet if a directory with .gitignore can exist - why can't .gitmodule or .gitattributes?

Be well,
Alexey

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Non-blob .gitmodules and .gitattributes
  2024-06-14 12:03 Non-blob .gitmodules and .gitattributes Alexey Pelykh
@ 2024-06-14 15:35 ` Junio C Hamano
  2024-06-14 15:43   ` Alexey Pelykh
  2024-06-18 18:31   ` Jeff King
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2024-06-14 15:35 UTC (permalink / raw)
  To: git; +Cc: Alexey Pelykh

Alexey Pelykh <alexey.pelykh@gmail.com> writes:

> I've stumbled upon a check that ensures that .gitmodules and
> .gitattributes must be blobs,

Alexey, thanks for a report.

For those who want to take a look, unfortunately Alexey does not
show any error message or describe the end-user visible effect that
the "check" caused, so there is a bit too little to go on.

> yet while I get why they should not
> be e.g. a symlink, what's the downside of permitting (and ignoring
> for the actual purpose) e.g. .gitattributes folder?

My knee-jerk reaction is that we probably can safely loosen by
ignoring non-blob .gitWHATEVER files, but security-minded folks may
be able to come up with some plausible attack scenarios if we did
so.

Comments from those who have worked on transfer time and runtime
checks on these are highly appreciated.

Having said that, the checks for .gitmodules and .gitattributes in
fsck.c first collect objects that tree entries with these names
point at into oidsets (this all happens in fsck.c:fsck_tree()), but
the actual check for these found objects are done only when they are
blobs.  Only when we encounter a blob object, these oidsets are
looked at in fsck.c:fsck_blob(), and if it is .gitmodules its
contents inspected (and may result in a warning or an error).  So
the "checks" Alexey reports may not be in the runtime or transfer
time checks done in fsck but something else.  I dunno.

Thanks.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Non-blob .gitmodules and .gitattributes
  2024-06-14 15:35 ` Junio C Hamano
@ 2024-06-14 15:43   ` Alexey Pelykh
  2024-06-18 18:31   ` Jeff King
  1 sibling, 0 replies; 17+ messages in thread
From: Alexey Pelykh @ 2024-06-14 15:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Dear Junio,

Thanks for providing more insight!

The error I'm seeing due to ".gitattributes" folder being present in the diff via the ".gitattributes/shared.gitattributes" file is the following:

error in tree f8db7c96f9daf3ea8486804a3f717a807fc1a1d8: gitattributesBlob: non-blob found at .gitattributes

I'm seeing this on both "git push" and "git fsck". Steps to reproduce:
$ mkdir test
$ cd test
$ git init
$ mkdir -p subdir/.gitattributes
$ touch subdir/.gitattributes/some-file
$ git add touch subdir/.gitattributes/some-file
$ git commit -m "test"

$ git fsck
Checking object directories: 100% (256/256), done.
error in tree 3666f1677ba5c7ec7e69544510a0d8b99a71774a: gitattributesBlob: non-blob found at .gitattributes

git-push will give the same, at least when pushing to GitHub.

Kind regards,
Alexey

> On 14 Jun 2024, at 17:35, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Alexey Pelykh <alexey.pelykh@gmail.com> writes:
> 
>> I've stumbled upon a check that ensures that .gitmodules and
>> .gitattributes must be blobs,
> 
> Alexey, thanks for a report.
> 
> For those who want to take a look, unfortunately Alexey does not
> show any error message or describe the end-user visible effect that
> the "check" caused, so there is a bit too little to go on.
> 
>> yet while I get why they should not
>> be e.g. a symlink, what's the downside of permitting (and ignoring
>> for the actual purpose) e.g. .gitattributes folder?
> 
> My knee-jerk reaction is that we probably can safely loosen by
> ignoring non-blob .gitWHATEVER files, but security-minded folks may
> be able to come up with some plausible attack scenarios if we did
> so.
> 
> Comments from those who have worked on transfer time and runtime
> checks on these are highly appreciated.
> 
> Having said that, the checks for .gitmodules and .gitattributes in
> fsck.c first collect objects that tree entries with these names
> point at into oidsets (this all happens in fsck.c:fsck_tree()), but
> the actual check for these found objects are done only when they are
> blobs.  Only when we encounter a blob object, these oidsets are
> looked at in fsck.c:fsck_blob(), and if it is .gitmodules its
> contents inspected (and may result in a warning or an error).  So
> the "checks" Alexey reports may not be in the runtime or transfer
> time checks done in fsck but something else.  I dunno.
> 
> Thanks.
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Non-blob .gitmodules and .gitattributes
  2024-06-14 15:35 ` Junio C Hamano
  2024-06-14 15:43   ` Alexey Pelykh
@ 2024-06-18 18:31   ` Jeff King
  2024-06-18 19:07     ` Alexey Pelykh
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2024-06-18 18:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Alexey Pelykh

On Fri, Jun 14, 2024 at 08:35:56AM -0700, Junio C Hamano wrote:

> My knee-jerk reaction is that we probably can safely loosen by
> ignoring non-blob .gitWHATEVER files, but security-minded folks may
> be able to come up with some plausible attack scenarios if we did
> so.
> 
> Comments from those who have worked on transfer time and runtime
> checks on these are highly appreciated.

I can't think of an immediate problem security-wise, since we know that
Git won't actually read the trees. In fact, I was a little surprised
that normal Git commands outside of fsck did not run into problems with
a .gitattributes directory. What happens on Linux, at least, is that we
try to fopen() the directory, which works, and then read() returns
EISDIR. But because we do so through strbuf_getline(), it just looks
like EOF and we don't complain.  Arguably that code (and all the other
loops like it) should check ferror() and complain.

But I'd also suspect that other non-POSIX platforms would see the error
at open(), and _would_ actually produce an error message. I seem to
recall running into this before with Windows, maybe?

So even if we loosened fsck, I'm not sure if it's something we want to
support. And of course my bigger question is: why? These are reserved
names that have special meaning to Git. Sticking stuff that Git doesn't
understand and may produce errors for seems odd.

> Having said that, the checks for .gitmodules and .gitattributes in
> fsck.c first collect objects that tree entries with these names
> point at into oidsets (this all happens in fsck.c:fsck_tree()), but
> the actual check for these found objects are done only when they are
> blobs.  Only when we encounter a blob object, these oidsets are
> looked at in fsck.c:fsck_blob(), and if it is .gitmodules its
> contents inspected (and may result in a warning or an error).  So
> the "checks" Alexey reports may not be in the runtime or transfer
> time checks done in fsck but something else.  I dunno.

The fsck checks will kick in. When we fsck the containing tree, we learn
that some oid X is a .gitmodules file, and queue that. The hope is that
later we're fsck-ing X anyway, and we get to validate for free-ish. But
if we _don't_ see it (either we checked the blob before the tree, or
perhaps the blob is not even part of the set of objects we're checking),
then at the end we validate any queued items that are left.

We have to do it this way to catch the case of somebody pushing up blob
X in one push (perhaps with the name "foo"), and then doing a second
push that references it with a new name (i.e., renaming "foo" to
".gitmodules").

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Non-blob .gitmodules and .gitattributes
  2024-06-18 18:31   ` Jeff King
@ 2024-06-18 19:07     ` Alexey Pelykh
  2024-06-18 20:14       ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Alexey Pelykh @ 2024-06-18 19:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

Hi Jeff,

> But I'd also suspect that other non-POSIX platforms would see the error
> at open(), and _would_ actually produce an error message. I seem to
> recall running into this before with Windows, maybe?

Is there an easy way to verify "what on Windows" part? I'd be happy to
help, yet I'm not sure I got it right what to look for.

> So even if we loosened fsck, I'm not sure if it's something we want to
> support. And of course my bigger question is: why? These are reserved
> names that have special meaning to Git. Sticking stuff that Git doesn't
> understand and may produce errors for seems odd.

Surely, reserved names are reserved for some reason. If there's a legit
reason alike cost-to-support, having an objection would be dumb. Yet
if supporting would turn out to be of an effort alike dropping the check
then it would seem having that check brings no value.

With that said, if those are reserved names, why .gitignore is not reserved?
For consistency at least.

Cheers,
Alexey

> On 18 Jun 2024, at 20:31, Jeff King <peff@peff.net> wrote:
> 
> On Fri, Jun 14, 2024 at 08:35:56AM -0700, Junio C Hamano wrote:
> 
>> My knee-jerk reaction is that we probably can safely loosen by
>> ignoring non-blob .gitWHATEVER files, but security-minded folks may
>> be able to come up with some plausible attack scenarios if we did
>> so.
>> 
>> Comments from those who have worked on transfer time and runtime
>> checks on these are highly appreciated.
> 
> I can't think of an immediate problem security-wise, since we know that
> Git won't actually read the trees. In fact, I was a little surprised
> that normal Git commands outside of fsck did not run into problems with
> a .gitattributes directory. What happens on Linux, at least, is that we
> try to fopen() the directory, which works, and then read() returns
> EISDIR. But because we do so through strbuf_getline(), it just looks
> like EOF and we don't complain.  Arguably that code (and all the other
> loops like it) should check ferror() and complain.
> 
> But I'd also suspect that other non-POSIX platforms would see the error
> at open(), and _would_ actually produce an error message. I seem to
> recall running into this before with Windows, maybe?
> 
> So even if we loosened fsck, I'm not sure if it's something we want to
> support. And of course my bigger question is: why? These are reserved
> names that have special meaning to Git. Sticking stuff that Git doesn't
> understand and may produce errors for seems odd.
> 
>> Having said that, the checks for .gitmodules and .gitattributes in
>> fsck.c first collect objects that tree entries with these names
>> point at into oidsets (this all happens in fsck.c:fsck_tree()), but
>> the actual check for these found objects are done only when they are
>> blobs.  Only when we encounter a blob object, these oidsets are
>> looked at in fsck.c:fsck_blob(), and if it is .gitmodules its
>> contents inspected (and may result in a warning or an error).  So
>> the "checks" Alexey reports may not be in the runtime or transfer
>> time checks done in fsck but something else.  I dunno.
> 
> The fsck checks will kick in. When we fsck the containing tree, we learn
> that some oid X is a .gitmodules file, and queue that. The hope is that
> later we're fsck-ing X anyway, and we get to validate for free-ish. But
> if we _don't_ see it (either we checked the blob before the tree, or
> perhaps the blob is not even part of the set of objects we're checking),
> then at the end we validate any queued items that are left.
> 
> We have to do it this way to catch the case of somebody pushing up blob
> X in one push (perhaps with the name "foo"), and then doing a second
> push that references it with a new name (i.e., renaming "foo" to
> ".gitmodules").
> 
> -Peff


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Non-blob .gitmodules and .gitattributes
  2024-06-18 19:07     ` Alexey Pelykh
@ 2024-06-18 20:14       ` Junio C Hamano
  2024-06-18 23:33         ` Jeff King
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2024-06-18 20:14 UTC (permalink / raw)
  To: Alexey Pelykh; +Cc: Jeff King, git

Alexey Pelykh <alexey.pelykh@gmail.com> writes:

>> But I'd also suspect that other non-POSIX platforms would see the error
>> at open(), and _would_ actually produce an error message. I seem to
>> recall running into this before with Windows, maybe?
>
> Is there an easy way to verify "what on Windows" part? I'd be happy to
> help, yet I'm not sure I got it right what to look for.

Create a .gitmodules and .gitattributes directory, "git add" it, and
perform various operations that want to read them, probably.  Even a
simple "git diff" should try to consult the attribute system (e.g.
because it wants to know if a path needs use custom function header
regexp pattern).  As Peff said, on Linux and probably on macOS, we
will silently ignore such .gitattributes and that is what we want.
On Windows we may see "cannot open" error reported and visible to
the users.

> Surely, reserved names are reserved for some reason. If there's a legit
> reason alike cost-to-support, having an objection would be dumb. Yet
> if supporting would turn out to be of an effort alike dropping the check
> then it would seem having that check brings no value.
>
> With that said, if those are reserved names, why .gitignore is not reserved?
> For consistency at least.

We do not bother with ".gitignore" since we see no security
implications in that file.  But other two whose name begin with
".git" do have some security implications (actually .gitattributes
is designed not to have any, but .gitmodules certainly does), so
we choose to inspect their sizes, contents, and types.

But that raises a few more questions.

What to do about our future needs that may conflict the needs of
users who want to use names of their choice?  If we declare that
any name that begin with ".git" is reserved, there will be fewer
issues, but do we want to and can we afford to?

Also what if we later find security implications in ".gitignore"
files and decide to inspect their sizes, contents, and types?  The
current system not issuing a warning does not give users any
guarantee that the future systems won't (even though we try hard to
avoid introducing such backward incompatible changes, we are human,
too, and we sometimes screw up).

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Non-blob .gitmodules and .gitattributes
  2024-06-18 20:14       ` Junio C Hamano
@ 2024-06-18 23:33         ` Jeff King
  2024-06-18 23:44           ` [PATCH 0/4] .git{ignore,attributes} directories? Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff King @ 2024-06-18 23:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Alexey Pelykh, git

On Tue, Jun 18, 2024 at 01:14:57PM -0700, Junio C Hamano wrote:

> Alexey Pelykh <alexey.pelykh@gmail.com> writes:
> 
> >> But I'd also suspect that other non-POSIX platforms would see the error
> >> at open(), and _would_ actually produce an error message. I seem to
> >> recall running into this before with Windows, maybe?
> >
> > Is there an easy way to verify "what on Windows" part? I'd be happy to
> > help, yet I'm not sure I got it right what to look for.
> 
> Create a .gitmodules and .gitattributes directory, "git add" it, and
> perform various operations that want to read them, probably.  Even a
> simple "git diff" should try to consult the attribute system (e.g.
> because it wants to know if a path needs use custom function header
> regexp pattern).  As Peff said, on Linux and probably on macOS, we
> will silently ignore such .gitattributes and that is what we want.
> On Windows we may see "cannot open" error reported and visible to
> the users.

I don't think you even need to "git add" it. Just making it in the
working tree is enough for us to look at it. You probably do need an
actual change in a tracked file for "git diff" to bother looking up
attributes (an alternative is just "git show" on an existing commit).

Interestingly, a directory .gitmodules _does_ complain:

  $ git status
  warning: unable to access '/home/peff/tmp/.gitmodules': Is a directory
  On branch main
  nothing to commit, working tree clean

That's because it uses fopen(), rather than open(). And we have a
wrapper that complains about directories there, triggered by the
FREAD_READS_DIRECTORIES knob in the Makefile. It's set for Linux, but
not for Windows (which is why I suspect Windows might just complain by
itself). We don't see that for ".gitattributes" or ".gitignore" because
those use a raw open() syscall, and Linux is happy to allow it. Arguably
that's a bug, as the intent of that knob is that we'd notice and avoid
trying to read from directories.

So even putting fsck warnings aside, I think it's mostly luck that
having .git* files as directories doesn't produce noisy warnings during
normal use.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 0/4] .git{ignore,attributes} directories?
  2024-06-18 23:33         ` Jeff King
@ 2024-06-18 23:44           ` Junio C Hamano
  2024-06-18 23:44             ` [PATCH 1/4] .gitignore: introduce GITIGNORE_FILE CPP macro Junio C Hamano
                               ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Junio C Hamano @ 2024-06-18 23:44 UTC (permalink / raw)
  To: git

We seem to blindly open .gitignore that is a directory and rely on
subsequent read() to be more-or-less silent to implement a "non-file
.gitignore is silently ignored" behaviour.

Let's be a bit more strict in detecting and reporting I/O errors,
and also stop reading from directories.

I think the first three are reasonable changes, but the last one is
of dubious value.

Junio C Hamano (4):
  .gitignore: introduce GITIGNORE_FILE CPP macro
  attr: notice and report read failure
  exclude: notice and report read failure of .gitignore files
  submodule: ignore .gitmodules that is not a regular file

 attr.c                | 10 +++++++++-
 builtin/read-tree.c   |  3 ++-
 dir.c                 | 16 ++++++++++++++--
 dir.h                 |  1 +
 environment.h         |  1 +
 submodule-config.c    |  2 +-
 t/t0003-attributes.sh |  9 +++++++++
 t/t0008-ignores.sh    | 18 ++++++++++++++++++
 8 files changed, 55 insertions(+), 5 deletions(-)

-- 
2.45.2-711-gd2c001ca14


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] .gitignore: introduce GITIGNORE_FILE CPP macro
  2024-06-18 23:44           ` [PATCH 0/4] .git{ignore,attributes} directories? Junio C Hamano
@ 2024-06-18 23:44             ` Junio C Hamano
  2024-06-18 23:44             ` [PATCH 2/4] attr: notice and report read failure of .gitattributes files Junio C Hamano
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2024-06-18 23:44 UTC (permalink / raw)
  To: git

GITATTRIBUTES_FILE and GITMODULES_FILE have their own CPP macros
that resolve to their string literals to help compilers catch typos.
Add one for ".gitignore" as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/read-tree.c | 3 ++-
 dir.c               | 4 ++--
 environment.h       | 1 +
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index a8cf8504b8..bc12f5bd16 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -6,6 +6,7 @@
 
 #include "builtin.h"
 #include "config.h"
+#include "environment.h"
 #include "gettext.h"
 #include "hex.h"
 #include "lockfile.h"
@@ -65,7 +66,7 @@ static int exclude_per_directory_cb(const struct option *opt, const char *arg,
 
 	if (!opts->update)
 		die("--exclude-per-directory is meaningless unless -u");
-	if (strcmp(arg, ".gitignore"))
+	if (strcmp(arg, GITIGNORE_FILE))
 		die("--exclude-per-directory argument must be .gitignore");
 	return 0;
 }
diff --git a/dir.c b/dir.c
index 45be4ad261..ec875e3878 100644
--- a/dir.c
+++ b/dir.c
@@ -2888,7 +2888,7 @@ static void new_untracked_cache(struct index_state *istate, int flags)
 {
 	struct untracked_cache *uc = xcalloc(1, sizeof(*uc));
 	strbuf_init(&uc->ident, 100);
-	uc->exclude_per_dir = ".gitignore";
+	uc->exclude_per_dir = GITIGNORE_FILE;
 	uc->dir_flags = flags >= 0 ? flags : new_untracked_cache_flags(istate);
 	set_untracked_ident(uc);
 	istate->untracked = uc;
@@ -3428,7 +3428,7 @@ static GIT_PATH_FUNC(git_path_info_exclude, "info/exclude")
 
 void setup_standard_excludes(struct dir_struct *dir)
 {
-	dir->exclude_per_dir = ".gitignore";
+	dir->exclude_per_dir = GITIGNORE_FILE;
 
 	/* core.excludesfile defaulting to $XDG_CONFIG_HOME/git/ignore */
 	if (!excludes_file)
diff --git a/environment.h b/environment.h
index e9f01d4d11..39c3c24a3f 100644
--- a/environment.h
+++ b/environment.h
@@ -43,6 +43,7 @@ const char *getenv_safe(struct strvec *argv, const char *name);
 #define GITMODULES_FILE ".gitmodules"
 #define GITMODULES_INDEX ":.gitmodules"
 #define GITMODULES_HEAD "HEAD:.gitmodules"
+#define GITIGNORE_FILE ".gitignore"
 #define GIT_NOTES_REF_ENVIRONMENT "GIT_NOTES_REF"
 #define GIT_NOTES_DEFAULT_REF "refs/notes/commits"
 #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF"
-- 
2.45.2-711-gd2c001ca14


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] attr: notice and report read failure of .gitattributes files
  2024-06-18 23:44           ` [PATCH 0/4] .git{ignore,attributes} directories? Junio C Hamano
  2024-06-18 23:44             ` [PATCH 1/4] .gitignore: introduce GITIGNORE_FILE CPP macro Junio C Hamano
@ 2024-06-18 23:44             ` Junio C Hamano
  2024-06-19  0:21               ` Eric Sunshine
  2024-06-19 13:57               ` Jeff King
  2024-06-18 23:44             ` [PATCH 3/4] exclude: notice and report read failure of .gitignore files Junio C Hamano
  2024-06-18 23:44             ` [PATCH 4/4] submodule: ignore .gitmodules that is not a regular file Junio C Hamano
  3 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2024-06-18 23:44 UTC (permalink / raw)
  To: git

The code that reads .gitattributes files was careless in dealing in
unusual circumstances.

 - It let read errors silently ignored.

 - It tried to read ".gitattributes" that is a directory on
   platforms that allowed open(2) to open directories.

To make the latter consistent with what we do for fopen() on
directories ("git grep" for FREAD_READS_DIRECTORIES for details),
check if we opened a directory, silently close it and return
success.  Catch all read errors before we close and report as
needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 attr.c                | 10 +++++++++-
 t/t0003-attributes.sh |  9 +++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/attr.c b/attr.c
index 300f994ba6..9ab9cf1551 100644
--- a/attr.c
+++ b/attr.c
@@ -747,6 +747,11 @@ static struct attr_stack *read_attr_from_file(const char *path, unsigned flags)
 		fclose(fp);
 		return NULL;
 	}
+	if (S_ISDIR(st.st_mode)) {
+		/* On FREAD_READS_DIRECTORIES platforms */
+		fclose(fp);
+		return NULL;
+	}
 	if (st.st_size >= ATTR_MAX_FILE_SIZE) {
 		warning(_("ignoring overly large gitattributes file '%s'"), path);
 		fclose(fp);
@@ -760,7 +765,10 @@ static struct attr_stack *read_attr_from_file(const char *path, unsigned flags)
 		handle_attr_line(res, buf.buf, path, ++lineno, flags);
 	}
 
-	fclose(fp);
+	if (ferror(fp))
+		warning_errno(_("failed while reading gitattributes file '%s'"), path);
+	if (fclose(fp))
+		warning_errno(_("cannot fclose gitattributes file '%s'"), path);
 	strbuf_release(&buf);
 	return res;
 }
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index 66ccb5889d..783c20146d 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -603,6 +603,15 @@ test_expect_success EXPENSIVE 'large attributes blob ignored' '
 	test_cmp expect err
 '
 
+test_expect_success '.gitattribute directory behaves as if it does not exist' '
+	test_when_finished "rm -fr dir" &&
+	mkdir -p dir/.gitattributes &&
+	>dir/ectory &&
+	git -C dir check-attr --all ectory >out 2>err &&
+	test_grep ! "failed while reading" err &&
+	test_grep ! "cannot fclose" err
+'
+
 test_expect_success 'builtin object mode attributes work (dir and regular paths)' '
 	>normal &&
 	attr_check_object_mode normal 100644 &&
-- 
2.45.2-711-gd2c001ca14


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] exclude: notice and report read failure of .gitignore files
  2024-06-18 23:44           ` [PATCH 0/4] .git{ignore,attributes} directories? Junio C Hamano
  2024-06-18 23:44             ` [PATCH 1/4] .gitignore: introduce GITIGNORE_FILE CPP macro Junio C Hamano
  2024-06-18 23:44             ` [PATCH 2/4] attr: notice and report read failure of .gitattributes files Junio C Hamano
@ 2024-06-18 23:44             ` Junio C Hamano
  2024-06-18 23:44             ` [PATCH 4/4] submodule: ignore .gitmodules that is not a regular file Junio C Hamano
  3 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2024-06-18 23:44 UTC (permalink / raw)
  To: git

The code that reads .gitignore files was careless in dealing in
unusual circumstances.

 - It let read errors silently ignored.

 - It tried to read ".gitignore" that is a directory on platforms
   that allow open(2) to open directories.

To make the latter consistent with what we do for fopen() on
directories ("git grep" for FREAD_READS_DIRECTORIES for details),
check if we opened a directory, silently close it and return
success.  Catch all read errors before we close and report as
needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 dir.c              |  6 ++++++
 t/t0008-ignores.sh | 18 ++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/dir.c b/dir.c
index ec875e3878..73f89f4d8c 100644
--- a/dir.c
+++ b/dir.c
@@ -1127,6 +1127,10 @@ static int add_patterns(const char *fname, const char *base, int baselen,
 						       oid_stat);
 		if (r != 1)
 			return r;
+	} else if (S_ISDIR(st.st_mode)) {
+		/* On FREAD_READS_DIRECTORIES platforms */
+		close(fd);
+		return 0;
 	} else {
 		size = xsize_t(st.st_size);
 		if (size == 0) {
@@ -1140,6 +1144,8 @@ static int add_patterns(const char *fname, const char *base, int baselen,
 		}
 		buf = xmallocz(size);
 		if (read_in_full(fd, buf, size) != size) {
+			warning_errno(_("failed while reading gitignore file '%s'"),
+				      fname);
 			free(buf);
 			close(fd);
 			return -1;
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 02a18d4fdb..c367824c66 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -953,4 +953,22 @@ test_expect_success EXPENSIVE 'large exclude file ignored in tree' '
 	test_cmp expect err
 '
 
+test_expect_success POSIXPERM 'unreadable exclude file reported' '
+	test_when_finished "rm -f .gitignore" &&
+	>.gitignore &&
+	chmod a= .gitignore &&
+	# we do not care if the pattern matches
+	{ git check-ignore xyzzy 2>err || :; } &&
+	test_grep "unable to access ${SQ}\.gitignore${SQ}:" err
+'
+
+test_expect_success '.gitignore directory ignored' '
+	test_when_finished "rm -rf .gitignore" &&
+	rm -f .gitignore &&
+	mkdir .gitignore &&
+	# we do not care if the pattern matches
+	{ git check-ignore xyzzy 2>err || :; } &&
+	test_grep ! "failed while reading gitignore file ${SQ}\.gitignore${SQ}:" err
+'
+
 test_done
-- 
2.45.2-711-gd2c001ca14


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] submodule: ignore .gitmodules that is not a regular file
  2024-06-18 23:44           ` [PATCH 0/4] .git{ignore,attributes} directories? Junio C Hamano
                               ` (2 preceding siblings ...)
  2024-06-18 23:44             ` [PATCH 3/4] exclude: notice and report read failure of .gitignore files Junio C Hamano
@ 2024-06-18 23:44             ` Junio C Hamano
  3 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2024-06-18 23:44 UTC (permalink / raw)
  To: git

If .gitmodules exists in the working tree but is a directory, it
would have just tried to use it as if it were a file.  On a platform
that needs FREAD_READS_DIRECTORIES, this would have been hidden by
our own fopen() that pretends as if directory did not exist, so it
is a no-op.  Just to add some documentation value, make sure we
check with file_exists_as_file() instead of file_exists(), the
latter of which will be happy as long as the given path exists no
matter what it is.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 dir.c              | 6 ++++++
 dir.h              | 1 +
 submodule-config.c | 2 +-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 73f89f4d8c..d943da93df 100644
--- a/dir.c
+++ b/dir.c
@@ -3142,6 +3142,12 @@ int file_exists(const char *f)
 	return lstat(f, &sb) == 0;
 }
 
+int file_exists_as_file(const char *path)
+{
+	struct stat st;
+	return lstat(path, &st) == 0 && S_ISREG(st.st_mode);
+}
+
 int repo_file_exists(struct repository *repo, const char *path)
 {
 	if (repo != the_repository)
diff --git a/dir.h b/dir.h
index 1398a53fb4..3612dbbf9e 100644
--- a/dir.h
+++ b/dir.h
@@ -475,6 +475,7 @@ void dir_clear(struct dir_struct *dir);
 
 int repo_file_exists(struct repository *repo, const char *path);
 int file_exists(const char *);
+int file_exists_as_file(const char *);
 
 int is_inside_dir(const char *dir);
 int dir_inside_of(const char *subdir, const char *dir);
diff --git a/submodule-config.c b/submodule-config.c
index ec45ea67b9..6c18ae3764 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -801,7 +801,7 @@ static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void
 		char *oidstr = NULL;
 
 		file = repo_worktree_path(repo, GITMODULES_FILE);
-		if (file_exists(file)) {
+		if (file_exists_as_file(file)) {
 			config_source.file = file;
 		} else if (repo_get_oid(repo, GITMODULES_INDEX, &oid) >= 0 ||
 			   repo_get_oid(repo, GITMODULES_HEAD, &oid) >= 0) {
-- 
2.45.2-711-gd2c001ca14


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] attr: notice and report read failure of .gitattributes files
  2024-06-18 23:44             ` [PATCH 2/4] attr: notice and report read failure of .gitattributes files Junio C Hamano
@ 2024-06-19  0:21               ` Eric Sunshine
  2024-06-19  1:18                 ` Junio C Hamano
  2024-06-19 13:57               ` Jeff King
  1 sibling, 1 reply; 17+ messages in thread
From: Eric Sunshine @ 2024-06-19  0:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jun 18, 2024 at 7:46 PM Junio C Hamano <gitster@pobox.com> wrote:
> The code that reads .gitattributes files was careless in dealing in
> unusual circumstances.
>
>  - It let read errors silently ignored.

ECANTPARSE: perhaps? s/errors/& be/

>  - It tried to read ".gitattributes" that is a directory on
>    platforms that allowed open(2) to open directories.
>
> To make the latter consistent with what we do for fopen() on
> directories ("git grep" for FREAD_READS_DIRECTORIES for details),
> check if we opened a directory, silently close it and return
> success.  Catch all read errors before we close and report as
> needed.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] attr: notice and report read failure of .gitattributes files
  2024-06-19  0:21               ` Eric Sunshine
@ 2024-06-19  1:18                 ` Junio C Hamano
  2024-06-19  2:35                   ` Eric Sunshine
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2024-06-19  1:18 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git

Eric Sunshine <sunshine@sunshineco.com> writes:

> ECANTPARSE: perhaps? s/errors/& be/

You're right.  Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] attr: notice and report read failure of .gitattributes files
  2024-06-19  1:18                 ` Junio C Hamano
@ 2024-06-19  2:35                   ` Eric Sunshine
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Sunshine @ 2024-06-19  2:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jun 18, 2024 at 9:18 PM Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> > ECANTPARSE: perhaps? s/errors/& be/
>
> You're right.  Thanks.

The same problem occurs in the commit message of patch [3/4].

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] attr: notice and report read failure of .gitattributes files
  2024-06-18 23:44             ` [PATCH 2/4] attr: notice and report read failure of .gitattributes files Junio C Hamano
  2024-06-19  0:21               ` Eric Sunshine
@ 2024-06-19 13:57               ` Jeff King
  2024-06-20 16:20                 ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2024-06-19 13:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, Jun 18, 2024 at 04:44:33PM -0700, Junio C Hamano wrote:

> The code that reads .gitattributes files was careless in dealing in
> unusual circumstances.
> 
>  - It let read errors silently ignored.
> 
>  - It tried to read ".gitattributes" that is a directory on
>    platforms that allowed open(2) to open directories.
> 
> To make the latter consistent with what we do for fopen() on
> directories ("git grep" for FREAD_READS_DIRECTORIES for details),
> check if we opened a directory, silently close it and return
> success.  Catch all read errors before we close and report as
> needed.

Makes sense, but...

> diff --git a/attr.c b/attr.c
> index 300f994ba6..9ab9cf1551 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -747,6 +747,11 @@ static struct attr_stack *read_attr_from_file(const char *path, unsigned flags)
>  		fclose(fp);
>  		return NULL;
>  	}
> +	if (S_ISDIR(st.st_mode)) {
> +		/* On FREAD_READS_DIRECTORIES platforms */
> +		fclose(fp);
> +		return NULL;
> +	}

I don't know that this is really consistent with callers of fopen(),
since they tend to complain noisily. Usually via warn_on_fopen_errors(),
which we ourselves call above.

I.e., if we were using fopen() here rather than open()+fdopen(), our
call would likewise complain noisily. And we even did so prior to
2ef579e261 (attr: do not respect symlinks for in-tree .gitattributes,
2021-02-16)! We had to switch to open() then to use O_NOFOLLOW.

And I notice that every caller of open_nofollow() does an fdopen()
anyway. So I wonder if the better solution would be to make an
fopen_nofollow() that handles the FREAD_READS_DIRECTORIES stuff itself,
just like fopen() does?


That is also an interesting data point regarding the "is it sane to have
directory .gitattributes" question. Older versions would definitely
complain about it (once per file in the diff even):

  $ mkdir .gitattributes
  $ git.v2.31.0 show >/dev/null
  warning: unable to access '.gitattributes': Is a directory
  warning: unable to access '.gitattributes': Is a directory

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] attr: notice and report read failure of .gitattributes files
  2024-06-19 13:57               ` Jeff King
@ 2024-06-20 16:20                 ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2024-06-20 16:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> I don't know that this is really consistent with callers of fopen(),
> since they tend to complain noisily. Usually via warn_on_fopen_errors(),
> which we ourselves call above.

Argh, you're right.  I thought the guard in warn_on_fopen_errors()
would catch the artificial error code we give when we found that we
opened a directory (by mistake), but the error we use is EISDIR, so
it will be shown.

OK, let's scrap the whole thing.  It does not look like it is
solving any real-world problems.

Thanks.


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-06-20 16:20 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-14 12:03 Non-blob .gitmodules and .gitattributes Alexey Pelykh
2024-06-14 15:35 ` Junio C Hamano
2024-06-14 15:43   ` Alexey Pelykh
2024-06-18 18:31   ` Jeff King
2024-06-18 19:07     ` Alexey Pelykh
2024-06-18 20:14       ` Junio C Hamano
2024-06-18 23:33         ` Jeff King
2024-06-18 23:44           ` [PATCH 0/4] .git{ignore,attributes} directories? Junio C Hamano
2024-06-18 23:44             ` [PATCH 1/4] .gitignore: introduce GITIGNORE_FILE CPP macro Junio C Hamano
2024-06-18 23:44             ` [PATCH 2/4] attr: notice and report read failure of .gitattributes files Junio C Hamano
2024-06-19  0:21               ` Eric Sunshine
2024-06-19  1:18                 ` Junio C Hamano
2024-06-19  2:35                   ` Eric Sunshine
2024-06-19 13:57               ` Jeff King
2024-06-20 16:20                 ` Junio C Hamano
2024-06-18 23:44             ` [PATCH 3/4] exclude: notice and report read failure of .gitignore files Junio C Hamano
2024-06-18 23:44             ` [PATCH 4/4] submodule: ignore .gitmodules that is not a regular file Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).