* [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
@ 2025-05-02 23:33 Collin Funk
2025-05-03 0:57 ` brian m. carlson
2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk
0 siblings, 2 replies; 25+ messages in thread
From: Collin Funk @ 2025-05-02 23:33 UTC (permalink / raw)
To: git; +Cc: shejialuo, Collin Funk, Jeff King, Junio C Hamano
As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a
symlink returns -1 and sets errno to EFTYPE which differs from POSIX.
This patch fixes the following test failure:
$ sh t0602-reffiles-fsck.sh --verbose
--- expect 2025-05-02 23:05:23.920890147 +0000
+++ err 2025-05-02 23:05:23.916794959 +0000
@@ -1 +1 @@
-error: packed-refs: badRefFiletype: not a regular file but a symlink
+error: unable to open '.git/packed-refs': Inappropriate file type or format
not ok 12 - the filetype of packed-refs should be checked
This portability issue was introduced in Commit
cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28)
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
---
wrapper.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/wrapper.c b/wrapper.c
index 3c79778055..4d448d7c57 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename)
int open_nofollow(const char *path, int flags)
{
#ifdef O_NOFOLLOW
- return open(path, flags | O_NOFOLLOW);
+ int ret = open(path, flags | O_NOFOLLOW);
+#ifdef __NetBSD__
+ /*
+ * NetBSD sets errno to EFTYPE when path is a symlink. The only other
+ * time this errno occurs when O_REGULAR is used. Since we don't use
+ * it anywhere we can avoid an lstat here.
+ */
+ if (ret < 0 && errno == EFTYPE) {
+ errno = ELOOP;
+ return -1;
+ }
+#endif
+ return ret;
#else
struct stat st;
if (lstat(path, &st) < 0)
--
2.49.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-02 23:33 [PATCH] wrapper: Fix a errno discrepancy on NetBSD Collin Funk
@ 2025-05-03 0:57 ` brian m. carlson
2025-05-03 1:05 ` Junio C Hamano
` (2 more replies)
2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk
1 sibling, 3 replies; 25+ messages in thread
From: brian m. carlson @ 2025-05-03 0:57 UTC (permalink / raw)
To: Collin Funk; +Cc: git, shejialuo, Jeff King, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 2154 bytes --]
On 2025-05-02 at 23:33:32, Collin Funk wrote:
> As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a
> symlink returns -1 and sets errno to EFTYPE which differs from POSIX.
> This patch fixes the following test failure:
>
> $ sh t0602-reffiles-fsck.sh --verbose
> --- expect 2025-05-02 23:05:23.920890147 +0000
> +++ err 2025-05-02 23:05:23.916794959 +0000
> @@ -1 +1 @@
> -error: packed-refs: badRefFiletype: not a regular file but a symlink
> +error: unable to open '.git/packed-refs': Inappropriate file type or format
> not ok 12 - the filetype of packed-refs should be checked
>
> This portability issue was introduced in Commit
> cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28)
>
> Signed-off-by: Collin Funk <collin.funk1@gmail.com>
> ---
> wrapper.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/wrapper.c b/wrapper.c
> index 3c79778055..4d448d7c57 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename)
> int open_nofollow(const char *path, int flags)
> {
> #ifdef O_NOFOLLOW
> - return open(path, flags | O_NOFOLLOW);
> + int ret = open(path, flags | O_NOFOLLOW);
> +#ifdef __NetBSD__
> + /*
> + * NetBSD sets errno to EFTYPE when path is a symlink. The only other
> + * time this errno occurs when O_REGULAR is used. Since we don't use
> + * it anywhere we can avoid an lstat here.
> + */
> + if (ret < 0 && errno == EFTYPE) {
> + errno = ELOOP;
> + return -1;
> + }
> +#endif
> + return ret;
This patch seems reasonable and correct. I don't use NetBSD, but I do
often test there, and I'm aware of this infelicity. I'm surprised we
haven't hit it before.
I suspect we'll also hit this on FreeBSD, which has a similar issue in
that it returns `EMLINK` instead of `ELOOP`. I do wish these two OSes
would provide an appropriate POSIX-compatible `open` call when set with
`_POSIX_SOURCE`, since this is one of the biggest portability problems
with them.
--
brian m. carlson (they/them)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 325 bytes --]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 0:57 ` brian m. carlson
@ 2025-05-03 1:05 ` Junio C Hamano
2025-05-03 4:21 ` Collin Funk
2025-05-03 3:48 ` Collin Funk
2025-05-03 13:31 ` Jeff King
2 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2025-05-03 1:05 UTC (permalink / raw)
To: brian m. carlson; +Cc: Collin Funk, git, shejialuo, Jeff King
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> On 2025-05-02 at 23:33:32, Collin Funk wrote:
>> As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a
>> symlink returns -1 and sets errno to EFTYPE which differs from POSIX.
>> This patch fixes the following test failure:
>>
>> $ sh t0602-reffiles-fsck.sh --verbose
>> --- expect 2025-05-02 23:05:23.920890147 +0000
>> +++ err 2025-05-02 23:05:23.916794959 +0000
>> @@ -1 +1 @@
>> -error: packed-refs: badRefFiletype: not a regular file but a symlink
>> +error: unable to open '.git/packed-refs': Inappropriate file type or format
>> not ok 12 - the filetype of packed-refs should be checked
>>
>> This portability issue was introduced in Commit
>> cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28)
>>
>> Signed-off-by: Collin Funk <collin.funk1@gmail.com>
>> ---
>> wrapper.c | 14 +++++++++++++-
>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/wrapper.c b/wrapper.c
>> index 3c79778055..4d448d7c57 100644
>> --- a/wrapper.c
>> +++ b/wrapper.c
>> @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename)
>> int open_nofollow(const char *path, int flags)
>> {
>> #ifdef O_NOFOLLOW
>> - return open(path, flags | O_NOFOLLOW);
>> + int ret = open(path, flags | O_NOFOLLOW);
>> +#ifdef __NetBSD__
>> + /*
>> + * NetBSD sets errno to EFTYPE when path is a symlink. The only other
>> + * time this errno occurs when O_REGULAR is used. Since we don't use
>> + * it anywhere we can avoid an lstat here.
>> + */
>> + if (ret < 0 && errno == EFTYPE) {
>> + errno = ELOOP;
>> + return -1;
>> + }
>> +#endif
>> + return ret;
>
> This patch seems reasonable and correct. I don't use NetBSD, but I do
> often test there, and I'm aware of this infelicity. I'm surprised we
> haven't hit it before.
Thanks, both. Will queue after fixing the proposed log message a
bit (the sample must be indented, especially when it contains lines
that look like a patch).
> I suspect we'll also hit this on FreeBSD, which has a similar issue in
> that it returns `EMLINK` instead of `ELOOP`.
I won't expect Collin or you to redo this patch to cover FreeBSD;
anybody with FreeBSD box/vm can do a separate patch on a different
day.
> I do wish these two OSes
> would provide an appropriate POSIX-compatible `open` call when set with
> `_POSIX_SOURCE`, since this is one of the biggest portability problems
> with them.
That may be true, but not something we can fix here X-<.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 0:57 ` brian m. carlson
2025-05-03 1:05 ` Junio C Hamano
@ 2025-05-03 3:48 ` Collin Funk
2025-05-03 13:31 ` Jeff King
2 siblings, 0 replies; 25+ messages in thread
From: Collin Funk @ 2025-05-03 3:48 UTC (permalink / raw)
To: brian m. carlson; +Cc: git, shejialuo, Jeff King, Junio C Hamano
Hi Brian,
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> I suspect we'll also hit this on FreeBSD, which has a similar issue in
> that it returns `EMLINK` instead of `ELOOP`.
Good memory. I can confirm that FreeBSD fails in the same place with a
message for EMLINK. I'll write another patch for that.
> I do wish these two OSes would provide an appropriate POSIX-compatible
> `open` call when set with `_POSIX_SOURCE`, since this is one of the
> biggest portability problems with them.
It is documented in their man pages, so I assume it is intentional. But
I don't see any benefit to differing from POSIX here. I'll see if I can
find any discussion before I submit a bug report.
Collin
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-02 23:33 [PATCH] wrapper: Fix a errno discrepancy on NetBSD Collin Funk
2025-05-03 0:57 ` brian m. carlson
@ 2025-05-03 4:16 ` Collin Funk
2025-05-03 15:45 ` brian m. carlson
` (2 more replies)
1 sibling, 3 replies; 25+ messages in thread
From: Collin Funk @ 2025-05-03 4:16 UTC (permalink / raw)
To: git; +Cc: shejialuo, sandals, Collin Funk, Jeff King, Junio C Hamano
As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a
symlink returns -1 and sets errno to EFTYPE which differs from POSIX.
This patch fixes the following test failure:
$ sh t0602-reffiles-fsck.sh --verbose
--- expect 2025-05-02 23:05:23.920890147 +0000
+++ err 2025-05-02 23:05:23.916794959 +0000
@@ -1 +1 @@
-error: packed-refs: badRefFiletype: not a regular file but a symlink
+error: unable to open '.git/packed-refs': Inappropriate file type or format
not ok 12 - the filetype of packed-refs should be checked
FreeBSD has the same issue for EMLINK instead of EFTYPE.
This portability issue was introduced in cfea2f2da8 (packed-backend:
check whether the "packed-refs" is regular file, 2025-02-28)
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
---
wrapper.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/wrapper.c b/wrapper.c
index 3c79778055..f74e3f7747 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -737,7 +737,26 @@ int is_empty_or_missing_file(const char *filename)
int open_nofollow(const char *path, int flags)
{
#ifdef O_NOFOLLOW
- return open(path, flags | O_NOFOLLOW);
+ int ret = open(path, flags | O_NOFOLLOW);
+ /*
+ * NetBSD sets errno to EFTYPE when path is a symlink. The only other
+ * time this errno occurs when O_REGULAR is used. Since we don't use
+ * it anywhere we can avoid an lstat here. FreeBSD does the same with
+ * EMLINK.
+ */
+#ifdef __NetBSD__
+#define SYMLINK_ERRNO EFTYPE
+#elif defined(__FreeBSD__)
+#define SYMLINK_ERRNO EMLINK
+#endif
+#if SYMLINK_ERRNO
+ if (ret < 0 && errno == SYMLINK_ERRNO) {
+ errno = ELOOP;
+ return -1;
+ }
+#undef SYMLINK_ERRNO
+#endif
+ return ret;
#else
struct stat st;
if (lstat(path, &st) < 0)
--
2.49.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 1:05 ` Junio C Hamano
@ 2025-05-03 4:21 ` Collin Funk
2025-05-03 17:32 ` Junio C Hamano
0 siblings, 1 reply; 25+ messages in thread
From: Collin Funk @ 2025-05-03 4:21 UTC (permalink / raw)
To: Junio C Hamano; +Cc: brian m. carlson, git, shejialuo, Jeff King
Junio C Hamano <gitster@pobox.com> writes:
> Thanks, both. Will queue after fixing the proposed log message a
> bit (the sample must be indented, especially when it contains lines
> that look like a patch).
Yes, that looks better. Thanks!
>> I suspect we'll also hit this on FreeBSD, which has a similar issue in
>> that it returns `EMLINK` instead of `ELOOP`.
>
> I won't expect Collin or you to redo this patch to cover FreeBSD;
> anybody with FreeBSD box/vm can do a separate patch on a different
> day.
It is no problem. I have access to a FreeBSD 14.2 machine. I can confirm
that it fails with EMLINK.
My V2 patch fixes the issue on both platforms. From the documentation
OpenBSD shouldn't be an issue. But if so it should be trivial to fix.
Collin
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 0:57 ` brian m. carlson
2025-05-03 1:05 ` Junio C Hamano
2025-05-03 3:48 ` Collin Funk
@ 2025-05-03 13:31 ` Jeff King
2025-05-03 14:58 ` shejialuo
` (2 more replies)
2 siblings, 3 replies; 25+ messages in thread
From: Jeff King @ 2025-05-03 13:31 UTC (permalink / raw)
To: brian m. carlson; +Cc: Collin Funk, git, shejialuo, Junio C Hamano
On Sat, May 03, 2025 at 12:57:11AM +0000, brian m. carlson wrote:
> > diff --git a/wrapper.c b/wrapper.c
> > index 3c79778055..4d448d7c57 100644
> > --- a/wrapper.c
> > +++ b/wrapper.c
> > @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename)
> > int open_nofollow(const char *path, int flags)
> > {
> > #ifdef O_NOFOLLOW
> > - return open(path, flags | O_NOFOLLOW);
> > + int ret = open(path, flags | O_NOFOLLOW);
> > +#ifdef __NetBSD__
> > + /*
> > + * NetBSD sets errno to EFTYPE when path is a symlink. The only other
> > + * time this errno occurs when O_REGULAR is used. Since we don't use
> > + * it anywhere we can avoid an lstat here.
> > + */
> > + if (ret < 0 && errno == EFTYPE) {
> > + errno = ELOOP;
> > + return -1;
> > + }
> > +#endif
> > + return ret;
>
> This patch seems reasonable and correct. I don't use NetBSD, but I do
> often test there, and I'm aware of this infelicity. I'm surprised we
> haven't hit it before.
>
> I suspect we'll also hit this on FreeBSD, which has a similar issue in
> that it returns `EMLINK` instead of `ELOOP`. I do wish these two OSes
> would provide an appropriate POSIX-compatible `open` call when set with
> `_POSIX_SOURCE`, since this is one of the biggest portability problems
> with them.
The inconsistency in errno has been there since open_nofollow() was
added years ago. But we didn't notice it because in general we try not
to be too particular about which errno value we receive.
That changed in cfea2f2da8 (packed-backend: check whether the
"packed-refs" is regular file, 2025-02-28), which uses open_nofollow()
to check for symlinks while we open it. But it feels like it would be
more direct to just lstat() the file in the first place (which we end up
doing anyway to check for other things besides symlinks!).
I.e., I'd think this would just work everywhere:
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 3ad1ed0787..a247220df9 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -2071,35 +2071,32 @@ static int packed_fsck(struct ref_store *ref_store,
if (o->verbose)
fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
- fd = open_nofollow(refs->path, O_RDONLY);
- if (fd < 0) {
+ if (lstat(refs->path, &st) < 0) {
/*
* If the packed-refs file doesn't exist, there's nothing
* to check.
*/
if (errno == ENOENT)
goto cleanup;
- if (errno == ELOOP) {
- struct fsck_ref_report report = { 0 };
- report.path = "packed-refs";
- ret = fsck_report_ref(o, &report,
- FSCK_MSG_BAD_REF_FILETYPE,
- "not a regular file but a symlink");
- goto cleanup;
- }
-
- ret = error_errno(_("unable to open '%s'"), refs->path);
- goto cleanup;
- } else if (fstat(fd, &st) < 0) {
ret = error_errno(_("unable to stat '%s'"), refs->path);
goto cleanup;
- } else if (!S_ISREG(st.st_mode)) {
+ }
+
+ if (!S_ISREG(st.st_mode)) {
struct fsck_ref_report report = { 0 };
report.path = "packed-refs";
ret = fsck_report_ref(o, &report,
FSCK_MSG_BAD_REF_FILETYPE,
"not a regular file");
+ /* XXX optionally could keep going here and actually
+ * check the contents we do find */
+ goto cleanup;
+ }
+
+ fd = open(refs->path, O_RDONLY);
+ if (fd < 0) {
+ ret = error_errno(_("unable to open '%s'"), refs->path);
goto cleanup;
}
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 9d1dc2144c..34d54a7c05 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -632,7 +632,7 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
ln -sf packed-refs-back .git/packed-refs &&
test_must_fail git refs verify 2>err &&
cat >expect <<-EOF &&
- error: packed-refs: badRefFiletype: not a regular file but a symlink
+ error: packed-refs: badRefFiletype: not a regular file
EOF
rm .git/packed-refs &&
test_cmp expect err &&
It's not as "atomic" as open_nofollow() and fstat(), but I don't think
we care about that for fsck. This is about consistency checking, not
trying to beat races against active adversaries (not to mention that our
open_nofollow() is best-effort anyway, and may be racy).
I dunno. I don't mind making errno returns more consistent to prevent a
future foot-gun, but I think as a general rule we may be better off not
looking too hard at errno for exotic conditions.
-Peff
PS I notice that this same function reads the whole packed-refs file
into a strbuf. That may be a problem, as they can grow pretty big in
extreme cases (e.g., GitHub's fork networks easily got into the
gigabytes, as it was every ref of every fork). We usually mmap it.
Not related to this discussion, but just something I noticed while
reading the function.
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 13:31 ` Jeff King
@ 2025-05-03 14:58 ` shejialuo
2025-05-03 15:49 ` Jeff King
2025-05-03 18:56 ` Collin Funk
2025-05-05 15:43 ` Junio C Hamano
2 siblings, 1 reply; 25+ messages in thread
From: shejialuo @ 2025-05-03 14:58 UTC (permalink / raw)
To: Jeff King; +Cc: brian m. carlson, Collin Funk, git, Junio C Hamano
On Sat, May 03, 2025 at 09:31:58AM -0400, Jeff King wrote:
> On Sat, May 03, 2025 at 12:57:11AM +0000, brian m. carlson wrote:
>
> > > diff --git a/wrapper.c b/wrapper.c
> > > index 3c79778055..4d448d7c57 100644
> > > --- a/wrapper.c
> > > +++ b/wrapper.c
> > > @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename)
> > > int open_nofollow(const char *path, int flags)
> > > {
> > > #ifdef O_NOFOLLOW
> > > - return open(path, flags | O_NOFOLLOW);
> > > + int ret = open(path, flags | O_NOFOLLOW);
> > > +#ifdef __NetBSD__
> > > + /*
> > > + * NetBSD sets errno to EFTYPE when path is a symlink. The only other
> > > + * time this errno occurs when O_REGULAR is used. Since we don't use
> > > + * it anywhere we can avoid an lstat here.
> > > + */
> > > + if (ret < 0 && errno == EFTYPE) {
> > > + errno = ELOOP;
> > > + return -1;
> > > + }
> > > +#endif
> > > + return ret;
> >
> > This patch seems reasonable and correct. I don't use NetBSD, but I do
> > often test there, and I'm aware of this infelicity. I'm surprised we
> > haven't hit it before.
> >
> > I suspect we'll also hit this on FreeBSD, which has a similar issue in
> > that it returns `EMLINK` instead of `ELOOP`. I do wish these two OSes
> > would provide an appropriate POSIX-compatible `open` call when set with
> > `_POSIX_SOURCE`, since this is one of the biggest portability problems
> > with them.
>
> The inconsistency in errno has been there since open_nofollow() was
> added years ago. But we didn't notice it because in general we try not
> to be too particular about which errno value we receive.
>
> That changed in cfea2f2da8 (packed-backend: check whether the
> "packed-refs" is regular file, 2025-02-28), which uses open_nofollow()
> to check for symlinks while we open it. But it feels like it would be
> more direct to just lstat() the file in the first place (which we end up
> doing anyway to check for other things besides symlinks!).
>
> I.e., I'd think this would just work everywhere:
>
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 3ad1ed0787..a247220df9 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -2071,35 +2071,32 @@ static int packed_fsck(struct ref_store *ref_store,
> if (o->verbose)
> fprintf_ln(stderr, "Checking packed-refs file %s", refs->path);
>
> - fd = open_nofollow(refs->path, O_RDONLY);
> - if (fd < 0) {
> + if (lstat(refs->path, &st) < 0) {
> /*
> * If the packed-refs file doesn't exist, there's nothing
> * to check.
> */
> if (errno == ENOENT)
> goto cleanup;
>
> - if (errno == ELOOP) {
> - struct fsck_ref_report report = { 0 };
> - report.path = "packed-refs";
> - ret = fsck_report_ref(o, &report,
> - FSCK_MSG_BAD_REF_FILETYPE,
> - "not a regular file but a symlink");
> - goto cleanup;
> - }
> -
> - ret = error_errno(_("unable to open '%s'"), refs->path);
> - goto cleanup;
> - } else if (fstat(fd, &st) < 0) {
> ret = error_errno(_("unable to stat '%s'"), refs->path);
> goto cleanup;
> - } else if (!S_ISREG(st.st_mode)) {
> + }
> +
> + if (!S_ISREG(st.st_mode)) {
> struct fsck_ref_report report = { 0 };
> report.path = "packed-refs";
> ret = fsck_report_ref(o, &report,
> FSCK_MSG_BAD_REF_FILETYPE,
> "not a regular file");
> + /* XXX optionally could keep going here and actually
> + * check the contents we do find */
> + goto cleanup;
> + }
> +
> + fd = open(refs->path, O_RDONLY);
> + if (fd < 0) {
> + ret = error_errno(_("unable to open '%s'"), refs->path);
> goto cleanup;
> }
>
> diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
> index 9d1dc2144c..34d54a7c05 100755
> --- a/t/t0602-reffiles-fsck.sh
> +++ b/t/t0602-reffiles-fsck.sh
> @@ -632,7 +632,7 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' '
> ln -sf packed-refs-back .git/packed-refs &&
> test_must_fail git refs verify 2>err &&
> cat >expect <<-EOF &&
> - error: packed-refs: badRefFiletype: not a regular file but a symlink
> + error: packed-refs: badRefFiletype: not a regular file
> EOF
> rm .git/packed-refs &&
> test_cmp expect err &&
>
> It's not as "atomic" as open_nofollow() and fstat(), but I don't think
> we care about that for fsck. This is about consistency checking, not
> trying to beat races against active adversaries (not to mention that our
> open_nofollow() is best-effort anyway, and may be racy).
>
The motivation why I use "open_nofollow" is that we want to avoid race
conditions as many as possible. However, as you have said, our
"open_nofollow" function still has the race. And it would be a little
cumbersome to make "open_nofollow" compatible. So, I rather prefer that
we simply use this way to solve the problem.
> I dunno. I don't mind making errno returns more consistent to prevent a
> future foot-gun, but I think as a general rule we may be better off not
> looking too hard at errno for exotic conditions.
>
I don't mind either.
> -Peff
>
> PS I notice that this same function reads the whole packed-refs file
> into a strbuf. That may be a problem, as they can grow pretty big in
> extreme cases (e.g., GitHub's fork networks easily got into the
> gigabytes, as it was every ref of every fork). We usually mmap it.
> Not related to this discussion, but just something I noticed while
> reading the function.
Peff, thanks for notifying me. I want to know more background.
Initially, the reason why I don't use `mmap` is that when checking the
ref consistency, we usually don't need to share the "packed-refs"
content for multiple processes via `mmap`.
I don't know how Github executes "git fsck" for the forked repositories.
Is there any regular tasks for "git fsck"? And would "packed-refs" file
be shared for all these repositories?
If above is the case, I agree that we should reuse the logic of
"load_contents" to enhance. But I don't know whether we need to do this
in the first place.
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk
@ 2025-05-03 15:45 ` brian m. carlson
2025-05-03 18:44 ` Collin Funk
2025-05-05 6:43 ` Patrick Steinhardt
2025-05-06 1:08 ` [PATCH v3] " Collin Funk
2 siblings, 1 reply; 25+ messages in thread
From: brian m. carlson @ 2025-05-03 15:45 UTC (permalink / raw)
To: Collin Funk; +Cc: git, shejialuo, Jeff King, Junio C Hamano
[-- Attachment #1: Type: text/plain, Size: 1378 bytes --]
On 2025-05-03 at 04:16:51, Collin Funk wrote:
> As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a
> symlink returns -1 and sets errno to EFTYPE which differs from POSIX.
>
> This patch fixes the following test failure:
>
> $ sh t0602-reffiles-fsck.sh --verbose
> --- expect 2025-05-02 23:05:23.920890147 +0000
> +++ err 2025-05-02 23:05:23.916794959 +0000
> @@ -1 +1 @@
> -error: packed-refs: badRefFiletype: not a regular file but a symlink
> +error: unable to open '.git/packed-refs': Inappropriate file type or format
> not ok 12 - the filetype of packed-refs should be checked
>
> FreeBSD has the same issue for EMLINK instead of EFTYPE.
>
> This portability issue was introduced in cfea2f2da8 (packed-backend:
> check whether the "packed-refs" is regular file, 2025-02-28)
Yup, this looks good. Thanks again for the patch.
I'll just add one resource for people who might like to look into these
kinds of things more. https://man.freebsd.org/cgi/man.cgi is the
FreeBSD man page viewer, which lets you view manual pages from the BSDs,
Linux, and some proprietary Unix systems. It can be quite helpful for
finding and fixing portability issues like this or just seeing what
command-line options or arguments a certain Unix supports.
--
brian m. carlson (they/them)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 325 bytes --]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 14:58 ` shejialuo
@ 2025-05-03 15:49 ` Jeff King
2025-05-05 6:39 ` Patrick Steinhardt
0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2025-05-03 15:49 UTC (permalink / raw)
To: shejialuo; +Cc: brian m. carlson, Collin Funk, git, Junio C Hamano
On Sat, May 03, 2025 at 10:58:58PM +0800, shejialuo wrote:
> > PS I notice that this same function reads the whole packed-refs file
> > into a strbuf. That may be a problem, as they can grow pretty big in
> > extreme cases (e.g., GitHub's fork networks easily got into the
> > gigabytes, as it was every ref of every fork). We usually mmap it.
> > Not related to this discussion, but just something I noticed while
> > reading the function.
>
> Peff, thanks for notifying me. I want to know more background.
> Initially, the reason why I don't use `mmap` is that when checking the
> ref consistency, we usually don't need to share the "packed-refs"
> content for multiple processes via `mmap`.
You're not sharing with other processes running fsck, but you'd be
sharing the memory with all of the other processes using that
packed-refs file for normal lookups.
But even if it's shared with nobody, reading it all into memory is
strictly worse than just mmap (since the data is getting copied into the
new allocation).
> I don't know how Github executes "git fsck" for the forked repositories.
> Is there any regular tasks for "git fsck"? And would "packed-refs" file
> be shared for all these repositories?
I don't know offhand how often GitHub runs fsck in an automated way
these days. Or even how big packed-refs files get, for that matter.
The specific case I'm thinking of for GitHub is that each fork network
has a master "network.git" repo that stores the objects for all of the
forks (which point to it via their objects/info/alternates files). That
network.git repo doesn't technically need to have all of the refs all
the time, but in practice it wants to know about them for reachability
during repacking, etc.
So it has something like "refs/remotes/<fork_id>/heads/master", and so
on, copying the whole refs/* namespace of each fork. If you look at,
say, torvalds/linux, the refs data for a single fork is probably ~30k or
so (based on looking at what's in a clone). And there are ~55k forks. So
that's around 1.5G. Not a deal-breaker to allocate (keeping in mind they
have pretty beefy systems), but enough that mmap is probably better.
I'm also sure that's not the worst case. It has a lot of forks but the
ref namespace is not that huge compared to some other projects (and it's
the product of the two that is the problem).
> If above is the case, I agree that we should reuse the logic of
> "load_contents" to enhance. But I don't know whether we need to do this
> in the first place.
I think you can skip the stat validity bits. In theory you can also skip
the mmap_strategy stuff, but I guess it might mean that "fsck" could
block other writers on Windows temporarily (though we wouldn't plan to
hold it open long, the way the normal reader does).
The other gotcha is that the result won't be NUL-terminated, but it
looks like the helper functions already take an "eof" pointer to avoid
looking past the end of what was read.
-Peff
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 4:21 ` Collin Funk
@ 2025-05-03 17:32 ` Junio C Hamano
0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2025-05-03 17:32 UTC (permalink / raw)
To: Collin Funk; +Cc: brian m. carlson, git, shejialuo, Jeff King
Collin Funk <collin.funk1@gmail.com> writes:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Thanks, both. Will queue after fixing the proposed log message a
>> bit (the sample must be indented, especially when it contains lines
>> that look like a patch).
>
> Yes, that looks better. Thanks!
>
>>> I suspect we'll also hit this on FreeBSD, which has a similar issue in
>>> that it returns `EMLINK` instead of `ELOOP`.
>>
>> I won't expect Collin or you to redo this patch to cover FreeBSD;
>> anybody with FreeBSD box/vm can do a separate patch on a different
>> day.
>
> It is no problem. I have access to a FreeBSD 14.2 machine. I can confirm
> that it fails with EMLINK.
Excellent.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-03 15:45 ` brian m. carlson
@ 2025-05-03 18:44 ` Collin Funk
0 siblings, 0 replies; 25+ messages in thread
From: Collin Funk @ 2025-05-03 18:44 UTC (permalink / raw)
To: brian m. carlson; +Cc: git, shejialuo, Jeff King, Junio C Hamano
"brian m. carlson" <sandals@crustytoothpaste.net> writes:
> I'll just add one resource for people who might like to look into these
> kinds of things more. https://man.freebsd.org/cgi/man.cgi is the
> FreeBSD man page viewer, which lets you view manual pages from the BSDs,
> Linux, and some proprietary Unix systems. It can be quite helpful for
> finding and fixing portability issues like this or just seeing what
> command-line options or arguments a certain Unix supports.
Gnulib documents most portability quirks too [1]. For example, it had the
FreeBSD EMLINK and NetBSD EFTYPE with 'open("symlink", O_NOFOLLOW ...)
documented, but for some very old versions released around 2014. Now
that I have confirmed it still exists from this git test I have updated
the documentation there [2].
Thanks,
Collin
[1] https://www.gnu.org/software/gnulib/manual/
[2] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commit;h=c0c646e29fbda0a6eadd6012d8ed1eb33b6c3968
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 13:31 ` Jeff King
2025-05-03 14:58 ` shejialuo
@ 2025-05-03 18:56 ` Collin Funk
2025-05-05 15:43 ` Junio C Hamano
2 siblings, 0 replies; 25+ messages in thread
From: Collin Funk @ 2025-05-03 18:56 UTC (permalink / raw)
To: Jeff King; +Cc: brian m. carlson, git, shejialuo, Junio C Hamano
Jeff King <peff@peff.net> writes:
> I dunno. I don't mind making errno returns more consistent to prevent a
> future foot-gun, but I think as a general rule we may be better off not
> looking too hard at errno for exotic conditions.
I generally agree. But in this case FreeBSD only sets errno to EMLINK in
this specific case and the only other case NetBSD sets errno to EFTYPE
is when the O_REGULAR flag is used and the path is not a regular file.
Using 'grep -r O_REGULAR' confirms it is never used in git, and since it
is a NetBSD extension I doubt it will ever be used. So not an exotic
case, in my opinion.
Thanks,
Collin
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 15:49 ` Jeff King
@ 2025-05-05 6:39 ` Patrick Steinhardt
2025-05-05 12:17 ` shejialuo
0 siblings, 1 reply; 25+ messages in thread
From: Patrick Steinhardt @ 2025-05-05 6:39 UTC (permalink / raw)
To: Jeff King; +Cc: shejialuo, brian m. carlson, Collin Funk, git, Junio C Hamano
On Sat, May 03, 2025 at 11:49:28AM -0400, Jeff King wrote:
> On Sat, May 03, 2025 at 10:58:58PM +0800, shejialuo wrote:
>
> > > PS I notice that this same function reads the whole packed-refs file
> > > into a strbuf. That may be a problem, as they can grow pretty big in
> > > extreme cases (e.g., GitHub's fork networks easily got into the
> > > gigabytes, as it was every ref of every fork). We usually mmap it.
> > > Not related to this discussion, but just something I noticed while
> > > reading the function.
> >
> > Peff, thanks for notifying me. I want to know more background.
> > Initially, the reason why I don't use `mmap` is that when checking the
> > ref consistency, we usually don't need to share the "packed-refs"
> > content for multiple processes via `mmap`.
>
> You're not sharing with other processes running fsck, but you'd be
> sharing the memory with all of the other processes using that
> packed-refs file for normal lookups.
>
> But even if it's shared with nobody, reading it all into memory is
> strictly worse than just mmap (since the data is getting copied into the
> new allocation).
>
> > I don't know how Github executes "git fsck" for the forked repositories.
> > Is there any regular tasks for "git fsck"? And would "packed-refs" file
> > be shared for all these repositories?
>
> I don't know offhand how often GitHub runs fsck in an automated way
> these days. Or even how big packed-refs files get, for that matter.
They typically are at most a couple of megabytes, but there certainly
are outliers. For as at GitLab.com, the vast majority (>99%) of such
files is less than 50MB and typically even less than 5MB.
> The specific case I'm thinking of for GitHub is that each fork network
> has a master "network.git" repo that stores the objects for all of the
> forks (which point to it via their objects/info/alternates files). That
> network.git repo doesn't technically need to have all of the refs all
> the time, but in practice it wants to know about them for reachability
> during repacking, etc.
>
> So it has something like "refs/remotes/<fork_id>/heads/master", and so
> on, copying the whole refs/* namespace of each fork. If you look at,
> say, torvalds/linux, the refs data for a single fork is probably ~30k or
> so (based on looking at what's in a clone). And there are ~55k forks. So
> that's around 1.5G. Not a deal-breaker to allocate (keeping in mind they
> have pretty beefy systems), but enough that mmap is probably better.
>
> I'm also sure that's not the worst case. It has a lot of forks but the
> ref namespace is not that huge compared to some other projects (and it's
> the product of the two that is the problem).
Yeah, the interesting case is always the outliers. One of the worst
offenders we have at GitLab.com is our own "gitlab-org/gitlab"
repository. This particular repository has a "packed-refs" file that is
around 2GB in size.
So I think refactoring this code to use `mmap()` would probably make
sense.
Patrick
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk
2025-05-03 15:45 ` brian m. carlson
@ 2025-05-05 6:43 ` Patrick Steinhardt
2025-05-05 20:41 ` Junio C Hamano
2025-05-06 1:08 ` [PATCH v3] " Collin Funk
2 siblings, 1 reply; 25+ messages in thread
From: Patrick Steinhardt @ 2025-05-05 6:43 UTC (permalink / raw)
To: Collin Funk; +Cc: git, shejialuo, sandals, Jeff King, Junio C Hamano
On Fri, May 02, 2025 at 09:16:51PM -0700, Collin Funk wrote:
> As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a
> symlink returns -1 and sets errno to EFTYPE which differs from POSIX.
>
> This patch fixes the following test failure:
>
> $ sh t0602-reffiles-fsck.sh --verbose
> --- expect 2025-05-02 23:05:23.920890147 +0000
> +++ err 2025-05-02 23:05:23.916794959 +0000
> @@ -1 +1 @@
> -error: packed-refs: badRefFiletype: not a regular file but a symlink
> +error: unable to open '.git/packed-refs': Inappropriate file type or format
> not ok 12 - the filetype of packed-refs should be checked
>
> FreeBSD has the same issue for EMLINK instead of EFTYPE.
>
> This portability issue was introduced in cfea2f2da8 (packed-backend:
> check whether the "packed-refs" is regular file, 2025-02-28)
Ok, makes sense.
> diff --git a/wrapper.c b/wrapper.c
> index 3c79778055..f74e3f7747 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -737,7 +737,26 @@ int is_empty_or_missing_file(const char *filename)
> int open_nofollow(const char *path, int flags)
> {
> #ifdef O_NOFOLLOW
> - return open(path, flags | O_NOFOLLOW);
> + int ret = open(path, flags | O_NOFOLLOW);
> + /*
> + * NetBSD sets errno to EFTYPE when path is a symlink. The only other
> + * time this errno occurs when O_REGULAR is used. Since we don't use
> + * it anywhere we can avoid an lstat here. FreeBSD does the same with
> + * EMLINK.
> + */
> +#ifdef __NetBSD__
> +#define SYMLINK_ERRNO EFTYPE
> +#elif defined(__FreeBSD__)
> +#define SYMLINK_ERRNO EMLINK
> +#endif
Nit, to make this a bit easier to read: our style guide says that nested
preprocessor directives should be indented by one spaces. So this would
become:
# ifdef __NetBSD__
# define SYMLINK_ERRNO EFTYPE
# elif defined(__FreeBSD__)
# define SYMLINK_ERRNO EMLINK
# endif
Note that the `ifdef` itself would also be indented because we already
have a surrounding `#ifdef O_NOFOLLOW`.
> +#if SYMLINK_ERRNO
> + if (ret < 0 && errno == SYMLINK_ERRNO) {
> + errno = ELOOP;
> + return -1;
> + }
> +#undef SYMLINK_ERRNO
> +#endif
These three preprocessor defines should be indented, as well.
Patrick
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-05 6:39 ` Patrick Steinhardt
@ 2025-05-05 12:17 ` shejialuo
0 siblings, 0 replies; 25+ messages in thread
From: shejialuo @ 2025-05-05 12:17 UTC (permalink / raw)
To: Patrick Steinhardt
Cc: Jeff King, brian m. carlson, Collin Funk, git, Junio C Hamano
On Mon, May 05, 2025 at 08:39:27AM +0200, Patrick Steinhardt wrote:
> On Sat, May 03, 2025 at 11:49:28AM -0400, Jeff King wrote:
> > On Sat, May 03, 2025 at 10:58:58PM +0800, shejialuo wrote:
> >
> > > > PS I notice that this same function reads the whole packed-refs file
> > > > into a strbuf. That may be a problem, as they can grow pretty big in
> > > > extreme cases (e.g., GitHub's fork networks easily got into the
> > > > gigabytes, as it was every ref of every fork). We usually mmap it.
> > > > Not related to this discussion, but just something I noticed while
> > > > reading the function.
> > >
> > > Peff, thanks for notifying me. I want to know more background.
> > > Initially, the reason why I don't use `mmap` is that when checking the
> > > ref consistency, we usually don't need to share the "packed-refs"
> > > content for multiple processes via `mmap`.
> >
> > You're not sharing with other processes running fsck, but you'd be
> > sharing the memory with all of the other processes using that
> > packed-refs file for normal lookups.
> >
> > But even if it's shared with nobody, reading it all into memory is
> > strictly worse than just mmap (since the data is getting copied into the
> > new allocation).
> >
> > > I don't know how Github executes "git fsck" for the forked repositories.
> > > Is there any regular tasks for "git fsck"? And would "packed-refs" file
> > > be shared for all these repositories?
> >
> > I don't know offhand how often GitHub runs fsck in an automated way
> > these days. Or even how big packed-refs files get, for that matter.
>
> They typically are at most a couple of megabytes, but there certainly
> are outliers. For as at GitLab.com, the vast majority (>99%) of such
> files is less than 50MB and typically even less than 5MB.
>
> > The specific case I'm thinking of for GitHub is that each fork network
> > has a master "network.git" repo that stores the objects for all of the
> > forks (which point to it via their objects/info/alternates files). That
> > network.git repo doesn't technically need to have all of the refs all
> > the time, but in practice it wants to know about them for reachability
> > during repacking, etc.
> >
> > So it has something like "refs/remotes/<fork_id>/heads/master", and so
> > on, copying the whole refs/* namespace of each fork. If you look at,
> > say, torvalds/linux, the refs data for a single fork is probably ~30k or
> > so (based on looking at what's in a clone). And there are ~55k forks. So
> > that's around 1.5G. Not a deal-breaker to allocate (keeping in mind they
> > have pretty beefy systems), but enough that mmap is probably better.
> >
> > I'm also sure that's not the worst case. It has a lot of forks but the
> > ref namespace is not that huge compared to some other projects (and it's
> > the product of the two that is the problem).
>
> Yeah, the interesting case is always the outliers. One of the worst
> offenders we have at GitLab.com is our own "gitlab-org/gitlab"
> repository. This particular repository has a "packed-refs" file that is
> around 2GB in size.
>
> So I think refactoring this code to use `mmap()` would probably make
> sense.
>
Thank Peff and Patrick for the information. I will send a patch later.
> Patrick
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-03 13:31 ` Jeff King
2025-05-03 14:58 ` shejialuo
2025-05-03 18:56 ` Collin Funk
@ 2025-05-05 15:43 ` Junio C Hamano
2025-05-05 18:03 ` Jeff King
2 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2025-05-05 15:43 UTC (permalink / raw)
To: Jeff King; +Cc: brian m. carlson, Collin Funk, git, shejialuo
Jeff King <peff@peff.net> writes:
> That changed in cfea2f2da8 (packed-backend: check whether the
> "packed-refs" is regular file, 2025-02-28), which uses open_nofollow()
> to check for symlinks while we open it. But it feels like it would be
> more direct to just lstat() the file in the first place (which we end up
> doing anyway to check for other things besides symlinks!).
> ...
> It's not as "atomic" as open_nofollow() and fstat(), but I don't think
> we care about that for fsck. This is about consistency checking, not
> trying to beat races against active adversaries (not to mention that our
> open_nofollow() is best-effort anyway, and may be racy).
True. Atomicity, which the use of open_nofollow() and fstat() tries
to achieve, may not matter in fsck. We can think of the use of
open_nofollow() in this particular codepath merely as a convenient
helper function, and I do not think we have any problem with such a
helper function.
But open_nofollow() and its emulation can be called from other
codepaths that may care about atomicity, and I am not sure what our
attitude towards atomicity requirements vs platform capabilities
should be.
If an atomicity (or any other) requirement in a particular codepath
has a simple and obvious way to solve on common platforms, but that
the mechanism to implement the simple and obvious way is unavailable
on other platforms, where does it lead us?
For some kind of requirements, we can treat it merely as a quality
of implementation issue, similar to how finalize_object_file()
ideally wants to do the create(TMP) then link(TMP->FINAL) then
unlink(TMP) dance (because we want to detect collisions when able)
but has fallback implementation to create(TMP) then
rename(TMP->FINAL) (which punts on collision detection) on platforms
where the preferred way does not work. It falls into this category,
I would think, to think of use of open_nofollow() in this codepath
as a mere helper function that makes the code in fsck shorter.
But for other kind of requirements, we want to fulfill them on all
platforms that we claim to support. Using open_nofollow() to
achieve hard atomicity requirement would be a bug in such a
situation. Should we somehow warn our developers against its use?
Idealists in us first try hard to find the right abstraction that
would work everywhere, and use compat/ layer to implement that
abstraction, but we of course are often not successful, and end up
with a series of #ifdef for pieces of platform-specific code in
fairly higher layer. It feels that open_nofollow() that is not
necesarily atomic is the latter but that is done at a level that is
a bit too low. I dunno.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-05 15:43 ` Junio C Hamano
@ 2025-05-05 18:03 ` Jeff King
2025-05-06 13:43 ` shejialuo
2025-05-06 22:58 ` Junio C Hamano
0 siblings, 2 replies; 25+ messages in thread
From: Jeff King @ 2025-05-05 18:03 UTC (permalink / raw)
To: Junio C Hamano; +Cc: brian m. carlson, Collin Funk, git, shejialuo
On Mon, May 05, 2025 at 08:43:18AM -0700, Junio C Hamano wrote:
> But for other kind of requirements, we want to fulfill them on all
> platforms that we claim to support. Using open_nofollow() to
> achieve hard atomicity requirement would be a bug in such a
> situation. Should we somehow warn our developers against its use?
The comment above the declaration says:
/*
* Open with O_NOFOLLOW, or equivalent. Note that the fallback equivalent
* may be racy. Do not use this as protection against an attacker who can
* simultaneously create paths.
*/
int open_nofollow(const char *path, int flags);
though that may not be enough. 00611d8440 (add open_nofollow() helper,
2021-02-16) discusses a way that it could be made less racy, at a
slightly increased cost.
IMHO that is somewhat orthogonal to the issue here, though, which is
purely about the case where O_NOFOLLOW does exist (ironically, our
racy fallback code consistently returns ELOOP ;) ).
The issue at hand is that particular errno responses are not always
portable. The patch discussed here improves that. My point was more that
I'm not sure to what degree we should care about errno consistency in
our wrappers (which is inherently a bit whack-a-mole as we find new
cases), versus trying not to care too hard about specific errno values
in calling code.
I can see arguments either way (and as I said, an argument for making
errno values consistent even if we try to rely on them less). Mostly I
was just a little surprised to see open_nofollow() being used in this
way (especially since we have to end up stat()-ing anyway to check for
other cases).
-Peff
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-05 6:43 ` Patrick Steinhardt
@ 2025-05-05 20:41 ` Junio C Hamano
2025-05-06 1:16 ` Collin Funk
0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2025-05-05 20:41 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Collin Funk, git, shejialuo, sandals, Jeff King
Patrick Steinhardt <ps@pks.im> writes:
>> +#ifdef __NetBSD__
>> +#define SYMLINK_ERRNO EFTYPE
>> +#elif defined(__FreeBSD__)
>> +#define SYMLINK_ERRNO EMLINK
>> +#endif
>
> Nit, to make this a bit easier to read: our style guide says that nested
> preprocessor directives should be indented by one spaces. So this would
> become:
>
> # ifdef __NetBSD__
> # define SYMLINK_ERRNO EFTYPE
> # elif defined(__FreeBSD__)
> # define SYMLINK_ERRNO EMLINK
> # endif
>
> Note that the `ifdef` itself would also be indented because we already
> have a surrounding `#ifdef O_NOFOLLOW`.
Hmph, it does look easier to read. I think we used to have some
outlier files that indented CPP directives by prefixing spaces in
front of the whole line, but these days we standardized to express
the indentation by inserting spaces immediately after '#' that
always sit at the beginning of line, so what you showed here is a
good example to mimic.
Thanks.
>
>> +#if SYMLINK_ERRNO
>> + if (ret < 0 && errno == SYMLINK_ERRNO) {
>> + errno = ELOOP;
>> + return -1;
>> + }
>> +#undef SYMLINK_ERRNO
>> +#endif
>
> These three preprocessor defines should be indented, as well.
>
> Patrick
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk
2025-05-03 15:45 ` brian m. carlson
2025-05-05 6:43 ` Patrick Steinhardt
@ 2025-05-06 1:08 ` Collin Funk
2025-05-06 13:24 ` Patrick Steinhardt
2 siblings, 1 reply; 25+ messages in thread
From: Collin Funk @ 2025-05-06 1:08 UTC (permalink / raw)
To: git
Cc: shejialuo, sandals, Jeff King, Junio C Hamano, Patrick Steinhardt,
Collin Funk
As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a
symlink returns -1 and sets errno to EFTYPE which differs from POSIX.
This patch fixes the following test failure:
$ sh t0602-reffiles-fsck.sh --verbose
--- expect 2025-05-02 23:05:23.920890147 +0000
+++ err 2025-05-02 23:05:23.916794959 +0000
@@ -1 +1 @@
-error: packed-refs: badRefFiletype: not a regular file but a symlink
+error: unable to open '.git/packed-refs': Inappropriate file type or format
not ok 12 - the filetype of packed-refs should be checked
FreeBSD has the same issue for EMLINK instead of EFTYPE.
This portability issue was introduced in cfea2f2da8 (packed-backend:
check whether the "packed-refs" is regular file, 2025-02-28)
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
---
wrapper.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/wrapper.c b/wrapper.c
index 8b98593149..38fce5327a 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -737,7 +737,26 @@ int is_empty_or_missing_file(const char *filename)
int open_nofollow(const char *path, int flags)
{
#ifdef O_NOFOLLOW
- return open(path, flags | O_NOFOLLOW);
+ int ret = open(path, flags | O_NOFOLLOW);
+ /*
+ * NetBSD sets errno to EFTYPE when path is a symlink. The only other
+ * time this errno occurs when O_REGULAR is used. Since we don't use
+ * it anywhere we can avoid an lstat here. FreeBSD does the same with
+ * EMLINK.
+ */
+# ifdef __NetBSD__
+# define SYMLINK_ERRNO EFTYPE
+# elif defined(__FreeBSD__)
+# define SYMLINK_ERRNO EMLINK
+# endif
+# if SYMLINK_ERRNO
+ if (ret < 0 && errno == SYMLINK_ERRNO) {
+ errno = ELOOP;
+ return -1;
+ }
+# undef SYMLINK_ERRNO
+# endif
+ return ret;
#else
struct stat st;
if (lstat(path, &st) < 0)
--
2.49.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-05 20:41 ` Junio C Hamano
@ 2025-05-06 1:16 ` Collin Funk
2025-05-06 13:23 ` Patrick Steinhardt
0 siblings, 1 reply; 25+ messages in thread
From: Collin Funk @ 2025-05-06 1:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git, shejialuo, sandals, Jeff King
Hi all,
Junio C Hamano <gitster@pobox.com> writes:
>> Nit, to make this a bit easier to read: our style guide says that nested
>> preprocessor directives should be indented by one spaces. So this would
>> become:
>>
>> # ifdef __NetBSD__
>> # define SYMLINK_ERRNO EFTYPE
>> # elif defined(__FreeBSD__)
>> # define SYMLINK_ERRNO EMLINK
>> # endif
>>
>> Note that the `ifdef` itself would also be indented because we already
>> have a surrounding `#ifdef O_NOFOLLOW`.
>
> Hmph, it does look easier to read. I think we used to have some
> outlier files that indented CPP directives by prefixing spaces in
> front of the whole line, but these days we standardized to express
> the indentation by inserting spaces immediately after '#' that
> always sit at the beginning of line, so what you showed here is a
> good example to mimic.
No problem, I sent V3 with the suggested changes. That is actually my
preferred why of indenting preprocessor directives. But I saw a mix if
CPP indenting, so I was unsure what was correct. I guess I could have
looked harder for a style guide, but at least hopefully I followed
'SubmittingPatches' mostly correct. :)
Collin
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-06 1:16 ` Collin Funk
@ 2025-05-06 13:23 ` Patrick Steinhardt
0 siblings, 0 replies; 25+ messages in thread
From: Patrick Steinhardt @ 2025-05-06 13:23 UTC (permalink / raw)
To: Collin Funk; +Cc: Junio C Hamano, git, shejialuo, sandals, Jeff King
On Mon, May 05, 2025 at 06:16:03PM -0700, Collin Funk wrote:
> Hi all,
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> >> Nit, to make this a bit easier to read: our style guide says that nested
> >> preprocessor directives should be indented by one spaces. So this would
> >> become:
> >>
> >> # ifdef __NetBSD__
> >> # define SYMLINK_ERRNO EFTYPE
> >> # elif defined(__FreeBSD__)
> >> # define SYMLINK_ERRNO EMLINK
> >> # endif
> >>
> >> Note that the `ifdef` itself would also be indented because we already
> >> have a surrounding `#ifdef O_NOFOLLOW`.
> >
> > Hmph, it does look easier to read. I think we used to have some
> > outlier files that indented CPP directives by prefixing spaces in
> > front of the whole line, but these days we standardized to express
> > the indentation by inserting spaces immediately after '#' that
> > always sit at the beginning of line, so what you showed here is a
> > good example to mimic.
>
> No problem, I sent V3 with the suggested changes. That is actually my
> preferred why of indenting preprocessor directives. But I saw a mix if
> CPP indenting, so I was unsure what was correct. I guess I could have
> looked harder for a style guide, but at least hopefully I followed
> 'SubmittingPatches' mostly correct. :)
Yeah, the rule was only introduced rather recently in 7df3f55b92e
(Documentation: clarify indentation style for C preprocessor directives,
2024-07-30), so we're still wildly inconsistent.
Patrick
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v3] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP
2025-05-06 1:08 ` [PATCH v3] " Collin Funk
@ 2025-05-06 13:24 ` Patrick Steinhardt
0 siblings, 0 replies; 25+ messages in thread
From: Patrick Steinhardt @ 2025-05-06 13:24 UTC (permalink / raw)
To: Collin Funk; +Cc: git, shejialuo, sandals, Jeff King, Junio C Hamano
On Mon, May 05, 2025 at 06:08:59PM -0700, Collin Funk wrote:
> As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a
> symlink returns -1 and sets errno to EFTYPE which differs from POSIX.
>
> This patch fixes the following test failure:
>
> $ sh t0602-reffiles-fsck.sh --verbose
> --- expect 2025-05-02 23:05:23.920890147 +0000
> +++ err 2025-05-02 23:05:23.916794959 +0000
> @@ -1 +1 @@
> -error: packed-refs: badRefFiletype: not a regular file but a symlink
> +error: unable to open '.git/packed-refs': Inappropriate file type or format
> not ok 12 - the filetype of packed-refs should be checked
>
> FreeBSD has the same issue for EMLINK instead of EFTYPE.
>
> This portability issue was introduced in cfea2f2da8 (packed-backend:
> check whether the "packed-refs" is regular file, 2025-02-28)
Thanks, this version addresses my nit.
Patrick
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-05 18:03 ` Jeff King
@ 2025-05-06 13:43 ` shejialuo
2025-05-06 22:58 ` Junio C Hamano
1 sibling, 0 replies; 25+ messages in thread
From: shejialuo @ 2025-05-06 13:43 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, brian m. carlson, Collin Funk, git
On Mon, May 05, 2025 at 02:03:11PM -0400, Jeff King wrote:
> On Mon, May 05, 2025 at 08:43:18AM -0700, Junio C Hamano wrote:
>
> > But for other kind of requirements, we want to fulfill them on all
> > platforms that we claim to support. Using open_nofollow() to
> > achieve hard atomicity requirement would be a bug in such a
> > situation. Should we somehow warn our developers against its use?
>
> The comment above the declaration says:
>
> /*
> * Open with O_NOFOLLOW, or equivalent. Note that the fallback equivalent
> * may be racy. Do not use this as protection against an attacker who can
> * simultaneously create paths.
> */
> int open_nofollow(const char *path, int flags);
>
> though that may not be enough. 00611d8440 (add open_nofollow() helper,
> 2021-02-16) discusses a way that it could be made less racy, at a
> slightly increased cost.
>
> IMHO that is somewhat orthogonal to the issue here, though, which is
> purely about the case where O_NOFOLLOW does exist (ironically, our
> racy fallback code consistently returns ELOOP ;) ).
>
> The issue at hand is that particular errno responses are not always
> portable. The patch discussed here improves that. My point was more that
> I'm not sure to what degree we should care about errno consistency in
> our wrappers (which is inherently a bit whack-a-mole as we find new
> cases), versus trying not to care too hard about specific errno values
> in calling code.
>
> I can see arguments either way (and as I said, an argument for making
> errno values consistent even if we try to rely on them less). Mostly I
> was just a little surprised to see open_nofollow() being used in this
> way (especially since we have to end up stat()-ing anyway to check for
> other cases).
>
IIRC, we wanted to try our best to make our code consistent. In the very
early implementation, I actually firstly checked the file type and then
opened the file.
However, there is a chance that the raw "packed-refs" file could be
converted to symlink between checking the filetype and opening the file
to get the fd. Although, in fsck, we may just ignore this. But during
the review, I found out that using "open_nofollow" could avoid race in
some platforms. Sadly, I haven't realized that this would break
compatibility ;)
Because using "open_nofollow" could only check whether the filetype is
the symlink, we also need to use "stat" again to check whether the
filetype is OK. I agree that it is a little redundant.
Since the patch from Collin would solve the problem. I won't change the
logic. I'll focus on using `mmap` to open the "packed-refs" file.
> -Peff
Thanks,
Jialuo
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD.
2025-05-05 18:03 ` Jeff King
2025-05-06 13:43 ` shejialuo
@ 2025-05-06 22:58 ` Junio C Hamano
1 sibling, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2025-05-06 22:58 UTC (permalink / raw)
To: Jeff King; +Cc: brian m. carlson, Collin Funk, git, shejialuo
Jeff King <peff@peff.net> writes:
> On Mon, May 05, 2025 at 08:43:18AM -0700, Junio C Hamano wrote:
>
>> But for other kind of requirements, we want to fulfill them on all
>> platforms that we claim to support. Using open_nofollow() to
>> achieve hard atomicity requirement would be a bug in such a
>> situation. Should we somehow warn our developers against its use?
>
> The comment above the declaration says:
>
> /*
> * Open with O_NOFOLLOW, or equivalent. Note that the fallback equivalent
> * may be racy. Do not use this as protection against an attacker who can
> * simultaneously create paths.
> */
> int open_nofollow(const char *path, int flags);
>
> though that may not be enough. 00611d8440 (add open_nofollow() helper,
> 2021-02-16) discusses a way that it could be made less racy, at a
> slightly increased cost.
>
> IMHO that is somewhat orthogonal to the issue here, though, which is
> purely about the case where O_NOFOLLOW does exist (ironically, our
> racy fallback code consistently returns ELOOP ;) ).
Yup. And with the above comment, I would say that my worries above
are unfounded.
> The issue at hand is that particular errno responses are not always
> portable. The patch discussed here improves that. My point was more that
> I'm not sure to what degree we should care about errno consistency in
> our wrappers (which is inherently a bit whack-a-mole as we find new
> cases), versus trying not to care too hard about specific errno values
> in calling code.
Yeah, it does give me a bad aftertaste having to pretend (by adding
compat code to translate as needed) that all systems share the same
set of errno, but we live in not-so-ideal world, so I am afraild
that it cannot be avoided.
> I can see arguments either way (and as I said, an argument for making
> errno values consistent even if we try to rely on them less). Mostly I
> was just a little surprised to see open_nofollow() being used in this
> way (especially since we have to end up stat()-ing anyway to check for
> other cases).
That is true.
The callers in attr.c and dir.c do want to fstat() after they open,
but they are more interested in atomicity (with "the best effort on
less capable systems" attitude). The one in mailmap.c doesn't do
any stat(), but again it is more about atomisity with the same
attitude, I think.
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2025-05-06 22:58 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-02 23:33 [PATCH] wrapper: Fix a errno discrepancy on NetBSD Collin Funk
2025-05-03 0:57 ` brian m. carlson
2025-05-03 1:05 ` Junio C Hamano
2025-05-03 4:21 ` Collin Funk
2025-05-03 17:32 ` Junio C Hamano
2025-05-03 3:48 ` Collin Funk
2025-05-03 13:31 ` Jeff King
2025-05-03 14:58 ` shejialuo
2025-05-03 15:49 ` Jeff King
2025-05-05 6:39 ` Patrick Steinhardt
2025-05-05 12:17 ` shejialuo
2025-05-03 18:56 ` Collin Funk
2025-05-05 15:43 ` Junio C Hamano
2025-05-05 18:03 ` Jeff King
2025-05-06 13:43 ` shejialuo
2025-05-06 22:58 ` Junio C Hamano
2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk
2025-05-03 15:45 ` brian m. carlson
2025-05-03 18:44 ` Collin Funk
2025-05-05 6:43 ` Patrick Steinhardt
2025-05-05 20:41 ` Junio C Hamano
2025-05-06 1:16 ` Collin Funk
2025-05-06 13:23 ` Patrick Steinhardt
2025-05-06 1:08 ` [PATCH v3] " Collin Funk
2025-05-06 13:24 ` Patrick Steinhardt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).