* [PATCH] wrapper: Fix a errno discrepancy on NetBSD. @ 2025-05-02 23:33 Collin Funk 2025-05-03 0:57 ` brian m. carlson 2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk 0 siblings, 2 replies; 25+ messages in thread From: Collin Funk @ 2025-05-02 23:33 UTC (permalink / raw) To: git; +Cc: shejialuo, Collin Funk, Jeff King, Junio C Hamano As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a symlink returns -1 and sets errno to EFTYPE which differs from POSIX. This patch fixes the following test failure: $ sh t0602-reffiles-fsck.sh --verbose --- expect 2025-05-02 23:05:23.920890147 +0000 +++ err 2025-05-02 23:05:23.916794959 +0000 @@ -1 +1 @@ -error: packed-refs: badRefFiletype: not a regular file but a symlink +error: unable to open '.git/packed-refs': Inappropriate file type or format not ok 12 - the filetype of packed-refs should be checked This portability issue was introduced in Commit cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28) Signed-off-by: Collin Funk <collin.funk1@gmail.com> --- wrapper.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/wrapper.c b/wrapper.c index 3c79778055..4d448d7c57 100644 --- a/wrapper.c +++ b/wrapper.c @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename) int open_nofollow(const char *path, int flags) { #ifdef O_NOFOLLOW - return open(path, flags | O_NOFOLLOW); + int ret = open(path, flags | O_NOFOLLOW); +#ifdef __NetBSD__ + /* + * NetBSD sets errno to EFTYPE when path is a symlink. The only other + * time this errno occurs when O_REGULAR is used. Since we don't use + * it anywhere we can avoid an lstat here. + */ + if (ret < 0 && errno == EFTYPE) { + errno = ELOOP; + return -1; + } +#endif + return ret; #else struct stat st; if (lstat(path, &st) < 0) -- 2.49.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-02 23:33 [PATCH] wrapper: Fix a errno discrepancy on NetBSD Collin Funk @ 2025-05-03 0:57 ` brian m. carlson 2025-05-03 1:05 ` Junio C Hamano ` (2 more replies) 2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk 1 sibling, 3 replies; 25+ messages in thread From: brian m. carlson @ 2025-05-03 0:57 UTC (permalink / raw) To: Collin Funk; +Cc: git, shejialuo, Jeff King, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 2154 bytes --] On 2025-05-02 at 23:33:32, Collin Funk wrote: > As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a > symlink returns -1 and sets errno to EFTYPE which differs from POSIX. > This patch fixes the following test failure: > > $ sh t0602-reffiles-fsck.sh --verbose > --- expect 2025-05-02 23:05:23.920890147 +0000 > +++ err 2025-05-02 23:05:23.916794959 +0000 > @@ -1 +1 @@ > -error: packed-refs: badRefFiletype: not a regular file but a symlink > +error: unable to open '.git/packed-refs': Inappropriate file type or format > not ok 12 - the filetype of packed-refs should be checked > > This portability issue was introduced in Commit > cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28) > > Signed-off-by: Collin Funk <collin.funk1@gmail.com> > --- > wrapper.c | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/wrapper.c b/wrapper.c > index 3c79778055..4d448d7c57 100644 > --- a/wrapper.c > +++ b/wrapper.c > @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename) > int open_nofollow(const char *path, int flags) > { > #ifdef O_NOFOLLOW > - return open(path, flags | O_NOFOLLOW); > + int ret = open(path, flags | O_NOFOLLOW); > +#ifdef __NetBSD__ > + /* > + * NetBSD sets errno to EFTYPE when path is a symlink. The only other > + * time this errno occurs when O_REGULAR is used. Since we don't use > + * it anywhere we can avoid an lstat here. > + */ > + if (ret < 0 && errno == EFTYPE) { > + errno = ELOOP; > + return -1; > + } > +#endif > + return ret; This patch seems reasonable and correct. I don't use NetBSD, but I do often test there, and I'm aware of this infelicity. I'm surprised we haven't hit it before. I suspect we'll also hit this on FreeBSD, which has a similar issue in that it returns `EMLINK` instead of `ELOOP`. I do wish these two OSes would provide an appropriate POSIX-compatible `open` call when set with `_POSIX_SOURCE`, since this is one of the biggest portability problems with them. -- brian m. carlson (they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 325 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 0:57 ` brian m. carlson @ 2025-05-03 1:05 ` Junio C Hamano 2025-05-03 4:21 ` Collin Funk 2025-05-03 3:48 ` Collin Funk 2025-05-03 13:31 ` Jeff King 2 siblings, 1 reply; 25+ messages in thread From: Junio C Hamano @ 2025-05-03 1:05 UTC (permalink / raw) To: brian m. carlson; +Cc: Collin Funk, git, shejialuo, Jeff King "brian m. carlson" <sandals@crustytoothpaste.net> writes: > On 2025-05-02 at 23:33:32, Collin Funk wrote: >> As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a >> symlink returns -1 and sets errno to EFTYPE which differs from POSIX. >> This patch fixes the following test failure: >> >> $ sh t0602-reffiles-fsck.sh --verbose >> --- expect 2025-05-02 23:05:23.920890147 +0000 >> +++ err 2025-05-02 23:05:23.916794959 +0000 >> @@ -1 +1 @@ >> -error: packed-refs: badRefFiletype: not a regular file but a symlink >> +error: unable to open '.git/packed-refs': Inappropriate file type or format >> not ok 12 - the filetype of packed-refs should be checked >> >> This portability issue was introduced in Commit >> cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28) >> >> Signed-off-by: Collin Funk <collin.funk1@gmail.com> >> --- >> wrapper.c | 14 +++++++++++++- >> 1 file changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/wrapper.c b/wrapper.c >> index 3c79778055..4d448d7c57 100644 >> --- a/wrapper.c >> +++ b/wrapper.c >> @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename) >> int open_nofollow(const char *path, int flags) >> { >> #ifdef O_NOFOLLOW >> - return open(path, flags | O_NOFOLLOW); >> + int ret = open(path, flags | O_NOFOLLOW); >> +#ifdef __NetBSD__ >> + /* >> + * NetBSD sets errno to EFTYPE when path is a symlink. The only other >> + * time this errno occurs when O_REGULAR is used. Since we don't use >> + * it anywhere we can avoid an lstat here. >> + */ >> + if (ret < 0 && errno == EFTYPE) { >> + errno = ELOOP; >> + return -1; >> + } >> +#endif >> + return ret; > > This patch seems reasonable and correct. I don't use NetBSD, but I do > often test there, and I'm aware of this infelicity. I'm surprised we > haven't hit it before. Thanks, both. Will queue after fixing the proposed log message a bit (the sample must be indented, especially when it contains lines that look like a patch). > I suspect we'll also hit this on FreeBSD, which has a similar issue in > that it returns `EMLINK` instead of `ELOOP`. I won't expect Collin or you to redo this patch to cover FreeBSD; anybody with FreeBSD box/vm can do a separate patch on a different day. > I do wish these two OSes > would provide an appropriate POSIX-compatible `open` call when set with > `_POSIX_SOURCE`, since this is one of the biggest portability problems > with them. That may be true, but not something we can fix here X-<. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 1:05 ` Junio C Hamano @ 2025-05-03 4:21 ` Collin Funk 2025-05-03 17:32 ` Junio C Hamano 0 siblings, 1 reply; 25+ messages in thread From: Collin Funk @ 2025-05-03 4:21 UTC (permalink / raw) To: Junio C Hamano; +Cc: brian m. carlson, git, shejialuo, Jeff King Junio C Hamano <gitster@pobox.com> writes: > Thanks, both. Will queue after fixing the proposed log message a > bit (the sample must be indented, especially when it contains lines > that look like a patch). Yes, that looks better. Thanks! >> I suspect we'll also hit this on FreeBSD, which has a similar issue in >> that it returns `EMLINK` instead of `ELOOP`. > > I won't expect Collin or you to redo this patch to cover FreeBSD; > anybody with FreeBSD box/vm can do a separate patch on a different > day. It is no problem. I have access to a FreeBSD 14.2 machine. I can confirm that it fails with EMLINK. My V2 patch fixes the issue on both platforms. From the documentation OpenBSD shouldn't be an issue. But if so it should be trivial to fix. Collin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 4:21 ` Collin Funk @ 2025-05-03 17:32 ` Junio C Hamano 0 siblings, 0 replies; 25+ messages in thread From: Junio C Hamano @ 2025-05-03 17:32 UTC (permalink / raw) To: Collin Funk; +Cc: brian m. carlson, git, shejialuo, Jeff King Collin Funk <collin.funk1@gmail.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> Thanks, both. Will queue after fixing the proposed log message a >> bit (the sample must be indented, especially when it contains lines >> that look like a patch). > > Yes, that looks better. Thanks! > >>> I suspect we'll also hit this on FreeBSD, which has a similar issue in >>> that it returns `EMLINK` instead of `ELOOP`. >> >> I won't expect Collin or you to redo this patch to cover FreeBSD; >> anybody with FreeBSD box/vm can do a separate patch on a different >> day. > > It is no problem. I have access to a FreeBSD 14.2 machine. I can confirm > that it fails with EMLINK. Excellent. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 0:57 ` brian m. carlson 2025-05-03 1:05 ` Junio C Hamano @ 2025-05-03 3:48 ` Collin Funk 2025-05-03 13:31 ` Jeff King 2 siblings, 0 replies; 25+ messages in thread From: Collin Funk @ 2025-05-03 3:48 UTC (permalink / raw) To: brian m. carlson; +Cc: git, shejialuo, Jeff King, Junio C Hamano Hi Brian, "brian m. carlson" <sandals@crustytoothpaste.net> writes: > I suspect we'll also hit this on FreeBSD, which has a similar issue in > that it returns `EMLINK` instead of `ELOOP`. Good memory. I can confirm that FreeBSD fails in the same place with a message for EMLINK. I'll write another patch for that. > I do wish these two OSes would provide an appropriate POSIX-compatible > `open` call when set with `_POSIX_SOURCE`, since this is one of the > biggest portability problems with them. It is documented in their man pages, so I assume it is intentional. But I don't see any benefit to differing from POSIX here. I'll see if I can find any discussion before I submit a bug report. Collin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 0:57 ` brian m. carlson 2025-05-03 1:05 ` Junio C Hamano 2025-05-03 3:48 ` Collin Funk @ 2025-05-03 13:31 ` Jeff King 2025-05-03 14:58 ` shejialuo ` (2 more replies) 2 siblings, 3 replies; 25+ messages in thread From: Jeff King @ 2025-05-03 13:31 UTC (permalink / raw) To: brian m. carlson; +Cc: Collin Funk, git, shejialuo, Junio C Hamano On Sat, May 03, 2025 at 12:57:11AM +0000, brian m. carlson wrote: > > diff --git a/wrapper.c b/wrapper.c > > index 3c79778055..4d448d7c57 100644 > > --- a/wrapper.c > > +++ b/wrapper.c > > @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename) > > int open_nofollow(const char *path, int flags) > > { > > #ifdef O_NOFOLLOW > > - return open(path, flags | O_NOFOLLOW); > > + int ret = open(path, flags | O_NOFOLLOW); > > +#ifdef __NetBSD__ > > + /* > > + * NetBSD sets errno to EFTYPE when path is a symlink. The only other > > + * time this errno occurs when O_REGULAR is used. Since we don't use > > + * it anywhere we can avoid an lstat here. > > + */ > > + if (ret < 0 && errno == EFTYPE) { > > + errno = ELOOP; > > + return -1; > > + } > > +#endif > > + return ret; > > This patch seems reasonable and correct. I don't use NetBSD, but I do > often test there, and I'm aware of this infelicity. I'm surprised we > haven't hit it before. > > I suspect we'll also hit this on FreeBSD, which has a similar issue in > that it returns `EMLINK` instead of `ELOOP`. I do wish these two OSes > would provide an appropriate POSIX-compatible `open` call when set with > `_POSIX_SOURCE`, since this is one of the biggest portability problems > with them. The inconsistency in errno has been there since open_nofollow() was added years ago. But we didn't notice it because in general we try not to be too particular about which errno value we receive. That changed in cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28), which uses open_nofollow() to check for symlinks while we open it. But it feels like it would be more direct to just lstat() the file in the first place (which we end up doing anyway to check for other things besides symlinks!). I.e., I'd think this would just work everywhere: diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 3ad1ed0787..a247220df9 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -2071,35 +2071,32 @@ static int packed_fsck(struct ref_store *ref_store, if (o->verbose) fprintf_ln(stderr, "Checking packed-refs file %s", refs->path); - fd = open_nofollow(refs->path, O_RDONLY); - if (fd < 0) { + if (lstat(refs->path, &st) < 0) { /* * If the packed-refs file doesn't exist, there's nothing * to check. */ if (errno == ENOENT) goto cleanup; - if (errno == ELOOP) { - struct fsck_ref_report report = { 0 }; - report.path = "packed-refs"; - ret = fsck_report_ref(o, &report, - FSCK_MSG_BAD_REF_FILETYPE, - "not a regular file but a symlink"); - goto cleanup; - } - - ret = error_errno(_("unable to open '%s'"), refs->path); - goto cleanup; - } else if (fstat(fd, &st) < 0) { ret = error_errno(_("unable to stat '%s'"), refs->path); goto cleanup; - } else if (!S_ISREG(st.st_mode)) { + } + + if (!S_ISREG(st.st_mode)) { struct fsck_ref_report report = { 0 }; report.path = "packed-refs"; ret = fsck_report_ref(o, &report, FSCK_MSG_BAD_REF_FILETYPE, "not a regular file"); + /* XXX optionally could keep going here and actually + * check the contents we do find */ + goto cleanup; + } + + fd = open(refs->path, O_RDONLY); + if (fd < 0) { + ret = error_errno(_("unable to open '%s'"), refs->path); goto cleanup; } diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh index 9d1dc2144c..34d54a7c05 100755 --- a/t/t0602-reffiles-fsck.sh +++ b/t/t0602-reffiles-fsck.sh @@ -632,7 +632,7 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' ln -sf packed-refs-back .git/packed-refs && test_must_fail git refs verify 2>err && cat >expect <<-EOF && - error: packed-refs: badRefFiletype: not a regular file but a symlink + error: packed-refs: badRefFiletype: not a regular file EOF rm .git/packed-refs && test_cmp expect err && It's not as "atomic" as open_nofollow() and fstat(), but I don't think we care about that for fsck. This is about consistency checking, not trying to beat races against active adversaries (not to mention that our open_nofollow() is best-effort anyway, and may be racy). I dunno. I don't mind making errno returns more consistent to prevent a future foot-gun, but I think as a general rule we may be better off not looking too hard at errno for exotic conditions. -Peff PS I notice that this same function reads the whole packed-refs file into a strbuf. That may be a problem, as they can grow pretty big in extreme cases (e.g., GitHub's fork networks easily got into the gigabytes, as it was every ref of every fork). We usually mmap it. Not related to this discussion, but just something I noticed while reading the function. ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 13:31 ` Jeff King @ 2025-05-03 14:58 ` shejialuo 2025-05-03 15:49 ` Jeff King 2025-05-03 18:56 ` Collin Funk 2025-05-05 15:43 ` Junio C Hamano 2 siblings, 1 reply; 25+ messages in thread From: shejialuo @ 2025-05-03 14:58 UTC (permalink / raw) To: Jeff King; +Cc: brian m. carlson, Collin Funk, git, Junio C Hamano On Sat, May 03, 2025 at 09:31:58AM -0400, Jeff King wrote: > On Sat, May 03, 2025 at 12:57:11AM +0000, brian m. carlson wrote: > > > > diff --git a/wrapper.c b/wrapper.c > > > index 3c79778055..4d448d7c57 100644 > > > --- a/wrapper.c > > > +++ b/wrapper.c > > > @@ -737,7 +737,19 @@ int is_empty_or_missing_file(const char *filename) > > > int open_nofollow(const char *path, int flags) > > > { > > > #ifdef O_NOFOLLOW > > > - return open(path, flags | O_NOFOLLOW); > > > + int ret = open(path, flags | O_NOFOLLOW); > > > +#ifdef __NetBSD__ > > > + /* > > > + * NetBSD sets errno to EFTYPE when path is a symlink. The only other > > > + * time this errno occurs when O_REGULAR is used. Since we don't use > > > + * it anywhere we can avoid an lstat here. > > > + */ > > > + if (ret < 0 && errno == EFTYPE) { > > > + errno = ELOOP; > > > + return -1; > > > + } > > > +#endif > > > + return ret; > > > > This patch seems reasonable and correct. I don't use NetBSD, but I do > > often test there, and I'm aware of this infelicity. I'm surprised we > > haven't hit it before. > > > > I suspect we'll also hit this on FreeBSD, which has a similar issue in > > that it returns `EMLINK` instead of `ELOOP`. I do wish these two OSes > > would provide an appropriate POSIX-compatible `open` call when set with > > `_POSIX_SOURCE`, since this is one of the biggest portability problems > > with them. > > The inconsistency in errno has been there since open_nofollow() was > added years ago. But we didn't notice it because in general we try not > to be too particular about which errno value we receive. > > That changed in cfea2f2da8 (packed-backend: check whether the > "packed-refs" is regular file, 2025-02-28), which uses open_nofollow() > to check for symlinks while we open it. But it feels like it would be > more direct to just lstat() the file in the first place (which we end up > doing anyway to check for other things besides symlinks!). > > I.e., I'd think this would just work everywhere: > > diff --git a/refs/packed-backend.c b/refs/packed-backend.c > index 3ad1ed0787..a247220df9 100644 > --- a/refs/packed-backend.c > +++ b/refs/packed-backend.c > @@ -2071,35 +2071,32 @@ static int packed_fsck(struct ref_store *ref_store, > if (o->verbose) > fprintf_ln(stderr, "Checking packed-refs file %s", refs->path); > > - fd = open_nofollow(refs->path, O_RDONLY); > - if (fd < 0) { > + if (lstat(refs->path, &st) < 0) { > /* > * If the packed-refs file doesn't exist, there's nothing > * to check. > */ > if (errno == ENOENT) > goto cleanup; > > - if (errno == ELOOP) { > - struct fsck_ref_report report = { 0 }; > - report.path = "packed-refs"; > - ret = fsck_report_ref(o, &report, > - FSCK_MSG_BAD_REF_FILETYPE, > - "not a regular file but a symlink"); > - goto cleanup; > - } > - > - ret = error_errno(_("unable to open '%s'"), refs->path); > - goto cleanup; > - } else if (fstat(fd, &st) < 0) { > ret = error_errno(_("unable to stat '%s'"), refs->path); > goto cleanup; > - } else if (!S_ISREG(st.st_mode)) { > + } > + > + if (!S_ISREG(st.st_mode)) { > struct fsck_ref_report report = { 0 }; > report.path = "packed-refs"; > ret = fsck_report_ref(o, &report, > FSCK_MSG_BAD_REF_FILETYPE, > "not a regular file"); > + /* XXX optionally could keep going here and actually > + * check the contents we do find */ > + goto cleanup; > + } > + > + fd = open(refs->path, O_RDONLY); > + if (fd < 0) { > + ret = error_errno(_("unable to open '%s'"), refs->path); > goto cleanup; > } > > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh > index 9d1dc2144c..34d54a7c05 100755 > --- a/t/t0602-reffiles-fsck.sh > +++ b/t/t0602-reffiles-fsck.sh > @@ -632,7 +632,7 @@ test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' > ln -sf packed-refs-back .git/packed-refs && > test_must_fail git refs verify 2>err && > cat >expect <<-EOF && > - error: packed-refs: badRefFiletype: not a regular file but a symlink > + error: packed-refs: badRefFiletype: not a regular file > EOF > rm .git/packed-refs && > test_cmp expect err && > > It's not as "atomic" as open_nofollow() and fstat(), but I don't think > we care about that for fsck. This is about consistency checking, not > trying to beat races against active adversaries (not to mention that our > open_nofollow() is best-effort anyway, and may be racy). > The motivation why I use "open_nofollow" is that we want to avoid race conditions as many as possible. However, as you have said, our "open_nofollow" function still has the race. And it would be a little cumbersome to make "open_nofollow" compatible. So, I rather prefer that we simply use this way to solve the problem. > I dunno. I don't mind making errno returns more consistent to prevent a > future foot-gun, but I think as a general rule we may be better off not > looking too hard at errno for exotic conditions. > I don't mind either. > -Peff > > PS I notice that this same function reads the whole packed-refs file > into a strbuf. That may be a problem, as they can grow pretty big in > extreme cases (e.g., GitHub's fork networks easily got into the > gigabytes, as it was every ref of every fork). We usually mmap it. > Not related to this discussion, but just something I noticed while > reading the function. Peff, thanks for notifying me. I want to know more background. Initially, the reason why I don't use `mmap` is that when checking the ref consistency, we usually don't need to share the "packed-refs" content for multiple processes via `mmap`. I don't know how Github executes "git fsck" for the forked repositories. Is there any regular tasks for "git fsck"? And would "packed-refs" file be shared for all these repositories? If above is the case, I agree that we should reuse the logic of "load_contents" to enhance. But I don't know whether we need to do this in the first place. Thanks, Jialuo ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 14:58 ` shejialuo @ 2025-05-03 15:49 ` Jeff King 2025-05-05 6:39 ` Patrick Steinhardt 0 siblings, 1 reply; 25+ messages in thread From: Jeff King @ 2025-05-03 15:49 UTC (permalink / raw) To: shejialuo; +Cc: brian m. carlson, Collin Funk, git, Junio C Hamano On Sat, May 03, 2025 at 10:58:58PM +0800, shejialuo wrote: > > PS I notice that this same function reads the whole packed-refs file > > into a strbuf. That may be a problem, as they can grow pretty big in > > extreme cases (e.g., GitHub's fork networks easily got into the > > gigabytes, as it was every ref of every fork). We usually mmap it. > > Not related to this discussion, but just something I noticed while > > reading the function. > > Peff, thanks for notifying me. I want to know more background. > Initially, the reason why I don't use `mmap` is that when checking the > ref consistency, we usually don't need to share the "packed-refs" > content for multiple processes via `mmap`. You're not sharing with other processes running fsck, but you'd be sharing the memory with all of the other processes using that packed-refs file for normal lookups. But even if it's shared with nobody, reading it all into memory is strictly worse than just mmap (since the data is getting copied into the new allocation). > I don't know how Github executes "git fsck" for the forked repositories. > Is there any regular tasks for "git fsck"? And would "packed-refs" file > be shared for all these repositories? I don't know offhand how often GitHub runs fsck in an automated way these days. Or even how big packed-refs files get, for that matter. The specific case I'm thinking of for GitHub is that each fork network has a master "network.git" repo that stores the objects for all of the forks (which point to it via their objects/info/alternates files). That network.git repo doesn't technically need to have all of the refs all the time, but in practice it wants to know about them for reachability during repacking, etc. So it has something like "refs/remotes/<fork_id>/heads/master", and so on, copying the whole refs/* namespace of each fork. If you look at, say, torvalds/linux, the refs data for a single fork is probably ~30k or so (based on looking at what's in a clone). And there are ~55k forks. So that's around 1.5G. Not a deal-breaker to allocate (keeping in mind they have pretty beefy systems), but enough that mmap is probably better. I'm also sure that's not the worst case. It has a lot of forks but the ref namespace is not that huge compared to some other projects (and it's the product of the two that is the problem). > If above is the case, I agree that we should reuse the logic of > "load_contents" to enhance. But I don't know whether we need to do this > in the first place. I think you can skip the stat validity bits. In theory you can also skip the mmap_strategy stuff, but I guess it might mean that "fsck" could block other writers on Windows temporarily (though we wouldn't plan to hold it open long, the way the normal reader does). The other gotcha is that the result won't be NUL-terminated, but it looks like the helper functions already take an "eof" pointer to avoid looking past the end of what was read. -Peff ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 15:49 ` Jeff King @ 2025-05-05 6:39 ` Patrick Steinhardt 2025-05-05 12:17 ` shejialuo 0 siblings, 1 reply; 25+ messages in thread From: Patrick Steinhardt @ 2025-05-05 6:39 UTC (permalink / raw) To: Jeff King; +Cc: shejialuo, brian m. carlson, Collin Funk, git, Junio C Hamano On Sat, May 03, 2025 at 11:49:28AM -0400, Jeff King wrote: > On Sat, May 03, 2025 at 10:58:58PM +0800, shejialuo wrote: > > > > PS I notice that this same function reads the whole packed-refs file > > > into a strbuf. That may be a problem, as they can grow pretty big in > > > extreme cases (e.g., GitHub's fork networks easily got into the > > > gigabytes, as it was every ref of every fork). We usually mmap it. > > > Not related to this discussion, but just something I noticed while > > > reading the function. > > > > Peff, thanks for notifying me. I want to know more background. > > Initially, the reason why I don't use `mmap` is that when checking the > > ref consistency, we usually don't need to share the "packed-refs" > > content for multiple processes via `mmap`. > > You're not sharing with other processes running fsck, but you'd be > sharing the memory with all of the other processes using that > packed-refs file for normal lookups. > > But even if it's shared with nobody, reading it all into memory is > strictly worse than just mmap (since the data is getting copied into the > new allocation). > > > I don't know how Github executes "git fsck" for the forked repositories. > > Is there any regular tasks for "git fsck"? And would "packed-refs" file > > be shared for all these repositories? > > I don't know offhand how often GitHub runs fsck in an automated way > these days. Or even how big packed-refs files get, for that matter. They typically are at most a couple of megabytes, but there certainly are outliers. For as at GitLab.com, the vast majority (>99%) of such files is less than 50MB and typically even less than 5MB. > The specific case I'm thinking of for GitHub is that each fork network > has a master "network.git" repo that stores the objects for all of the > forks (which point to it via their objects/info/alternates files). That > network.git repo doesn't technically need to have all of the refs all > the time, but in practice it wants to know about them for reachability > during repacking, etc. > > So it has something like "refs/remotes/<fork_id>/heads/master", and so > on, copying the whole refs/* namespace of each fork. If you look at, > say, torvalds/linux, the refs data for a single fork is probably ~30k or > so (based on looking at what's in a clone). And there are ~55k forks. So > that's around 1.5G. Not a deal-breaker to allocate (keeping in mind they > have pretty beefy systems), but enough that mmap is probably better. > > I'm also sure that's not the worst case. It has a lot of forks but the > ref namespace is not that huge compared to some other projects (and it's > the product of the two that is the problem). Yeah, the interesting case is always the outliers. One of the worst offenders we have at GitLab.com is our own "gitlab-org/gitlab" repository. This particular repository has a "packed-refs" file that is around 2GB in size. So I think refactoring this code to use `mmap()` would probably make sense. Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-05 6:39 ` Patrick Steinhardt @ 2025-05-05 12:17 ` shejialuo 0 siblings, 0 replies; 25+ messages in thread From: shejialuo @ 2025-05-05 12:17 UTC (permalink / raw) To: Patrick Steinhardt Cc: Jeff King, brian m. carlson, Collin Funk, git, Junio C Hamano On Mon, May 05, 2025 at 08:39:27AM +0200, Patrick Steinhardt wrote: > On Sat, May 03, 2025 at 11:49:28AM -0400, Jeff King wrote: > > On Sat, May 03, 2025 at 10:58:58PM +0800, shejialuo wrote: > > > > > > PS I notice that this same function reads the whole packed-refs file > > > > into a strbuf. That may be a problem, as they can grow pretty big in > > > > extreme cases (e.g., GitHub's fork networks easily got into the > > > > gigabytes, as it was every ref of every fork). We usually mmap it. > > > > Not related to this discussion, but just something I noticed while > > > > reading the function. > > > > > > Peff, thanks for notifying me. I want to know more background. > > > Initially, the reason why I don't use `mmap` is that when checking the > > > ref consistency, we usually don't need to share the "packed-refs" > > > content for multiple processes via `mmap`. > > > > You're not sharing with other processes running fsck, but you'd be > > sharing the memory with all of the other processes using that > > packed-refs file for normal lookups. > > > > But even if it's shared with nobody, reading it all into memory is > > strictly worse than just mmap (since the data is getting copied into the > > new allocation). > > > > > I don't know how Github executes "git fsck" for the forked repositories. > > > Is there any regular tasks for "git fsck"? And would "packed-refs" file > > > be shared for all these repositories? > > > > I don't know offhand how often GitHub runs fsck in an automated way > > these days. Or even how big packed-refs files get, for that matter. > > They typically are at most a couple of megabytes, but there certainly > are outliers. For as at GitLab.com, the vast majority (>99%) of such > files is less than 50MB and typically even less than 5MB. > > > The specific case I'm thinking of for GitHub is that each fork network > > has a master "network.git" repo that stores the objects for all of the > > forks (which point to it via their objects/info/alternates files). That > > network.git repo doesn't technically need to have all of the refs all > > the time, but in practice it wants to know about them for reachability > > during repacking, etc. > > > > So it has something like "refs/remotes/<fork_id>/heads/master", and so > > on, copying the whole refs/* namespace of each fork. If you look at, > > say, torvalds/linux, the refs data for a single fork is probably ~30k or > > so (based on looking at what's in a clone). And there are ~55k forks. So > > that's around 1.5G. Not a deal-breaker to allocate (keeping in mind they > > have pretty beefy systems), but enough that mmap is probably better. > > > > I'm also sure that's not the worst case. It has a lot of forks but the > > ref namespace is not that huge compared to some other projects (and it's > > the product of the two that is the problem). > > Yeah, the interesting case is always the outliers. One of the worst > offenders we have at GitLab.com is our own "gitlab-org/gitlab" > repository. This particular repository has a "packed-refs" file that is > around 2GB in size. > > So I think refactoring this code to use `mmap()` would probably make > sense. > Thank Peff and Patrick for the information. I will send a patch later. > Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 13:31 ` Jeff King 2025-05-03 14:58 ` shejialuo @ 2025-05-03 18:56 ` Collin Funk 2025-05-05 15:43 ` Junio C Hamano 2 siblings, 0 replies; 25+ messages in thread From: Collin Funk @ 2025-05-03 18:56 UTC (permalink / raw) To: Jeff King; +Cc: brian m. carlson, git, shejialuo, Junio C Hamano Jeff King <peff@peff.net> writes: > I dunno. I don't mind making errno returns more consistent to prevent a > future foot-gun, but I think as a general rule we may be better off not > looking too hard at errno for exotic conditions. I generally agree. But in this case FreeBSD only sets errno to EMLINK in this specific case and the only other case NetBSD sets errno to EFTYPE is when the O_REGULAR flag is used and the path is not a regular file. Using 'grep -r O_REGULAR' confirms it is never used in git, and since it is a NetBSD extension I doubt it will ever be used. So not an exotic case, in my opinion. Thanks, Collin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-03 13:31 ` Jeff King 2025-05-03 14:58 ` shejialuo 2025-05-03 18:56 ` Collin Funk @ 2025-05-05 15:43 ` Junio C Hamano 2025-05-05 18:03 ` Jeff King 2 siblings, 1 reply; 25+ messages in thread From: Junio C Hamano @ 2025-05-05 15:43 UTC (permalink / raw) To: Jeff King; +Cc: brian m. carlson, Collin Funk, git, shejialuo Jeff King <peff@peff.net> writes: > That changed in cfea2f2da8 (packed-backend: check whether the > "packed-refs" is regular file, 2025-02-28), which uses open_nofollow() > to check for symlinks while we open it. But it feels like it would be > more direct to just lstat() the file in the first place (which we end up > doing anyway to check for other things besides symlinks!). > ... > It's not as "atomic" as open_nofollow() and fstat(), but I don't think > we care about that for fsck. This is about consistency checking, not > trying to beat races against active adversaries (not to mention that our > open_nofollow() is best-effort anyway, and may be racy). True. Atomicity, which the use of open_nofollow() and fstat() tries to achieve, may not matter in fsck. We can think of the use of open_nofollow() in this particular codepath merely as a convenient helper function, and I do not think we have any problem with such a helper function. But open_nofollow() and its emulation can be called from other codepaths that may care about atomicity, and I am not sure what our attitude towards atomicity requirements vs platform capabilities should be. If an atomicity (or any other) requirement in a particular codepath has a simple and obvious way to solve on common platforms, but that the mechanism to implement the simple and obvious way is unavailable on other platforms, where does it lead us? For some kind of requirements, we can treat it merely as a quality of implementation issue, similar to how finalize_object_file() ideally wants to do the create(TMP) then link(TMP->FINAL) then unlink(TMP) dance (because we want to detect collisions when able) but has fallback implementation to create(TMP) then rename(TMP->FINAL) (which punts on collision detection) on platforms where the preferred way does not work. It falls into this category, I would think, to think of use of open_nofollow() in this codepath as a mere helper function that makes the code in fsck shorter. But for other kind of requirements, we want to fulfill them on all platforms that we claim to support. Using open_nofollow() to achieve hard atomicity requirement would be a bug in such a situation. Should we somehow warn our developers against its use? Idealists in us first try hard to find the right abstraction that would work everywhere, and use compat/ layer to implement that abstraction, but we of course are often not successful, and end up with a series of #ifdef for pieces of platform-specific code in fairly higher layer. It feels that open_nofollow() that is not necesarily atomic is the latter but that is done at a level that is a bit too low. I dunno. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-05 15:43 ` Junio C Hamano @ 2025-05-05 18:03 ` Jeff King 2025-05-06 13:43 ` shejialuo 2025-05-06 22:58 ` Junio C Hamano 0 siblings, 2 replies; 25+ messages in thread From: Jeff King @ 2025-05-05 18:03 UTC (permalink / raw) To: Junio C Hamano; +Cc: brian m. carlson, Collin Funk, git, shejialuo On Mon, May 05, 2025 at 08:43:18AM -0700, Junio C Hamano wrote: > But for other kind of requirements, we want to fulfill them on all > platforms that we claim to support. Using open_nofollow() to > achieve hard atomicity requirement would be a bug in such a > situation. Should we somehow warn our developers against its use? The comment above the declaration says: /* * Open with O_NOFOLLOW, or equivalent. Note that the fallback equivalent * may be racy. Do not use this as protection against an attacker who can * simultaneously create paths. */ int open_nofollow(const char *path, int flags); though that may not be enough. 00611d8440 (add open_nofollow() helper, 2021-02-16) discusses a way that it could be made less racy, at a slightly increased cost. IMHO that is somewhat orthogonal to the issue here, though, which is purely about the case where O_NOFOLLOW does exist (ironically, our racy fallback code consistently returns ELOOP ;) ). The issue at hand is that particular errno responses are not always portable. The patch discussed here improves that. My point was more that I'm not sure to what degree we should care about errno consistency in our wrappers (which is inherently a bit whack-a-mole as we find new cases), versus trying not to care too hard about specific errno values in calling code. I can see arguments either way (and as I said, an argument for making errno values consistent even if we try to rely on them less). Mostly I was just a little surprised to see open_nofollow() being used in this way (especially since we have to end up stat()-ing anyway to check for other cases). -Peff ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-05 18:03 ` Jeff King @ 2025-05-06 13:43 ` shejialuo 2025-05-06 22:58 ` Junio C Hamano 1 sibling, 0 replies; 25+ messages in thread From: shejialuo @ 2025-05-06 13:43 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, brian m. carlson, Collin Funk, git On Mon, May 05, 2025 at 02:03:11PM -0400, Jeff King wrote: > On Mon, May 05, 2025 at 08:43:18AM -0700, Junio C Hamano wrote: > > > But for other kind of requirements, we want to fulfill them on all > > platforms that we claim to support. Using open_nofollow() to > > achieve hard atomicity requirement would be a bug in such a > > situation. Should we somehow warn our developers against its use? > > The comment above the declaration says: > > /* > * Open with O_NOFOLLOW, or equivalent. Note that the fallback equivalent > * may be racy. Do not use this as protection against an attacker who can > * simultaneously create paths. > */ > int open_nofollow(const char *path, int flags); > > though that may not be enough. 00611d8440 (add open_nofollow() helper, > 2021-02-16) discusses a way that it could be made less racy, at a > slightly increased cost. > > IMHO that is somewhat orthogonal to the issue here, though, which is > purely about the case where O_NOFOLLOW does exist (ironically, our > racy fallback code consistently returns ELOOP ;) ). > > The issue at hand is that particular errno responses are not always > portable. The patch discussed here improves that. My point was more that > I'm not sure to what degree we should care about errno consistency in > our wrappers (which is inherently a bit whack-a-mole as we find new > cases), versus trying not to care too hard about specific errno values > in calling code. > > I can see arguments either way (and as I said, an argument for making > errno values consistent even if we try to rely on them less). Mostly I > was just a little surprised to see open_nofollow() being used in this > way (especially since we have to end up stat()-ing anyway to check for > other cases). > IIRC, we wanted to try our best to make our code consistent. In the very early implementation, I actually firstly checked the file type and then opened the file. However, there is a chance that the raw "packed-refs" file could be converted to symlink between checking the filetype and opening the file to get the fd. Although, in fsck, we may just ignore this. But during the review, I found out that using "open_nofollow" could avoid race in some platforms. Sadly, I haven't realized that this would break compatibility ;) Because using "open_nofollow" could only check whether the filetype is the symlink, we also need to use "stat" again to check whether the filetype is OK. I agree that it is a little redundant. Since the patch from Collin would solve the problem. I won't change the logic. I'll focus on using `mmap` to open the "packed-refs" file. > -Peff Thanks, Jialuo ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] wrapper: Fix a errno discrepancy on NetBSD. 2025-05-05 18:03 ` Jeff King 2025-05-06 13:43 ` shejialuo @ 2025-05-06 22:58 ` Junio C Hamano 1 sibling, 0 replies; 25+ messages in thread From: Junio C Hamano @ 2025-05-06 22:58 UTC (permalink / raw) To: Jeff King; +Cc: brian m. carlson, Collin Funk, git, shejialuo Jeff King <peff@peff.net> writes: > On Mon, May 05, 2025 at 08:43:18AM -0700, Junio C Hamano wrote: > >> But for other kind of requirements, we want to fulfill them on all >> platforms that we claim to support. Using open_nofollow() to >> achieve hard atomicity requirement would be a bug in such a >> situation. Should we somehow warn our developers against its use? > > The comment above the declaration says: > > /* > * Open with O_NOFOLLOW, or equivalent. Note that the fallback equivalent > * may be racy. Do not use this as protection against an attacker who can > * simultaneously create paths. > */ > int open_nofollow(const char *path, int flags); > > though that may not be enough. 00611d8440 (add open_nofollow() helper, > 2021-02-16) discusses a way that it could be made less racy, at a > slightly increased cost. > > IMHO that is somewhat orthogonal to the issue here, though, which is > purely about the case where O_NOFOLLOW does exist (ironically, our > racy fallback code consistently returns ELOOP ;) ). Yup. And with the above comment, I would say that my worries above are unfounded. > The issue at hand is that particular errno responses are not always > portable. The patch discussed here improves that. My point was more that > I'm not sure to what degree we should care about errno consistency in > our wrappers (which is inherently a bit whack-a-mole as we find new > cases), versus trying not to care too hard about specific errno values > in calling code. Yeah, it does give me a bad aftertaste having to pretend (by adding compat code to translate as needed) that all systems share the same set of errno, but we live in not-so-ideal world, so I am afraild that it cannot be avoided. > I can see arguments either way (and as I said, an argument for making > errno values consistent even if we try to rely on them less). Mostly I > was just a little surprised to see open_nofollow() being used in this > way (especially since we have to end up stat()-ing anyway to check for > other cases). That is true. The callers in attr.c and dir.c do want to fstat() after they open, but they are more interested in atomicity (with "the best effort on less capable systems" attitude). The one in mailmap.c doesn't do any stat(), but again it is more about atomisity with the same attitude, I think. ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-02 23:33 [PATCH] wrapper: Fix a errno discrepancy on NetBSD Collin Funk 2025-05-03 0:57 ` brian m. carlson @ 2025-05-03 4:16 ` Collin Funk 2025-05-03 15:45 ` brian m. carlson ` (2 more replies) 1 sibling, 3 replies; 25+ messages in thread From: Collin Funk @ 2025-05-03 4:16 UTC (permalink / raw) To: git; +Cc: shejialuo, sandals, Collin Funk, Jeff King, Junio C Hamano As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a symlink returns -1 and sets errno to EFTYPE which differs from POSIX. This patch fixes the following test failure: $ sh t0602-reffiles-fsck.sh --verbose --- expect 2025-05-02 23:05:23.920890147 +0000 +++ err 2025-05-02 23:05:23.916794959 +0000 @@ -1 +1 @@ -error: packed-refs: badRefFiletype: not a regular file but a symlink +error: unable to open '.git/packed-refs': Inappropriate file type or format not ok 12 - the filetype of packed-refs should be checked FreeBSD has the same issue for EMLINK instead of EFTYPE. This portability issue was introduced in cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28) Signed-off-by: Collin Funk <collin.funk1@gmail.com> --- wrapper.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/wrapper.c b/wrapper.c index 3c79778055..f74e3f7747 100644 --- a/wrapper.c +++ b/wrapper.c @@ -737,7 +737,26 @@ int is_empty_or_missing_file(const char *filename) int open_nofollow(const char *path, int flags) { #ifdef O_NOFOLLOW - return open(path, flags | O_NOFOLLOW); + int ret = open(path, flags | O_NOFOLLOW); + /* + * NetBSD sets errno to EFTYPE when path is a symlink. The only other + * time this errno occurs when O_REGULAR is used. Since we don't use + * it anywhere we can avoid an lstat here. FreeBSD does the same with + * EMLINK. + */ +#ifdef __NetBSD__ +#define SYMLINK_ERRNO EFTYPE +#elif defined(__FreeBSD__) +#define SYMLINK_ERRNO EMLINK +#endif +#if SYMLINK_ERRNO + if (ret < 0 && errno == SYMLINK_ERRNO) { + errno = ELOOP; + return -1; + } +#undef SYMLINK_ERRNO +#endif + return ret; #else struct stat st; if (lstat(path, &st) < 0) -- 2.49.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk @ 2025-05-03 15:45 ` brian m. carlson 2025-05-03 18:44 ` Collin Funk 2025-05-05 6:43 ` Patrick Steinhardt 2025-05-06 1:08 ` [PATCH v3] " Collin Funk 2 siblings, 1 reply; 25+ messages in thread From: brian m. carlson @ 2025-05-03 15:45 UTC (permalink / raw) To: Collin Funk; +Cc: git, shejialuo, Jeff King, Junio C Hamano [-- Attachment #1: Type: text/plain, Size: 1378 bytes --] On 2025-05-03 at 04:16:51, Collin Funk wrote: > As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a > symlink returns -1 and sets errno to EFTYPE which differs from POSIX. > > This patch fixes the following test failure: > > $ sh t0602-reffiles-fsck.sh --verbose > --- expect 2025-05-02 23:05:23.920890147 +0000 > +++ err 2025-05-02 23:05:23.916794959 +0000 > @@ -1 +1 @@ > -error: packed-refs: badRefFiletype: not a regular file but a symlink > +error: unable to open '.git/packed-refs': Inappropriate file type or format > not ok 12 - the filetype of packed-refs should be checked > > FreeBSD has the same issue for EMLINK instead of EFTYPE. > > This portability issue was introduced in cfea2f2da8 (packed-backend: > check whether the "packed-refs" is regular file, 2025-02-28) Yup, this looks good. Thanks again for the patch. I'll just add one resource for people who might like to look into these kinds of things more. https://man.freebsd.org/cgi/man.cgi is the FreeBSD man page viewer, which lets you view manual pages from the BSDs, Linux, and some proprietary Unix systems. It can be quite helpful for finding and fixing portability issues like this or just seeing what command-line options or arguments a certain Unix supports. -- brian m. carlson (they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 325 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-03 15:45 ` brian m. carlson @ 2025-05-03 18:44 ` Collin Funk 0 siblings, 0 replies; 25+ messages in thread From: Collin Funk @ 2025-05-03 18:44 UTC (permalink / raw) To: brian m. carlson; +Cc: git, shejialuo, Jeff King, Junio C Hamano "brian m. carlson" <sandals@crustytoothpaste.net> writes: > I'll just add one resource for people who might like to look into these > kinds of things more. https://man.freebsd.org/cgi/man.cgi is the > FreeBSD man page viewer, which lets you view manual pages from the BSDs, > Linux, and some proprietary Unix systems. It can be quite helpful for > finding and fixing portability issues like this or just seeing what > command-line options or arguments a certain Unix supports. Gnulib documents most portability quirks too [1]. For example, it had the FreeBSD EMLINK and NetBSD EFTYPE with 'open("symlink", O_NOFOLLOW ...) documented, but for some very old versions released around 2014. Now that I have confirmed it still exists from this git test I have updated the documentation there [2]. Thanks, Collin [1] https://www.gnu.org/software/gnulib/manual/ [2] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commit;h=c0c646e29fbda0a6eadd6012d8ed1eb33b6c3968 ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk 2025-05-03 15:45 ` brian m. carlson @ 2025-05-05 6:43 ` Patrick Steinhardt 2025-05-05 20:41 ` Junio C Hamano 2025-05-06 1:08 ` [PATCH v3] " Collin Funk 2 siblings, 1 reply; 25+ messages in thread From: Patrick Steinhardt @ 2025-05-05 6:43 UTC (permalink / raw) To: Collin Funk; +Cc: git, shejialuo, sandals, Jeff King, Junio C Hamano On Fri, May 02, 2025 at 09:16:51PM -0700, Collin Funk wrote: > As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a > symlink returns -1 and sets errno to EFTYPE which differs from POSIX. > > This patch fixes the following test failure: > > $ sh t0602-reffiles-fsck.sh --verbose > --- expect 2025-05-02 23:05:23.920890147 +0000 > +++ err 2025-05-02 23:05:23.916794959 +0000 > @@ -1 +1 @@ > -error: packed-refs: badRefFiletype: not a regular file but a symlink > +error: unable to open '.git/packed-refs': Inappropriate file type or format > not ok 12 - the filetype of packed-refs should be checked > > FreeBSD has the same issue for EMLINK instead of EFTYPE. > > This portability issue was introduced in cfea2f2da8 (packed-backend: > check whether the "packed-refs" is regular file, 2025-02-28) Ok, makes sense. > diff --git a/wrapper.c b/wrapper.c > index 3c79778055..f74e3f7747 100644 > --- a/wrapper.c > +++ b/wrapper.c > @@ -737,7 +737,26 @@ int is_empty_or_missing_file(const char *filename) > int open_nofollow(const char *path, int flags) > { > #ifdef O_NOFOLLOW > - return open(path, flags | O_NOFOLLOW); > + int ret = open(path, flags | O_NOFOLLOW); > + /* > + * NetBSD sets errno to EFTYPE when path is a symlink. The only other > + * time this errno occurs when O_REGULAR is used. Since we don't use > + * it anywhere we can avoid an lstat here. FreeBSD does the same with > + * EMLINK. > + */ > +#ifdef __NetBSD__ > +#define SYMLINK_ERRNO EFTYPE > +#elif defined(__FreeBSD__) > +#define SYMLINK_ERRNO EMLINK > +#endif Nit, to make this a bit easier to read: our style guide says that nested preprocessor directives should be indented by one spaces. So this would become: # ifdef __NetBSD__ # define SYMLINK_ERRNO EFTYPE # elif defined(__FreeBSD__) # define SYMLINK_ERRNO EMLINK # endif Note that the `ifdef` itself would also be indented because we already have a surrounding `#ifdef O_NOFOLLOW`. > +#if SYMLINK_ERRNO > + if (ret < 0 && errno == SYMLINK_ERRNO) { > + errno = ELOOP; > + return -1; > + } > +#undef SYMLINK_ERRNO > +#endif These three preprocessor defines should be indented, as well. Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-05 6:43 ` Patrick Steinhardt @ 2025-05-05 20:41 ` Junio C Hamano 2025-05-06 1:16 ` Collin Funk 0 siblings, 1 reply; 25+ messages in thread From: Junio C Hamano @ 2025-05-05 20:41 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: Collin Funk, git, shejialuo, sandals, Jeff King Patrick Steinhardt <ps@pks.im> writes: >> +#ifdef __NetBSD__ >> +#define SYMLINK_ERRNO EFTYPE >> +#elif defined(__FreeBSD__) >> +#define SYMLINK_ERRNO EMLINK >> +#endif > > Nit, to make this a bit easier to read: our style guide says that nested > preprocessor directives should be indented by one spaces. So this would > become: > > # ifdef __NetBSD__ > # define SYMLINK_ERRNO EFTYPE > # elif defined(__FreeBSD__) > # define SYMLINK_ERRNO EMLINK > # endif > > Note that the `ifdef` itself would also be indented because we already > have a surrounding `#ifdef O_NOFOLLOW`. Hmph, it does look easier to read. I think we used to have some outlier files that indented CPP directives by prefixing spaces in front of the whole line, but these days we standardized to express the indentation by inserting spaces immediately after '#' that always sit at the beginning of line, so what you showed here is a good example to mimic. Thanks. > >> +#if SYMLINK_ERRNO >> + if (ret < 0 && errno == SYMLINK_ERRNO) { >> + errno = ELOOP; >> + return -1; >> + } >> +#undef SYMLINK_ERRNO >> +#endif > > These three preprocessor defines should be indented, as well. > > Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-05 20:41 ` Junio C Hamano @ 2025-05-06 1:16 ` Collin Funk 2025-05-06 13:23 ` Patrick Steinhardt 0 siblings, 1 reply; 25+ messages in thread From: Collin Funk @ 2025-05-06 1:16 UTC (permalink / raw) To: Junio C Hamano; +Cc: Patrick Steinhardt, git, shejialuo, sandals, Jeff King Hi all, Junio C Hamano <gitster@pobox.com> writes: >> Nit, to make this a bit easier to read: our style guide says that nested >> preprocessor directives should be indented by one spaces. So this would >> become: >> >> # ifdef __NetBSD__ >> # define SYMLINK_ERRNO EFTYPE >> # elif defined(__FreeBSD__) >> # define SYMLINK_ERRNO EMLINK >> # endif >> >> Note that the `ifdef` itself would also be indented because we already >> have a surrounding `#ifdef O_NOFOLLOW`. > > Hmph, it does look easier to read. I think we used to have some > outlier files that indented CPP directives by prefixing spaces in > front of the whole line, but these days we standardized to express > the indentation by inserting spaces immediately after '#' that > always sit at the beginning of line, so what you showed here is a > good example to mimic. No problem, I sent V3 with the suggested changes. That is actually my preferred why of indenting preprocessor directives. But I saw a mix if CPP indenting, so I was unsure what was correct. I guess I could have looked harder for a style guide, but at least hopefully I followed 'SubmittingPatches' mostly correct. :) Collin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-06 1:16 ` Collin Funk @ 2025-05-06 13:23 ` Patrick Steinhardt 0 siblings, 0 replies; 25+ messages in thread From: Patrick Steinhardt @ 2025-05-06 13:23 UTC (permalink / raw) To: Collin Funk; +Cc: Junio C Hamano, git, shejialuo, sandals, Jeff King On Mon, May 05, 2025 at 06:16:03PM -0700, Collin Funk wrote: > Hi all, > > Junio C Hamano <gitster@pobox.com> writes: > > >> Nit, to make this a bit easier to read: our style guide says that nested > >> preprocessor directives should be indented by one spaces. So this would > >> become: > >> > >> # ifdef __NetBSD__ > >> # define SYMLINK_ERRNO EFTYPE > >> # elif defined(__FreeBSD__) > >> # define SYMLINK_ERRNO EMLINK > >> # endif > >> > >> Note that the `ifdef` itself would also be indented because we already > >> have a surrounding `#ifdef O_NOFOLLOW`. > > > > Hmph, it does look easier to read. I think we used to have some > > outlier files that indented CPP directives by prefixing spaces in > > front of the whole line, but these days we standardized to express > > the indentation by inserting spaces immediately after '#' that > > always sit at the beginning of line, so what you showed here is a > > good example to mimic. > > No problem, I sent V3 with the suggested changes. That is actually my > preferred why of indenting preprocessor directives. But I saw a mix if > CPP indenting, so I was unsure what was correct. I guess I could have > looked harder for a style guide, but at least hopefully I followed > 'SubmittingPatches' mostly correct. :) Yeah, the rule was only introduced rather recently in 7df3f55b92e (Documentation: clarify indentation style for C preprocessor directives, 2024-07-30), so we're still wildly inconsistent. Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v3] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk 2025-05-03 15:45 ` brian m. carlson 2025-05-05 6:43 ` Patrick Steinhardt @ 2025-05-06 1:08 ` Collin Funk 2025-05-06 13:24 ` Patrick Steinhardt 2 siblings, 1 reply; 25+ messages in thread From: Collin Funk @ 2025-05-06 1:08 UTC (permalink / raw) To: git Cc: shejialuo, sandals, Jeff King, Junio C Hamano, Patrick Steinhardt, Collin Funk As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a symlink returns -1 and sets errno to EFTYPE which differs from POSIX. This patch fixes the following test failure: $ sh t0602-reffiles-fsck.sh --verbose --- expect 2025-05-02 23:05:23.920890147 +0000 +++ err 2025-05-02 23:05:23.916794959 +0000 @@ -1 +1 @@ -error: packed-refs: badRefFiletype: not a regular file but a symlink +error: unable to open '.git/packed-refs': Inappropriate file type or format not ok 12 - the filetype of packed-refs should be checked FreeBSD has the same issue for EMLINK instead of EFTYPE. This portability issue was introduced in cfea2f2da8 (packed-backend: check whether the "packed-refs" is regular file, 2025-02-28) Signed-off-by: Collin Funk <collin.funk1@gmail.com> --- wrapper.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/wrapper.c b/wrapper.c index 8b98593149..38fce5327a 100644 --- a/wrapper.c +++ b/wrapper.c @@ -737,7 +737,26 @@ int is_empty_or_missing_file(const char *filename) int open_nofollow(const char *path, int flags) { #ifdef O_NOFOLLOW - return open(path, flags | O_NOFOLLOW); + int ret = open(path, flags | O_NOFOLLOW); + /* + * NetBSD sets errno to EFTYPE when path is a symlink. The only other + * time this errno occurs when O_REGULAR is used. Since we don't use + * it anywhere we can avoid an lstat here. FreeBSD does the same with + * EMLINK. + */ +# ifdef __NetBSD__ +# define SYMLINK_ERRNO EFTYPE +# elif defined(__FreeBSD__) +# define SYMLINK_ERRNO EMLINK +# endif +# if SYMLINK_ERRNO + if (ret < 0 && errno == SYMLINK_ERRNO) { + errno = ELOOP; + return -1; + } +# undef SYMLINK_ERRNO +# endif + return ret; #else struct stat st; if (lstat(path, &st) < 0) -- 2.49.0 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v3] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP 2025-05-06 1:08 ` [PATCH v3] " Collin Funk @ 2025-05-06 13:24 ` Patrick Steinhardt 0 siblings, 0 replies; 25+ messages in thread From: Patrick Steinhardt @ 2025-05-06 13:24 UTC (permalink / raw) To: Collin Funk; +Cc: git, shejialuo, sandals, Jeff King, Junio C Hamano On Mon, May 05, 2025 at 06:08:59PM -0700, Collin Funk wrote: > As documented on NetBSD's man page, open with the O_NOFOLLOW flag and a > symlink returns -1 and sets errno to EFTYPE which differs from POSIX. > > This patch fixes the following test failure: > > $ sh t0602-reffiles-fsck.sh --verbose > --- expect 2025-05-02 23:05:23.920890147 +0000 > +++ err 2025-05-02 23:05:23.916794959 +0000 > @@ -1 +1 @@ > -error: packed-refs: badRefFiletype: not a regular file but a symlink > +error: unable to open '.git/packed-refs': Inappropriate file type or format > not ok 12 - the filetype of packed-refs should be checked > > FreeBSD has the same issue for EMLINK instead of EFTYPE. > > This portability issue was introduced in cfea2f2da8 (packed-backend: > check whether the "packed-refs" is regular file, 2025-02-28) Thanks, this version addresses my nit. Patrick ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2025-05-06 22:58 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-02 23:33 [PATCH] wrapper: Fix a errno discrepancy on NetBSD Collin Funk 2025-05-03 0:57 ` brian m. carlson 2025-05-03 1:05 ` Junio C Hamano 2025-05-03 4:21 ` Collin Funk 2025-05-03 17:32 ` Junio C Hamano 2025-05-03 3:48 ` Collin Funk 2025-05-03 13:31 ` Jeff King 2025-05-03 14:58 ` shejialuo 2025-05-03 15:49 ` Jeff King 2025-05-05 6:39 ` Patrick Steinhardt 2025-05-05 12:17 ` shejialuo 2025-05-03 18:56 ` Collin Funk 2025-05-05 15:43 ` Junio C Hamano 2025-05-05 18:03 ` Jeff King 2025-05-06 13:43 ` shejialuo 2025-05-06 22:58 ` Junio C Hamano 2025-05-03 4:16 ` [PATCH v2] wrapper: NetBSD gives EFTYPE and FreeBSD gives EMFILE where POSIX uses ELOOP Collin Funk 2025-05-03 15:45 ` brian m. carlson 2025-05-03 18:44 ` Collin Funk 2025-05-05 6:43 ` Patrick Steinhardt 2025-05-05 20:41 ` Junio C Hamano 2025-05-06 1:16 ` Collin Funk 2025-05-06 13:23 ` Patrick Steinhardt 2025-05-06 1:08 ` [PATCH v3] " Collin Funk 2025-05-06 13:24 ` Patrick Steinhardt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).