From mboxrd@z Thu Jan  1 00:00:00 1970
From: Florian Weimer
Subject: Re: [PATCH v3 0/5] Add a new fchmodat4() syscall
Date: Tue, 11 Jul 2023 14:24:51 +0200
Message-ID: <87lefmbppo.fsf@oldenburg.str.redhat.com>
References: <87o8pscpny.fsf@oldenburg2.str.redhat.com>
In-Reply-To: (Alexey Gladkov's message of "Tue, 11 Jul 2023 13:25:41 +0200")
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: Alexey Gladkov
Cc: LKML, Arnd Bergmann, linux-api@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	James.Bottomley@HansenPartnership.com, acme@kernel.org,
	alexander.shishkin@linux.intel.com, axboe@kernel.dk,
	benh@kernel.crashing.org, borntraeger@de.ibm.com, bp@alien8.de,
	catalin.marinas@arm.com, christian@brauner.io, dalias@libc.org,
	davem@davemloft.net, deepa.kernel@gmail.com, deller@gmx.de,
	dhowells@redhat.com, fenghua.yu@intel.com, geert@linux-m68k.org,
	glebfm@altlinux.org, gor@linux.ibm.com, hare@suse.com, hpa@zytor.com,
	ink@jurassic.park.msu.ru, jhogan@kernel.org, kim.phillips@arm.com,
	ldv@altlinux.org, linux-alpha@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-ia64@vger.kernel.org, l

* Alexey Gladkov:

> This patch set adds fchmodat4(), a new syscall. The actual
> implementation is super simple: essentially it's just the same as
> fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags.
> I've attempted to make this match "man 2 fchmodat" as closely as
> possible, which says EINVAL is returned for invalid flags (as opposed to
> ENOTSUPP, which is currently returned by glibc for AT_SYMLINK_NOFOLLOW).
> I have a sketch of a glibc patch that I haven't even compiled yet, but
> seems fairly straight-forward:
>
> diff --git a/sysdeps/unix/sysv/linux/fchmodat.c b/sysdeps/unix/sysv/linux/fchmodat.c
> index 6d9cbc1ce9e0..b1beab76d56c 100644
> --- a/sysdeps/unix/sysv/linux/fchmodat.c
> +++ b/sysdeps/unix/sysv/linux/fchmodat.c
> @@ -29,12 +29,36 @@
>  int
>  fchmodat (int fd, const char *file, mode_t mode, int flag)
>  {
> -  if (flag & ~AT_SYMLINK_NOFOLLOW)
> -    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
> -#ifndef __NR_lchmod /* Linux so far has no lchmod syscall. */
> +  /* There are four paths through this code:
> +     - The flags are zero. In this case it's fine to call fchmodat.
> +     - The flags are non-zero and glibc doesn't have access to
> +       __NR_fchmodat4. In this case all we can do is emulate the error codes
> +       defined by the glibc interface from userspace.
> +     - The flags are non-zero, glibc has __NR_fchmodat4, and the kernel has
> +       fchmodat4. This is the simplest case, as the fchmodat4 syscall exactly
> +       matches glibc's library interface so it can be called directly.
> +     - The flags are non-zero, glibc has __NR_fchmodat4, but the kernel does

If you define __NR_fchmodat4 on all architectures, we can use these
constants directly in glibc. We no longer depend on the UAPI
definitions of those constants, to cut down the number of code variants,
and to make glibc's system call profile independent of the kernel header
version at build time.

Your version is based on 2.31, more recent versions have some reasonable
emulation for fchmodat based on /proc/self/fd.
I even wrote a comment describing the same buggy behavior that you
witnessed:

+  /* Some Linux versions with some file systems can actually
+     change symbolic link permissions via /proc, but this is not
+     intentional, and it gives inconsistent results (e.g., error
+     return despite mode change). The expected behavior is that
+     symbolic link modes cannot be changed at all, and this check
+     enforces that. */
+  if (S_ISLNK (st.st_mode))
+    {
+      __close_nocancel (pathfd);
+      __set_errno (EOPNOTSUPP);
+      return -1;
+    }

I think there was some kernel discussion about that behavior before, but
apparently, it hasn't led to fixes. I wonder if it makes sense to add a
similar error return to the system call implementation?

> +       not. In this case we must respect the error codes defined by the glibc
> +       interface instead of returning ENOSYS.
> +     The intent here is to ensure that the kernel is called at most once per
> +     library call, and that the error types defined by glibc are always
> +     respected. */
> +
> +#ifdef __NR_fchmodat4
> +  long result;
> +#endif
> +
> +  if (flag == 0)
> +    return INLINE_SYSCALL (fchmodat, 3, fd, file, mode);
> +
> +#ifdef __NR_fchmodat4
> +  result = INLINE_SYSCALL (fchmodat4, 4, fd, file, mode, flag);
> +  if (result == 0 || errno != ENOSYS)
> +    return result;
> +#endif

The last if condition is the recommended approach, but in the past, it
broke container host compatibility pretty badly due to seccomp filters
that return EPERM instead of ENOSYS. I guess we'll learn soon enough if
that's been fixed by now.
8-P

Thanks,
Florian
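
For readers following the thread, the flow discussed above (try the new
call, fall back only on ENOSYS, otherwise emulate through /proc/self/fd
and refuse symbolic links) could be sketched roughly as below. This is
an illustration under stated assumptions, not glibc's actual code: the
names fchmodat_sketch, chmod_nofollow_sketch, fchmodat4_fn, stub_enosys
and stub_eperm are all hypothetical, and the proposed fchmodat4 call is
modelled as a function pointer because no system call number is given
in the thread. It assumes Linux with /proc mounted.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for the proposed fchmodat4 system call.  */
typedef int (*fchmodat4_fn) (int dirfd, const char *path, mode_t mode,
                             int flags);

/* Stub modelling a kernel that lacks the new call.  */
int
stub_enosys (int dirfd, const char *path, mode_t mode, int flags)
{
  (void) dirfd; (void) path; (void) mode; (void) flags;
  errno = ENOSYS;
  return -1;
}

/* Stub modelling a seccomp filter that answers EPERM for unknown calls.  */
int
stub_eperm (int dirfd, const char *path, mode_t mode, int flags)
{
  (void) dirfd; (void) path; (void) mode; (void) flags;
  errno = EPERM;
  return -1;
}

/* Userspace emulation via /proc/self/fd (illustrative only).  */
int
chmod_nofollow_sketch (int dirfd, const char *path, mode_t mode)
{
  /* O_PATH names the file without opening it; with O_NOFOLLOW the
     descriptor refers to the link itself if PATH is a symbolic link.  */
  int pathfd = openat (dirfd, path, O_PATH | O_NOFOLLOW | O_CLOEXEC);
  if (pathfd < 0)
    return -1;

  struct stat st;
  int ret = fstat (pathfd, &st);
  if (ret == 0 && S_ISLNK (st.st_mode))
    {
      /* Same refusal as the quoted check: link modes stay untouched.  */
      errno = EOPNOTSUPP;
      ret = -1;
    }
  if (ret == 0)
    {
      char buf[32];
      snprintf (buf, sizeof buf, "/proc/self/fd/%d", pathfd);
      ret = chmod (buf, mode);
    }
  int saved_errno = errno;
  close (pathfd);
  errno = saved_errno;
  return ret;
}

int
fchmodat_sketch (int dirfd, const char *path, mode_t mode, int flags,
                 fchmodat4_fn newer)
{
  assert (path != NULL);
  if (flags == 0)
    return fchmodat (dirfd, path, mode, 0);
  if (flags & ~AT_SYMLINK_NOFOLLOW)
    {
      errno = EINVAL;
      return -1;
    }
  if (newer != NULL)
    {
      int ret = newer (dirfd, path, mode, flags);
      /* Only ENOSYS triggers the fallback.  Any other error, including
         EPERM from a seccomp filter, goes straight back to the caller.  */
      if (ret == 0 || errno != ENOSYS)
        return ret;
    }
  return chmod_nofollow_sketch (dirfd, path, mode);
}
```

Passing stub_enosys as the last argument exercises the emulation path,
while stub_eperm shows the compatibility problem: the caller sees EPERM
and the emulation is never tried.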