* [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
@ 2025-04-29 16:37 David Howells
2025-04-29 17:35 ` Jeffrey E Altman
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: David Howells @ 2025-04-29 16:37 UTC (permalink / raw)
To: Alexander Viro, Christian Brauner
Cc: dhowells, Etienne Champetier, Marc Dionne, Jeffrey Altman,
Chet Ramey, Steve French, linux-afs, openafs-devel, linux-cifs,
linux-fsdevel, linux-kernel
Bash has a work around in redir_open() that causes open(O_CREAT) of a file
in a sticky directory to be retried without O_CREAT if bash was built with
AFS workarounds configured:
#if defined (AFS)
if ((fd < 0) && (errno == EACCES))
{
fd = open (filename, flags & ~O_CREAT, mode);
errno = EACCES; /* restore errno */
}
#endif /* AFS */
This works around the kernel not being able to validly check the
current_fsuid() against i_uid on the file or the directory because the
uidspaces of the system and of AFS may well be disjoint. The problem lies
with the uid checks in may_create_in_sticky().
However, the bash work around is going to be removed:
https://git.savannah.gnu.org/cgit/bash.git/tree/redir.c?h=bash-5.3-rc1#n733
Fix this in the kernel by:
(1) Provide an ->is_owned_by_me() inode op, similar to ->permission(),
and, if provided, call that to determine if the caller owns the file
instead of checking the i_uid to current_fsuid().
(2) Provide a ->have_same_owner() inode op, that, if provided, can be
called to see if an inode has the same owner as the parent on the path
walked.
For kafs, use the first hook to check to see if the server indicated the
ADMINISTER bit in the access rights returned by the FS.FetchStatus and
suchlike and the second hook to compare the AFS user IDs retrieved by
FS.FetchStatus (which may not fit in a kuid if AuriStor's YFS variant).
This can be tested by creating a sticky directory (the user must have a
token to do this) and creating a file in it. Then strace bash doing "echo
foo >>file" and look at whether bash does a single, successful O_CREAT open
on the file or whether that one fails and then bash does one without
O_CREAT that succeeds.
Reported-by: Etienne Champetier <champetier.etienne@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Chet Ramey <chet.ramey@case.edu>
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Christian Brauner <brauner@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: linux-afs@lists.infradead.org
cc: openafs-devel@openafs.org
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
fs/afs/dir.c | 2 ++
fs/afs/file.c | 2 ++
fs/afs/internal.h | 3 +++
fs/afs/security.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/namei.c | 22 ++++++++++++++++++----
include/linux/fs.h | 3 +++
6 files changed, 80 insertions(+), 4 deletions(-)
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 9e7b1fe82c27..6360db1673b0 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -65,6 +65,8 @@ const struct inode_operations afs_dir_inode_operations = {
.permission = afs_permission,
.getattr = afs_getattr,
.setattr = afs_setattr,
+ .is_owned_by_me = afs_is_owned_by_me,
+ .have_same_owner = afs_have_same_owner,
};
const struct address_space_operations afs_dir_aops = {
diff --git a/fs/afs/file.c b/fs/afs/file.c
index fc15497608c6..0317f0a36cf2 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -47,6 +47,8 @@ const struct inode_operations afs_file_inode_operations = {
.getattr = afs_getattr,
.setattr = afs_setattr,
.permission = afs_permission,
+ .is_owned_by_me = afs_is_owned_by_me,
+ .have_same_owner = afs_have_same_owner,
};
const struct address_space_operations afs_file_aops = {
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 440b0e731093..fbfbf615abe3 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1495,6 +1495,9 @@ extern struct key *afs_request_key(struct afs_cell *);
extern struct key *afs_request_key_rcu(struct afs_cell *);
extern int afs_check_permit(struct afs_vnode *, struct key *, afs_access_t *);
extern int afs_permission(struct mnt_idmap *, struct inode *, int);
+int afs_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode);
+int afs_have_same_owner(struct mnt_idmap *idmap, struct inode *inode,
+ struct dentry *dentry);
extern void __exit afs_clean_up_permit_cache(void);
/*
diff --git a/fs/afs/security.c b/fs/afs/security.c
index 6a7744c9e2a2..cc9689fbed47 100644
--- a/fs/afs/security.c
+++ b/fs/afs/security.c
@@ -477,6 +477,58 @@ int afs_permission(struct mnt_idmap *idmap, struct inode *inode,
return ret;
}
+/*
+ * Determine if an inode is owned by 'me' - whatever that means for the
+ * filesystem. In the case of AFS, this means that the file is owned by the
+ * AFS user represented by the token (e.g. from a kerberos server) held in a
+ * key. Returns 0 if owned by me, 1 if not; can also return an error.
+ */
+int afs_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode)
+{
+ struct afs_vnode *vnode = AFS_FS_I(inode);
+ afs_access_t access;
+ struct key *key;
+ int ret;
+
+ key = afs_request_key(vnode->volume->cell);
+ if (IS_ERR(key))
+ return PTR_ERR(key);
+
+ /* Get the access rights for the key on this file. */
+ ret = afs_check_permit(vnode, key, &access);
+ if (ret < 0)
+ goto error;
+
+ /* We get the ADMINISTER bit if we own the file. */
+ ret = (access & AFS_ACE_ADMINISTER) ? 0 : 1;
+error:
+ key_put(key);
+ return ret;
+}
+
+/*
+ * Determine if a file has the same owner as its parent - whatever that means
+ * for the filesystem. In the case of AFS, this means comparing their AFS
+ * UIDs. Returns 0 if same, 1 if not same; can also return an error.
+ */
+int afs_have_same_owner(struct mnt_idmap *idmap, struct inode *inode,
+ struct dentry *dentry)
+{
+ struct afs_vnode *vnode = AFS_FS_I(inode);
+ struct dentry *parent;
+ s64 owner;
+
+ /* Get the owner's ID for the directory. Ideally, we'd use RCU to
+ * access the parent rather than getting a ref.
+ */
+ parent = dget_parent(dentry);
+ vnode = AFS_FS_I(d_backing_inode(parent));
+ owner = vnode->status.owner;
+ dput(parent);
+
+ return vnode->status.owner != owner;
+}
+
void __exit afs_clean_up_permit_cache(void)
{
int i;
diff --git a/fs/namei.c b/fs/namei.c
index 84a0e0b0111c..79e8ef1b6900 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1318,11 +1318,25 @@ static int may_create_in_sticky(struct mnt_idmap *idmap, struct nameidata *nd,
i_vfsuid = i_uid_into_vfsuid(idmap, inode);
- if (vfsuid_eq(i_vfsuid, dir_vfsuid))
- return 0;
+ if (unlikely(inode->i_op->have_same_owner)) {
+ int ret = inode->i_op->have_same_owner(idmap, inode, nd->path.dentry);
- if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
- return 0;
+ if (ret <= 0)
+ return ret;
+ } else {
+ if (vfsuid_eq(i_vfsuid, dir_vfsuid))
+ return 0;
+ }
+
+ if (unlikely(inode->i_op->is_owned_by_me)) {
+ int ret = inode->i_op->is_owned_by_me(idmap, inode);
+
+ if (ret <= 0)
+ return ret;
+ } else {
+ if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
+ return 0;
+ }
if (likely(dir_mode & 0002)) {
audit_log_path_denied(AUDIT_ANOM_CREAT, "sticky_create");
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 016b0fe1536e..ec278d2d362a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2236,6 +2236,9 @@ struct inode_operations {
struct dentry *dentry, struct fileattr *fa);
int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);
struct offset_ctx *(*get_offset_ctx)(struct inode *inode);
+ int (*is_owned_by_me)(struct mnt_idmap *idmap, struct inode *inode);
+ int (*have_same_owner)(struct mnt_idmap *idmap, struct inode *inode,
+ struct dentry *dentry);
} ____cacheline_aligned;
static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-04-29 16:37 [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir David Howells
@ 2025-04-29 17:35 ` Jeffrey E Altman
2025-04-30 15:09 ` Chet Ramey
2025-04-30 16:14 ` David Howells
2025-05-05 13:14 ` Christian Brauner
2025-05-06 10:26 ` David Howells
2 siblings, 2 replies; 16+ messages in thread
From: Jeffrey E Altman @ 2025-04-29 17:35 UTC (permalink / raw)
To: David Howells, Alexander Viro, Christian Brauner
Cc: Etienne Champetier, Marc Dionne, Chet Ramey, Steve French,
linux-afs, openafs-devel, linux-cifs, linux-fsdevel, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 10528 bytes --]
David,
Thanks for this patch. I believe there is one mistake in
afs_have_same_owner().
I've added a bit more background which might be incorporated into a
future commit message.
I have also asked inline about the safety of the use of id-mapped uids
for the ownership checks.
On 4/29/2025 12:37 PM, David Howells wrote:
>
> Bash has a work around in redir_open() that causes open(O_CREAT) of a file
> in a sticky directory to be retried without O_CREAT if bash was built with
> AFS workarounds configured:
>
> #if defined (AFS)
> if ((fd < 0) && (errno == EACCES))
> {
> fd = open (filename, flags & ~O_CREAT, mode);
> errno = EACCES; /* restore errno */
> }
>
> #endif /* AFS */
I think its worth clarifying the purpose of this fallback logic and why
it exists. The fallback
logic was added to bash 1.14.7 as part of the introduction of support
for IBM/Transarc AFS 3.4.
It was noted that sometimes EEXIST would be returned from open(filename,
flags | O_CREAT)
but would succeed if open(filename, flags & ~O_CREAT) was called. There
is no evidence that
the AFS developers were aware of the problem.
I can report that the cause of this behavior is the AFS fileserver's
checking for PRSFS_INSERT
rights on the directory prior to performing the CreateFile action. If
the caller is not permitted
to create a directory entry the action fails with EACCES even if the
requested filename already
exists. The most recent versions of IBM AFS 3.6, OpenAFS, and
AuriStorFS fileservers all
continue to exhibit this behavior today.
This logic predated 30aba6656f61ed44cba445a3c0d38b296fa9e8f5 ("namei:
allow restricted
O_CREAT of FIFOs and regular files") by decades. As a side effect, when
the filesystem is an
in-tree or out-of-tree AFS-family filesystem ...
> This works around the kernel not being able to validly check the
> current_fsuid() against i_uid on the file or the directory because the
> uidspaces of the system and of AFS may well be disjoint. The problem lies
> with the uid checks in may_create_in_sticky().
>
> However, the bash work around is going to be removed:
>
> https://git.savannah.gnu.org/cgit/bash.git/tree/redir.c?h=bash-5.3-rc1#n733
>
> Fix this in the kernel by:
>
> (1) Provide an ->is_owned_by_me() inode op, similar to ->permission(),
> and, if provided, call that to determine if the caller owns the file
> instead of checking the i_uid to current_fsuid().
>
> (2) Provide a ->have_same_owner() inode op, that, if provided, can be
> called to see if an inode has the same owner as the parent on the path
> walked.
>
> For kafs, use the first hook to check to see if the server indicated the
> ADMINISTER bit in the access rights returned by the FS.FetchStatus and
> suchlike
The AFSFetchStatus.CallerAccess field returned for a non-directory
object includes the
PRSFS_ADMINISTER bit set if the caller is authenticated and the
authenticated identity
is the AFS ID returned in the AFSFetchStatus.Owner field.
> and the second hook to compare the AFS user IDs retrieved by
> FS.FetchStatus (which may not fit in a kuid if AuriStor's YFS variant).
Perhaps more importantly, the AFS IDs specified in the
AFSFetchStatus.Owner field often
have no relationship to the local system's uid namespace and might
collide with various
uid ranges which might be used for id-mapping by container managers.
> This can be tested by creating a sticky directory (the user must have a
> token to do this) and creating a file in it. Then strace bash doing "echo
> foo >>file" and look at whether bash does a single, successful O_CREAT open
> on the file or whether that one fails and then bash does one without
> O_CREAT that succeeds.
>
> Reported-by: Etienne Champetier <champetier.etienne@gmail.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: Jeffrey Altman <jaltman@auristor.com>
> cc: Chet Ramey <chet.ramey@case.edu>
> cc: Alexander Viro <viro@zeniv.linux.org.uk>
> cc: Christian Brauner <brauner@kernel.org>
> cc: Steve French <sfrench@samba.org>
> cc: linux-afs@lists.infradead.org
> cc: openafs-devel@openafs.org
> cc: linux-cifs@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---
> fs/afs/dir.c | 2 ++
> fs/afs/file.c | 2 ++
> fs/afs/internal.h | 3 +++
> fs/afs/security.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> fs/namei.c | 22 ++++++++++++++++++----
> include/linux/fs.h | 3 +++
> 6 files changed, 80 insertions(+), 4 deletions(-)
>
> diff --git a/fs/afs/dir.c b/fs/afs/dir.c
> index 9e7b1fe82c27..6360db1673b0 100644
> --- a/fs/afs/dir.c
> +++ b/fs/afs/dir.c
> @@ -65,6 +65,8 @@ const struct inode_operations afs_dir_inode_operations = {
> .permission = afs_permission,
> .getattr = afs_getattr,
> .setattr = afs_setattr,
> + .is_owned_by_me = afs_is_owned_by_me,
> + .have_same_owner = afs_have_same_owner,
> };
>
> const struct address_space_operations afs_dir_aops = {
> diff --git a/fs/afs/file.c b/fs/afs/file.c
> index fc15497608c6..0317f0a36cf2 100644
> --- a/fs/afs/file.c
> +++ b/fs/afs/file.c
> @@ -47,6 +47,8 @@ const struct inode_operations afs_file_inode_operations = {
> .getattr = afs_getattr,
> .setattr = afs_setattr,
> .permission = afs_permission,
> + .is_owned_by_me = afs_is_owned_by_me,
> + .have_same_owner = afs_have_same_owner,
> };
>
> const struct address_space_operations afs_file_aops = {
> diff --git a/fs/afs/internal.h b/fs/afs/internal.h
> index 440b0e731093..fbfbf615abe3 100644
> --- a/fs/afs/internal.h
> +++ b/fs/afs/internal.h
> @@ -1495,6 +1495,9 @@ extern struct key *afs_request_key(struct afs_cell *);
> extern struct key *afs_request_key_rcu(struct afs_cell *);
> extern int afs_check_permit(struct afs_vnode *, struct key *, afs_access_t *);
> extern int afs_permission(struct mnt_idmap *, struct inode *, int);
> +int afs_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode);
> +int afs_have_same_owner(struct mnt_idmap *idmap, struct inode *inode,
> + struct dentry *dentry);
> extern void __exit afs_clean_up_permit_cache(void);
>
> /*
> diff --git a/fs/afs/security.c b/fs/afs/security.c
> index 6a7744c9e2a2..cc9689fbed47 100644
> --- a/fs/afs/security.c
> +++ b/fs/afs/security.c
> @@ -477,6 +477,58 @@ int afs_permission(struct mnt_idmap *idmap, struct inode *inode,
> return ret;
> }
>
> +/*
> + * Determine if an inode is owned by 'me' - whatever that means for the
> + * filesystem. In the case of AFS, this means that the file is owned by the
> + * AFS user represented by the token (e.g. from a kerberos server) held in a
> + * key. Returns 0 if owned by me, 1 if not; can also return an error.
> + */
> +int afs_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode)
> +{
> + struct afs_vnode *vnode = AFS_FS_I(inode);
> + afs_access_t access;
> + struct key *key;
> + int ret;
> +
> + key = afs_request_key(vnode->volume->cell);
> + if (IS_ERR(key))
> + return PTR_ERR(key);
> +
> + /* Get the access rights for the key on this file. */
> + ret = afs_check_permit(vnode, key, &access);
> + if (ret < 0)
> + goto error;
> +
> + /* We get the ADMINISTER bit if we own the file. */
> + ret = (access & AFS_ACE_ADMINISTER) ? 0 : 1;
> +error:
> + key_put(key);
> + return ret;
> +}
> +
> +/*
> + * Determine if a file has the same owner as its parent - whatever that means
> + * for the filesystem. In the case of AFS, this means comparing their AFS
> + * UIDs. Returns 0 if same, 1 if not same; can also return an error.
> + */
> +int afs_have_same_owner(struct mnt_idmap *idmap, struct inode *inode,
> + struct dentry *dentry)
> +{
> + struct afs_vnode *vnode = AFS_FS_I(inode);
> + struct dentry *parent;
> + s64 owner;
> +
> + /* Get the owner's ID for the directory. Ideally, we'd use RCU to
> + * access the parent rather than getting a ref.
> + */
> + parent = dget_parent(dentry);
> + vnode = AFS_FS_I(d_backing_inode(parent));
This line is overwriting 'vnode' with the parent. I think there needs
to be a separate pvnode or something.
> + owner = vnode->status.owner;
> + dput(parent);
> +
> + return vnode->status.owner != owner;
> +}
> +
> void __exit afs_clean_up_permit_cache(void)
> {
> int i;
> diff --git a/fs/namei.c b/fs/namei.c
> index 84a0e0b0111c..79e8ef1b6900 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1318,11 +1318,25 @@ static int may_create_in_sticky(struct mnt_idmap *idmap, struct nameidata *nd,
>
> i_vfsuid = i_uid_into_vfsuid(idmap, inode);
Unrelated to this change but is use of id-mapped uid values safe for
this purpose?
Isn't it possible for more than one uid to be mapped to the same vfsuid
value?=
> - if (vfsuid_eq(i_vfsuid, dir_vfsuid))
> - return 0;
> + if (unlikely(inode->i_op->have_same_owner)) {
> + int ret = inode->i_op->have_same_owner(idmap, inode, nd->path.dentry);
>
> - if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
> - return 0;
> + if (ret <= 0)
> + return ret;
> + } else {
> + if (vfsuid_eq(i_vfsuid, dir_vfsuid))
> + return 0;
> + }
> +
> + if (unlikely(inode->i_op->is_owned_by_me)) {
> + int ret = inode->i_op->is_owned_by_me(idmap, inode);
> +
> + if (ret <= 0)
> + return ret;
> + } else {
> + if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
> + return 0;
> + }
>
> if (likely(dir_mode & 0002)) {
> audit_log_path_denied(AUDIT_ANOM_CREAT, "sticky_create");
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 016b0fe1536e..ec278d2d362a 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2236,6 +2236,9 @@ struct inode_operations {
> struct dentry *dentry, struct fileattr *fa);
> int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);
> struct offset_ctx *(*get_offset_ctx)(struct inode *inode);
> + int (*is_owned_by_me)(struct mnt_idmap *idmap, struct inode *inode);
> + int (*have_same_owner)(struct mnt_idmap *idmap, struct inode *inode,
> + struct dentry *dentry);
> } ____cacheline_aligned;
>
> static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
Jeffrey Altman
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4276 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-04-29 17:35 ` Jeffrey E Altman
@ 2025-04-30 15:09 ` Chet Ramey
2025-04-30 16:14 ` David Howells
1 sibling, 0 replies; 16+ messages in thread
From: Chet Ramey @ 2025-04-30 15:09 UTC (permalink / raw)
To: Jeffrey E Altman, David Howells, Alexander Viro,
Christian Brauner
Cc: chet.ramey, Etienne Champetier, Marc Dionne, Steve French,
linux-afs, openafs-devel, linux-cifs, linux-fsdevel, linux-kernel
On 4/29/25 1:35 PM, Jeffrey E Altman wrote:
> I think its worth clarifying the purpose of this fallback logic and why it
> exists. The fallback
> logic was added to bash 1.14.7 as part of the introduction of support for
> IBM/Transarc AFS 3.4.
The chronology is wrong. The workaround came in in January, 1992, when
bash-1.11 was current and IBM released AFS 3.1. (The bug was actually
encountered with bash-1.08.)
The old code, without the workaround, caused widespread mail delivery
failures at CMU, who reported the problem to me and (they claimed at the
time) IBM, and provided the patch.
> It was noted that sometimes EEXIST would be returned from open(filename,
> flags | O_CREAT)
> but would succeed if open(filename, flags & ~O_CREAT) was called. There is
> no evidence that
> the AFS developers were aware of the problem.
Well, except for CMU's report.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-04-29 17:35 ` Jeffrey E Altman
2025-04-30 15:09 ` Chet Ramey
@ 2025-04-30 16:14 ` David Howells
2025-04-30 17:26 ` Jeffrey E Altman
1 sibling, 1 reply; 16+ messages in thread
From: David Howells @ 2025-04-30 16:14 UTC (permalink / raw)
To: chet.ramey
Cc: dhowells, Jeffrey E Altman, Alexander Viro, Christian Brauner,
Etienne Champetier, Marc Dionne, Steve French, linux-afs,
openafs-devel, linux-cifs, linux-fsdevel, linux-kernel
Chet Ramey <chet.ramey@case.edu> wrote:
> Well, except for CMU's report.
Do you know of any link for that? I'm guessing that is it was 1992, there may
be no online record of it.
David
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-04-30 16:14 ` David Howells
@ 2025-04-30 17:26 ` Jeffrey E Altman
2025-04-30 18:36 ` Chet Ramey
0 siblings, 1 reply; 16+ messages in thread
From: Jeffrey E Altman @ 2025-04-30 17:26 UTC (permalink / raw)
To: David Howells, chet.ramey
Cc: Alexander Viro, Christian Brauner, Etienne Champetier,
Marc Dionne, Steve French, linux-afs, openafs-devel, linux-cifs,
linux-fsdevel, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 331 bytes --]
On 4/30/2025 12:14 PM, David Howells wrote:
> Chet Ramey <chet.ramey@case.edu> wrote:
>
>> Well, except for CMU's report.
> Do you know of any link for that? I'm guessing that is it was 1992, there may
> be no online record of it.
>
> David
https://groups.google.com/g/gnu.bash.bug/c/6PPTfOgFdL4/m/2AQU-S1N76UJ?hl=en
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4276 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-04-30 17:26 ` Jeffrey E Altman
@ 2025-04-30 18:36 ` Chet Ramey
0 siblings, 0 replies; 16+ messages in thread
From: Chet Ramey @ 2025-04-30 18:36 UTC (permalink / raw)
To: Jeffrey E Altman, David Howells
Cc: chet.ramey, Alexander Viro, Christian Brauner, Etienne Champetier,
Marc Dionne, Steve French, linux-afs, openafs-devel, linux-cifs,
linux-fsdevel, linux-kernel
On 4/30/25 1:26 PM, Jeffrey E Altman wrote:
> On 4/30/2025 12:14 PM, David Howells wrote:
>> Chet Ramey <chet.ramey@case.edu> wrote:
>>
>>> Well, except for CMU's report.
>> Do you know of any link for that? I'm guessing that is it was 1992,
>> there may
>> be no online record of it.
>>
>> David
>
> https://groups.google.com/g/gnu.bash.bug/c/6PPTfOgFdL4/m/2AQU-S1N76UJ?hl=en
Which of course just claims they reported it, but doesn't include the
report itself.
But Jeffrey's message seems to indicate that IBM addressed this particular
issue in AFS 3.2.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-04-29 16:37 [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir David Howells
2025-04-29 17:35 ` Jeffrey E Altman
@ 2025-05-05 13:14 ` Christian Brauner
[not found] ` <CAOdf3grbDQ-Fh2bV7XfoYvVBhgBAh7-hZyyxTNt1RfGekrA-nA@mail.gmail.com>
2025-05-14 12:49 ` Chet Ramey
2025-05-06 10:26 ` David Howells
2 siblings, 2 replies; 16+ messages in thread
From: Christian Brauner @ 2025-05-05 13:14 UTC (permalink / raw)
To: David Howells
Cc: Alexander Viro, Etienne Champetier, Marc Dionne, Jeffrey Altman,
Chet Ramey, Steve French, linux-afs, openafs-devel, linux-cifs,
linux-fsdevel, linux-kernel
On Tue, Apr 29, 2025 at 05:37:31PM +0100, David Howells wrote:
>
> Bash has a work around in redir_open() that causes open(O_CREAT) of a file
> in a sticky directory to be retried without O_CREAT if bash was built with
> AFS workarounds configured:
>
> #if defined (AFS)
> if ((fd < 0) && (errno == EACCES))
> {
> fd = open (filename, flags & ~O_CREAT, mode);
> errno = EACCES; /* restore errno */
> }
>
> #endif /* AFS */
>
> This works around the kernel not being able to validly check the
> current_fsuid() against i_uid on the file or the directory because the
> uidspaces of the system and of AFS may well be disjoint. The problem lies
> with the uid checks in may_create_in_sticky().
>
> However, the bash work around is going to be removed:
Why is it removed? That's a very strange comment:
#if 0 /* reportedly no longer needed */
So then just don't remove it. I don't see a reason for us to workaround
userspace creating a bug for itself and forcing us to add two new inode
operations to work around it.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
[not found] ` <CAOdf3grbDQ-Fh2bV7XfoYvVBhgBAh7-hZyyxTNt1RfGekrA-nA@mail.gmail.com>
@ 2025-05-05 14:42 ` Jeffrey E Altman
2025-05-14 12:50 ` Chet Ramey
0 siblings, 1 reply; 16+ messages in thread
From: Jeffrey E Altman @ 2025-05-05 14:42 UTC (permalink / raw)
To: Etienne Champetier, Christian Brauner
Cc: David Howells, Alexander Viro, Marc Dionne, Chet Ramey,
Steve French, linux-fsdevel, Linux AFS mailing list, linux-kernel,
linux-cifs, openafs-devel@openafs.org
[-- Attachment #1: Type: text/plain, Size: 3353 bytes --]
On 5/5/2025 10:02 AM, Etienne Champetier wrote:
> Hello,
>
> Removing lists, feel free to add them back
>
> Le lun. 5 mai 2025 à 09:14, Christian Brauner <brauner@kernel.org> a écrit :
>> Why is it removed? That's a very strange comment:
>>
>> #if 0 /* reportedly no longer needed */
>>
>> So then just don't remove it. I don't see a reason for us to workaround
>> userspace creating a bug for itself and forcing us to add two new inode
>> operations to work around it.
> This bash workaround introduced ages ago for AFS bypass fs.protected_regular
Chet, I don't think this history is correct. The bash workaround was
introduced in 1992 to workaround a behavior when appending to restricted
access directories stored in IBM AFS 3.1[1] and the Linux kernel's
30aba6656f61ed44cba445a3c0d38b296fa9e8f5 wasn't added until 2018.
IBM AFS 3.2 addressed the narrow use case described by the bug report by
implementing a potentially racy change to the AFS cache manager and
failing to address the server side. However, that is out of scope for
this discussion. To the extent that there is a bug in one or more of
the AFS server implementations it should be fixed there.
The bash fallback logic to retry the open without O_CREAT introduces a
bypass for the kernel mode protection provided by 30aba6656f61 and
should be removed.
Christian,
It just so happens that the workaround added to bash in 1992 masks an
incompatibility introduced by 30aba6656f61 when the backing filesystem
is "afs" because the ownership checks required by may_create_in_sticky()
cannot be reliably performed based upon the kernel's local knowledge of
the uids. Ownership checks in "afs" are performed by the fileserver's
evaluation of the caller's rxgk or rxkad security tokens and not by use
of uids. This incompatibility was only noticed after Red Hat began
enabling fs.protected_regular by default and bash removed the fallback
logic in the proposed 5.3 release candidates.
The proposed inode operations are to permit filesystems such as AFS
which cannot rely upon the kernel's local uid knowledge to perform the
required the ownership checks to perform those checks via another
mechanism. In the case of AFS, the fileserver already conveys the
answer to the "is inode owned by me?" question as part of its delivery
of caller access rights (AFSFetchStatus.CallerAccess). The answer to
the "do these two inodes have the same owner?" question can be
determined via comparison of the AFSFetchStatus.Owner fields for each
inode which belong to a uid namespace that is specific the the AFS cell
in which the inodes are stored. When performing this ownership check
for network filesystems I do not believe it is safe to assume that the
uid namespace of the network filesystem is identical to the local
machine's uid namespace. I think it would be safer for all network
filesystems to answer the ownership questions using network uid values
instead of local uid values when available.
I'm also concerned about using id-mapped values for this comparison
because there is no restriction preventing two distinct id values from
being mapped to the same id.
Sincerely,
Jeffrey Altman
[1] https://groups.google.com/g/gnu.bash.bug/c/6PPTfOgFdL4/m/2AQU-S1N76UJ
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4276 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-04-29 16:37 [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir David Howells
2025-04-29 17:35 ` Jeffrey E Altman
2025-05-05 13:14 ` Christian Brauner
@ 2025-05-06 10:26 ` David Howells
2025-05-09 10:33 ` Christian Brauner
2025-05-12 13:02 ` [PATCH v2] " David Howells
2 siblings, 2 replies; 16+ messages in thread
From: David Howells @ 2025-05-06 10:26 UTC (permalink / raw)
To: Christian Brauner
Cc: dhowells, Alexander Viro, Etienne Champetier, Marc Dionne,
Jeffrey Altman, Chet Ramey, Steve French, linux-afs,
openafs-devel, linux-cifs, linux-fsdevel, linux-kernel
Christian Brauner <brauner@kernel.org> wrote:
> > However, the bash work around is going to be removed:
>
> Why is it removed? That's a very strange comment:
Because it makes bash output redirection work differently to other programs, I
would guess. It's actually a simple security check to work around (just retry
the open() with O_CREAT dropped) - however, it does expose an... error, I
suppose, in the Linux kernel: namely that the VFS itself is treating foreign
files as if they had local system ownership.
We have the ->permission() inode op for this reason (I presume) - but that
only applies to certain checks. The VFS must not assume that it can interpret
i_uid and i_gid on an inode and must not assume that it can compare them to
current->fsuid and current->fs_gid.
Now, in my patch, I added two inode ops because they VFS code involved makes
two distinct evaluations and so I made an op for each and, as such, those
evaluations may be applicable elsewhere, but I could make a combined op that
handles that specific situation instead.
David
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-05-06 10:26 ` David Howells
@ 2025-05-09 10:33 ` Christian Brauner
2025-05-12 13:02 ` [PATCH v2] " David Howells
1 sibling, 0 replies; 16+ messages in thread
From: Christian Brauner @ 2025-05-09 10:33 UTC (permalink / raw)
To: David Howells
Cc: Alexander Viro, Etienne Champetier, Marc Dionne, Jeffrey Altman,
Chet Ramey, Steve French, linux-afs, openafs-devel, linux-cifs,
linux-fsdevel, linux-kernel
On Tue, May 06, 2025 at 11:26:30AM +0100, David Howells wrote:
> Christian Brauner <brauner@kernel.org> wrote:
>
> > > However, the bash work around is going to be removed:
> >
> > Why is it removed? That's a very strange comment:
>
> Because it makes bash output redirection work differently to other programs, I
> would guess. It's actually a simple security check to work around (just retry
> the open() with O_CREAT dropped) - however, it does expose an... error, I
> suppose, in the Linux kernel: namely that the VFS itself is treating foreign
> files as if they had local system ownership.
>
> We have the ->permission() inode op for this reason (I presume) - but that
> only applies to certain checks. The VFS must not assume that it can interpret
> i_uid and i_gid on an inode and must not assume that it can compare them to
> current->fsuid and current->fs_gid.
>
> Now, in my patch, I added two inode ops because they VFS code involved makes
> two distinct evaluations and so I made an op for each and, as such, those
> evaluations may be applicable elsewhere, but I could make a combined op that
> handles that specific situation instead.
Try to make it one, please.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v2] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-05-06 10:26 ` David Howells
2025-05-09 10:33 ` Christian Brauner
@ 2025-05-12 13:02 ` David Howells
2025-05-13 7:49 ` Christian Brauner
2025-05-13 8:30 ` David Howells
1 sibling, 2 replies; 16+ messages in thread
From: David Howells @ 2025-05-12 13:02 UTC (permalink / raw)
To: Christian Brauner
Cc: dhowells, Alexander Viro, Etienne Champetier, Marc Dionne,
Jeffrey Altman, Chet Ramey, Steve French, linux-afs,
openafs-devel, linux-cifs, linux-fsdevel, linux-kernel
Christian Brauner <brauner@kernel.org> wrote:
> > Now, in my patch, I added two inode ops because they VFS code involved makes
> > two distinct evaluations and so I made an op for each and, as such, those
> > evaluations may be applicable elsewhere, but I could make a combined op that
> > handles that specific situation instead.
>
> Try to make it one, please.
Okay, see attached.
David
----
Bash has a work around in redir_open() that causes open(O_CREAT) of a file
in a sticky directory to be retried without O_CREAT if bash was built with
AFS workarounds configured:
#if defined (AFS)
if ((fd < 0) && (errno == EACCES))
{
fd = open (filename, flags & ~O_CREAT, mode);
errno = EACCES; /* restore errno */
}
#endif /* AFS */
This works around the kernel not being able to validly check the
current_fsuid() against i_uid on the file or the directory because the
uidspaces of the system and of AFS may well be disjoint. The problem lies
with the uid checks in may_create_in_sticky().
However, the bash work around is going to be removed:
https://git.savannah.gnu.org/cgit/bash.git/tree/redir.c?h=bash-5.3-rc1#n733
Fix this in the kernel by providing a ->may_create_in_sticky() inode op,
similar to ->permission(), that, if provided, is called to:
(1) see if an inode has the same owner as the parent on the path walked;
(2) determine if the caller owns the file instead of checking the i_uid to
current_fsuid().
For kafs, the hook is implemented to see if:
(1) the AFS owner IDs retrieved on the file and its parent directory by
FS.FetchStatus match;
(2) if the server set the ADMINISTER bit in the access rights returned by
the FS.FetchStatus and suchlike for the key, indicating ownership by
the user specified by the key.
(Note that the owner IDs retrieved from an AuriStor YFS server may not fit
in the kuid_t being 64-bit, so they need comparing directly).
This can be tested by creating a sticky directory (the user must have a
token to do this) and creating a file in it. Then strace bash doing "echo
foo >>file" and look at whether bash does a single, successful O_CREAT open
on the file or whether that one fails and then bash does one without
O_CREAT that succeeds.
Reported-by: Etienne Champetier <champetier.etienne@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Jeffrey Altman <jaltman@auristor.com>
cc: Chet Ramey <chet.ramey@case.edu>
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Christian Brauner <brauner@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: linux-afs@lists.infradead.org
cc: openafs-devel@openafs.org
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
fs/afs/dir.c | 1 +
fs/afs/file.c | 1 +
fs/afs/internal.h | 2 ++
fs/afs/security.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/namei.c | 17 ++++++++++++-----
include/linux/fs.h | 2 ++
6 files changed, 70 insertions(+), 5 deletions(-)
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 9e7b1fe82c27..27e565612bde 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -65,6 +65,7 @@ const struct inode_operations afs_dir_inode_operations = {
.permission = afs_permission,
.getattr = afs_getattr,
.setattr = afs_setattr,
+ .may_create_in_sticky = afs_may_create_in_sticky,
};
const struct address_space_operations afs_dir_aops = {
diff --git a/fs/afs/file.c b/fs/afs/file.c
index fc15497608c6..dff48d0adec3 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -47,6 +47,7 @@ const struct inode_operations afs_file_inode_operations = {
.getattr = afs_getattr,
.setattr = afs_setattr,
.permission = afs_permission,
+ .may_create_in_sticky = afs_may_create_in_sticky,
};
const struct address_space_operations afs_file_aops = {
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 440b0e731093..4a5bb01606a8 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1495,6 +1495,8 @@ extern struct key *afs_request_key(struct afs_cell *);
extern struct key *afs_request_key_rcu(struct afs_cell *);
extern int afs_check_permit(struct afs_vnode *, struct key *, afs_access_t *);
extern int afs_permission(struct mnt_idmap *, struct inode *, int);
+int afs_may_create_in_sticky(struct mnt_idmap *idmap, struct inode *inode,
+ struct path *path);
extern void __exit afs_clean_up_permit_cache(void);
/*
diff --git a/fs/afs/security.c b/fs/afs/security.c
index 6a7744c9e2a2..9fd6e4b5c228 100644
--- a/fs/afs/security.c
+++ b/fs/afs/security.c
@@ -477,6 +477,58 @@ int afs_permission(struct mnt_idmap *idmap, struct inode *inode,
return ret;
}
+/*
+ * Perform the ownership checks for a file in a sticky directory on AFS.
+ *
+ * In the case of AFS, this means that:
+ *
+ * (1) the file and the directory have the same AFS ownership or
+ *
+ * (2) the file is owned by the AFS user represented by the token (e.g. from a
+ * kerberos server) held in a key.
+ *
+ * Returns 0 if owned by me or has same owner as parent dir, 1 if not; can also
+ * return an error.
+ */
+int afs_may_create_in_sticky(struct mnt_idmap *idmap, struct inode *inode,
+ struct path *path)
+{
+ struct afs_vnode *dvnode, *vnode = AFS_FS_I(inode);
+ struct dentry *parent;
+ struct key *key;
+ afs_access_t access;
+ int ret;
+ s64 owner;
+
+ key = afs_request_key(vnode->volume->cell);
+ if (IS_ERR(key))
+ return PTR_ERR(key);
+
+ /* Get the owner's ID for the directory. Ideally, we'd use RCU to
+ * access the parent rather than getting a ref.
+ */
+ parent = dget_parent(path->dentry);
+ dvnode = AFS_FS_I(d_backing_inode(parent));
+ owner = dvnode->status.owner;
+ dput(parent);
+
+ if (vnode->status.owner == owner) {
+ ret = 0;
+ goto error;
+ }
+
+ /* Get the access rights for the key on this file. */
+ ret = afs_check_permit(vnode, key, &access);
+ if (ret < 0)
+ goto error;
+
+ /* We get the ADMINISTER bit if we own the file. */
+ ret = (access & AFS_ACE_ADMINISTER) ? 1 : 0;
+error:
+ key_put(key);
+ return ret;
+}
+
void __exit afs_clean_up_permit_cache(void)
{
int i;
diff --git a/fs/namei.c b/fs/namei.c
index 84a0e0b0111c..e52c91cbed2a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1316,13 +1316,20 @@ static int may_create_in_sticky(struct mnt_idmap *idmap, struct nameidata *nd,
if (S_ISFIFO(inode->i_mode) && !sysctl_protected_fifos)
return 0;
- i_vfsuid = i_uid_into_vfsuid(idmap, inode);
+ if (unlikely(inode->i_op->may_create_in_sticky)) {
+ int ret = inode->i_op->may_create_in_sticky(idmap, inode, &nd->path);
- if (vfsuid_eq(i_vfsuid, dir_vfsuid))
- return 0;
+ if (ret <= 0) /* 1 if not owned by me or by parent dir. */
+ return ret;
+ } else {
+ i_vfsuid = i_uid_into_vfsuid(idmap, inode);
- if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
- return 0;
+ if (vfsuid_eq(i_vfsuid, dir_vfsuid))
+ return 0;
+
+ if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
+ return 0;
+ }
if (likely(dir_mode & 0002)) {
audit_log_path_denied(AUDIT_ANOM_CREAT, "sticky_create");
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 016b0fe1536e..11122e169719 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2236,6 +2236,8 @@ struct inode_operations {
struct dentry *dentry, struct fileattr *fa);
int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);
struct offset_ctx *(*get_offset_ctx)(struct inode *inode);
+ int (*may_create_in_sticky)(struct mnt_idmap *idmap, struct inode *inode,
+ struct path *path);
} ____cacheline_aligned;
static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v2] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-05-12 13:02 ` [PATCH v2] " David Howells
@ 2025-05-13 7:49 ` Christian Brauner
2025-05-13 8:30 ` David Howells
1 sibling, 0 replies; 16+ messages in thread
From: Christian Brauner @ 2025-05-13 7:49 UTC (permalink / raw)
To: David Howells
Cc: Alexander Viro, Etienne Champetier, Marc Dionne, Jeffrey Altman,
Chet Ramey, Steve French, linux-afs, openafs-devel, linux-cifs,
linux-fsdevel, linux-kernel
On Mon, May 12, 2025 at 02:02:37PM +0100, David Howells wrote:
> Christian Brauner <brauner@kernel.org> wrote:
>
> > > Now, in my patch, I added two inode ops because they VFS code involved makes
> > > two distinct evaluations and so I made an op for each and, as such, those
> > > evaluations may be applicable elsewhere, but I could make a combined op that
> > > handles that specific situation instead.
> >
> > Try to make it one, please.
>
> Okay, see attached.
>
> David
> ----
> Bash has a work around in redir_open() that causes open(O_CREAT) of a file
> in a sticky directory to be retried without O_CREAT if bash was built with
> AFS workarounds configured:
>
> #if defined (AFS)
> if ((fd < 0) && (errno == EACCES))
> {
> fd = open (filename, flags & ~O_CREAT, mode);
> errno = EACCES; /* restore errno */
> }
>
> #endif /* AFS */
>
> This works around the kernel not being able to validly check the
> current_fsuid() against i_uid on the file or the directory because the
> uidspaces of the system and of AFS may well be disjoint. The problem lies
> with the uid checks in may_create_in_sticky().
>
> However, the bash work around is going to be removed:
>
> https://git.savannah.gnu.org/cgit/bash.git/tree/redir.c?h=bash-5.3-rc1#n733
>
> Fix this in the kernel by providing a ->may_create_in_sticky() inode op,
> similar to ->permission(), that, if provided, is called to:
>
> (1) see if an inode has the same owner as the parent on the path walked;
>
> (2) determine if the caller owns the file instead of checking the i_uid to
> current_fsuid().
>
> For kafs, the hook is implemented to see if:
>
> (1) the AFS owner IDs retrieved on the file and its parent directory by
> FS.FetchStatus match;
>
> (2) if the server set the ADMINISTER bit in the access rights returned by
> the FS.FetchStatus and suchlike for the key, indicating ownership by
> the user specified by the key.
>
> (Note that the owner IDs retrieved from an AuriStor YFS server may not fit
> in the kuid_t being 64-bit, so they need comparing directly).
There's a few other places where we compare vfsuids:
* may_delete()
-> check_sticky()
-> __check_sticky()
* may_follow_link()
* may_linkat()
* fsuidgid_has_mapping()
Anyone of those need special treatment on AFS as well?
> This can be tested by creating a sticky directory (the user must have a
> token to do this) and creating a file in it. Then strace bash doing "echo
> foo >>file" and look at whether bash does a single, successful O_CREAT open
> on the file or whether that one fails and then bash does one without
> O_CREAT that succeeds.
>
> Reported-by: Etienne Champetier <champetier.etienne@gmail.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Marc Dionne <marc.dionne@auristor.com>
> cc: Jeffrey Altman <jaltman@auristor.com>
> cc: Chet Ramey <chet.ramey@case.edu>
> cc: Alexander Viro <viro@zeniv.linux.org.uk>
> cc: Christian Brauner <brauner@kernel.org>
> cc: Steve French <sfrench@samba.org>
> cc: linux-afs@lists.infradead.org
> cc: openafs-devel@openafs.org
> cc: linux-cifs@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> ---
> fs/afs/dir.c | 1 +
> fs/afs/file.c | 1 +
> fs/afs/internal.h | 2 ++
> fs/afs/security.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> fs/namei.c | 17 ++++++++++++-----
> include/linux/fs.h | 2 ++
> 6 files changed, 70 insertions(+), 5 deletions(-)
>
> diff --git a/fs/afs/dir.c b/fs/afs/dir.c
> index 9e7b1fe82c27..27e565612bde 100644
> --- a/fs/afs/dir.c
> +++ b/fs/afs/dir.c
> @@ -65,6 +65,7 @@ const struct inode_operations afs_dir_inode_operations = {
> .permission = afs_permission,
> .getattr = afs_getattr,
> .setattr = afs_setattr,
> + .may_create_in_sticky = afs_may_create_in_sticky,
> };
>
> const struct address_space_operations afs_dir_aops = {
> diff --git a/fs/afs/file.c b/fs/afs/file.c
> index fc15497608c6..dff48d0adec3 100644
> --- a/fs/afs/file.c
> +++ b/fs/afs/file.c
> @@ -47,6 +47,7 @@ const struct inode_operations afs_file_inode_operations = {
> .getattr = afs_getattr,
> .setattr = afs_setattr,
> .permission = afs_permission,
> + .may_create_in_sticky = afs_may_create_in_sticky,
> };
>
> const struct address_space_operations afs_file_aops = {
> diff --git a/fs/afs/internal.h b/fs/afs/internal.h
> index 440b0e731093..4a5bb01606a8 100644
> --- a/fs/afs/internal.h
> +++ b/fs/afs/internal.h
> @@ -1495,6 +1495,8 @@ extern struct key *afs_request_key(struct afs_cell *);
> extern struct key *afs_request_key_rcu(struct afs_cell *);
> extern int afs_check_permit(struct afs_vnode *, struct key *, afs_access_t *);
> extern int afs_permission(struct mnt_idmap *, struct inode *, int);
> +int afs_may_create_in_sticky(struct mnt_idmap *idmap, struct inode *inode,
> + struct path *path);
> extern void __exit afs_clean_up_permit_cache(void);
>
> /*
> diff --git a/fs/afs/security.c b/fs/afs/security.c
> index 6a7744c9e2a2..9fd6e4b5c228 100644
> --- a/fs/afs/security.c
> +++ b/fs/afs/security.c
> @@ -477,6 +477,58 @@ int afs_permission(struct mnt_idmap *idmap, struct inode *inode,
> return ret;
> }
>
> +/*
> + * Perform the ownership checks for a file in a sticky directory on AFS.
> + *
> + * In the case of AFS, this means that:
> + *
> + * (1) the file and the directory have the same AFS ownership or
> + *
> + * (2) the file is owned by the AFS user represented by the token (e.g. from a
> + * kerberos server) held in a key.
> + *
> + * Returns 0 if owned by me or has same owner as parent dir, 1 if not; can also
> + * return an error.
> + */
> +int afs_may_create_in_sticky(struct mnt_idmap *idmap, struct inode *inode,
> + struct path *path)
> +{
> + struct afs_vnode *dvnode, *vnode = AFS_FS_I(inode);
> + struct dentry *parent;
> + struct key *key;
> + afs_access_t access;
> + int ret;
> + s64 owner;
> +
> + key = afs_request_key(vnode->volume->cell);
> + if (IS_ERR(key))
> + return PTR_ERR(key);
> +
> + /* Get the owner's ID for the directory. Ideally, we'd use RCU to
> + * access the parent rather than getting a ref.
> + */
> + parent = dget_parent(path->dentry);
> + dvnode = AFS_FS_I(d_backing_inode(parent));
> + owner = dvnode->status.owner;
> + dput(parent);
> +
> + if (vnode->status.owner == owner) {
> + ret = 0;
> + goto error;
> + }
> +
> + /* Get the access rights for the key on this file. */
> + ret = afs_check_permit(vnode, key, &access);
> + if (ret < 0)
> + goto error;
> +
> + /* We get the ADMINISTER bit if we own the file. */
> + ret = (access & AFS_ACE_ADMINISTER) ? 1 : 0;
> +error:
> + key_put(key);
> + return ret;
> +}
> +
> void __exit afs_clean_up_permit_cache(void)
> {
> int i;
> diff --git a/fs/namei.c b/fs/namei.c
> index 84a0e0b0111c..e52c91cbed2a 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1316,13 +1316,20 @@ static int may_create_in_sticky(struct mnt_idmap *idmap, struct nameidata *nd,
> if (S_ISFIFO(inode->i_mode) && !sysctl_protected_fifos)
> return 0;
>
> - i_vfsuid = i_uid_into_vfsuid(idmap, inode);
> + if (unlikely(inode->i_op->may_create_in_sticky)) {
> + int ret = inode->i_op->may_create_in_sticky(idmap, inode, &nd->path);
This should probably use an IOP flag just like we do for permission
handling.
>
> - if (vfsuid_eq(i_vfsuid, dir_vfsuid))
> - return 0;
> + if (ret <= 0) /* 1 if not owned by me or by parent dir. */
> + return ret;
> + } else {
> + i_vfsuid = i_uid_into_vfsuid(idmap, inode);
>
> - if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
> - return 0;
> + if (vfsuid_eq(i_vfsuid, dir_vfsuid))
> + return 0;
> +
> + if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
> + return 0;
> + }
>
> if (likely(dir_mode & 0002)) {
> audit_log_path_denied(AUDIT_ANOM_CREAT, "sticky_create");
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 016b0fe1536e..11122e169719 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2236,6 +2236,8 @@ struct inode_operations {
> struct dentry *dentry, struct fileattr *fa);
> int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);
> struct offset_ctx *(*get_offset_ctx)(struct inode *inode);
> + int (*may_create_in_sticky)(struct mnt_idmap *idmap, struct inode *inode,
> + struct path *path);
> } ____cacheline_aligned;
>
> static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-05-12 13:02 ` [PATCH v2] " David Howells
2025-05-13 7:49 ` Christian Brauner
@ 2025-05-13 8:30 ` David Howells
2025-05-13 15:44 ` Jeffrey E Altman
1 sibling, 1 reply; 16+ messages in thread
From: David Howells @ 2025-05-13 8:30 UTC (permalink / raw)
To: Christian Brauner
Cc: dhowells, Alexander Viro, Etienne Champetier, Marc Dionne,
Jeffrey Altman, Chet Ramey, Steve French, linux-afs,
openafs-devel, linux-cifs, linux-fsdevel, linux-kernel
Christian Brauner <brauner@kernel.org> wrote:
> There's a few other places where we compare vfsuids:
>
> * may_delete()
> -> check_sticky()
> -> __check_sticky()
>
> * may_follow_link()
>
> * may_linkat()
>
> * fsuidgid_has_mapping()
>
> Anyone of those need special treatment on AFS as well?
That's a good question. I think it might be better to switch back to the v1
patch - which gives me two separate ops and provide a couple of vfs wrappers
for them and use them more widely.
So, perhaps:
vfs_have_same_owner(inode1, inode2)
which indicates if the two inodes have the same ownership and:
vfs_is_owned_by_me(inode)
which compares the inode's ownership to current_fsuid() by default.
The following places need to be considered for being changed:
(*) chown_ok()
(*) chgrp_ok()
Should call vfs_is_owned_by_me(). Possibly these need to defer all their
checks to the network filesystem as the interpretation of the target
UID/GID depends on the netfs.
(*) do_coredump()
Should probably call vfs_is_owned_by_me() to check that the file created
is owned by the caller - but the check that's there might be sufficient.
(*) inode_owner_or_capable()
Should call vfs_is_owned_by_me(). I'm not sure whether the namespace
mapping makes sense in such a case, but it probably could be used.
(*) vfs_setlease()
Should call vfs_is_owned_by_me(). Actually, it should query if leasing
is permitted.
Also, setting locks could perhaps do with a permission call to the
filesystem driver as AFS, for example, has a lock permission bit in the
ACL, but since the AFS server checks that when the RPC call is made, it's
probably unnecessary.
(*) acl_permission_check()
(*) posix_acl_permission()
UIDs are part of these ACLs, so no change required. AFS implements its
own ACLs and evaluates them in ->permission() and on the server.
(*) may_follow_link()
Should call vfs_is_owned_by_me() and also vfs_have_same_owner() on the
the link and its parent dir. The latter only applies on world-writable
sticky dirs.
(*) may_create_in_sticky()
The initial subject of this patch. Should call vfs_is_owned_by_me() and
also vfs_have_same_owner() both.
(*) __check_sticky()
Should call vfs_is_owned_by_me() on both the dir and the inode.
(*) may_dedupe_file()
Should call vfs_is_owned_by_me().
(*) IMA policy ops.
No idea.
David
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-05-13 8:30 ` David Howells
@ 2025-05-13 15:44 ` Jeffrey E Altman
0 siblings, 0 replies; 16+ messages in thread
From: Jeffrey E Altman @ 2025-05-13 15:44 UTC (permalink / raw)
To: David Howells, Christian Brauner
Cc: Alexander Viro, Etienne Champetier, Marc Dionne, Chet Ramey,
Steve French, linux-afs, openafs-devel, linux-cifs, linux-fsdevel,
linux-kernel
[-- Attachment #1: Type: text/plain, Size: 5222 bytes --]
I performed a review of the usage of vfsuid_eq_kuid() and vfsuid_eq().
I mostly agree with David's conclusions and add some additional insight
into the behavior of AFS servers.
On 5/13/2025 4:30 AM, David Howells wrote:
> Christian Brauner <brauner@kernel.org> wrote:
>
>> There's a few other places where we compare vfsuids:
>>
>> * may_delete()
>> -> check_sticky()
>> -> __check_sticky()
>>
>> * may_follow_link()
>>
>> * may_linkat()
>>
>> * fsuidgid_has_mapping()
>>
>> Anyone of those need special treatment on AFS as well?
> That's a good question. I think it might be better to switch back to the v1
> patch - which gives me two separate ops and provide a couple of vfs wrappers
> for them and use them more widely.
>
> So, perhaps:
>
> vfs_have_same_owner(inode1, inode2)
>
> which indicates if the two inodes have the same ownership and:
>
> vfs_is_owned_by_me(inode)
>
> which compares the inode's ownership to current_fsuid() by default.
The use of two distinct inode operations make the most sense to me. An
alternative is to provide one inode operation which sets two boolean
output parameters:
int (*check_ownership)(struct inode *const inode, struct inode *const
parent,int *is_owned_by_me, int *is_owned_by_parent); where
'is_owned_by_me' or 'is_owned_by_parent' might be NULL if the answer is
not required. However, I prefer David's suggestion.
> The following places need to be considered for being changed:
>
> (*) chown_ok()
> (*) chgrp_ok()
>
> Should call vfs_is_owned_by_me(). Possibly these need to defer all their
> checks to the network filesystem as the interpretation of the target
> UID/GID depends on the netfs.
Since the late 1980s, afs servers do not permit changes to owner or
group on files unless the caller is a member of the
system:administrators group. The file system clients cannot make this
determination themselves. If Linux wishes to further restrict the
operation to current owner, then use of a vfs_is_owned_by_me() like
inode operation should be used.
Something to consider for future AFS3 or YFS protocol changes is to
report the right to chown|chgrp to the client as part of a the
FetchStatus result set.
> (*) do_coredump()
>
> Should probably call vfs_is_owned_by_me() to check that the file created
> is owned by the caller - but the check that's there might be sufficient.
I agree.
> (*) inode_owner_or_capable()
>
> Should call vfs_is_owned_by_me().
I agree.
> I'm not sure whether the namespace
> mapping makes sense in such a case, but it probably could be used.
>
> (*) vfs_setlease()
>
> Should call vfs_is_owned_by_me(). Actually, it should query if leasing
> is permitted.
>
> Also, setting locks could perhaps do with a permission call to the
> filesystem driver as AFS, for example, has a lock permission bit in the
> ACL, but since the AFS server checks that when the RPC call is made, it's
> probably unnecessary.
The AFS server will grant locks based upon the following rules:
* the caller is granted the PRSFS_LOCK right (Shared lock only)
* the caller is granted the PRSFS_WRITE right (Shared or Exclusive lock)
* the caller is the file owner and is granted the PRSFS_INSERT right
(Shared or Exclusive lock)
The client has enough information to implement a lock permission check
if there was such an inode operation.
> (*) acl_permission_check()
> (*) posix_acl_permission()
>
> UIDs are part of these ACLs, so no change required. AFS implements its
> own ACLs and evaluates them in ->permission() and on the server.
acl_permission_check() and posix_acl_permission() will not be called for
AFS. However, it it probably worth adding the vfs_is_owned_by_me() to
acl_permission_check() in case there is another network filesystem which
requires non-uid ownership checks and wants to use generic_permission().
> (*) may_follow_link()
>
> Should call vfs_is_owned_by_me() and also vfs_have_same_owner() on the
> the link and its parent dir. The latter only applies on world-writable
> sticky dirs.
I agree
> (*) may_create_in_sticky()
>
> The initial subject of this patch. Should call vfs_is_owned_by_me() and
> also vfs_have_same_owner() both.
I agree.
> (*) __check_sticky()
>
> Should call vfs_is_owned_by_me() on both the dir and the inode.
I agree.
> (*) may_dedupe_file()
>
> Should call vfs_is_owned_by_me().
I agree.
>
> (*) IMA policy ops.
>
> No idea.
I am not familiar with the Integrity Measurement Operations. However,
looking at the usage of the ima_rule_entry fowner_op and fgroup_op
operations, I do not believe the proposed vfs_is_owned_by_me() could be
used to implement fowner_op. If IMA should work filesystems which
cannot rely upon local uid comparisons for owner and group, then I think
the IMA fowner_op and fgroup_op would require an alternative
implementation. At the moment, IMA is unlikely to work properly with AFS.
> David
Jeffrey Altman
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4276 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-05-05 13:14 ` Christian Brauner
[not found] ` <CAOdf3grbDQ-Fh2bV7XfoYvVBhgBAh7-hZyyxTNt1RfGekrA-nA@mail.gmail.com>
@ 2025-05-14 12:49 ` Chet Ramey
1 sibling, 0 replies; 16+ messages in thread
From: Chet Ramey @ 2025-05-14 12:49 UTC (permalink / raw)
To: Christian Brauner, David Howells
Cc: chet.ramey, Alexander Viro, Etienne Champetier, Marc Dionne,
Jeffrey Altman, Steve French, linux-afs, openafs-devel,
linux-cifs, linux-fsdevel, linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 1043 bytes --]
On 5/5/25 9:14 AM, Christian Brauner wrote:
>> This works around the kernel not being able to validly check the
>> current_fsuid() against i_uid on the file or the directory because the
>> uidspaces of the system and of AFS may well be disjoint. The problem lies
>> with the uid checks in may_create_in_sticky().
>>
>> However, the bash work around is going to be removed:
>
> Why is it removed? That's a very strange comment:
I think this question has been adequately answered.
> So then just don't remove it. I don't see a reason for us to workaround
> userspace creating a bug for itself and forcing us to add two new inode
> operations to work around it.
I think this shows that userspace applications should be very cautious
about putting in workarounds for kernel bugs, and making them as limited
in scope as possible.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 203 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir
2025-05-05 14:42 ` Jeffrey E Altman
@ 2025-05-14 12:50 ` Chet Ramey
0 siblings, 0 replies; 16+ messages in thread
From: Chet Ramey @ 2025-05-14 12:50 UTC (permalink / raw)
To: Jeffrey E Altman, Etienne Champetier, Christian Brauner
Cc: chet.ramey, David Howells, Alexander Viro, Marc Dionne,
Steve French, linux-fsdevel, Linux AFS mailing list, linux-kernel,
linux-cifs, openafs-devel@openafs.org
[-- Attachment #1.1: Type: text/plain, Size: 702 bytes --]
On 5/5/25 10:42 AM, Jeffrey E Altman wrote:
>>> So then just don't remove it. I don't see a reason for us to workaround
>>> userspace creating a bug for itself and forcing us to add two new inode
>>> operations to work around it.
>> This bash workaround introduced ages ago for AFS bypass fs.protected_regular
>
> Chet, I don't think this history is correct.
I think Etienne's terse comment accurately summarizes the current problem
(and maybe it would read more clearly if he had used `bypasses').
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 203 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2025-05-14 12:56 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-29 16:37 [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir David Howells
2025-04-29 17:35 ` Jeffrey E Altman
2025-04-30 15:09 ` Chet Ramey
2025-04-30 16:14 ` David Howells
2025-04-30 17:26 ` Jeffrey E Altman
2025-04-30 18:36 ` Chet Ramey
2025-05-05 13:14 ` Christian Brauner
[not found] ` <CAOdf3grbDQ-Fh2bV7XfoYvVBhgBAh7-hZyyxTNt1RfGekrA-nA@mail.gmail.com>
2025-05-05 14:42 ` Jeffrey E Altman
2025-05-14 12:50 ` Chet Ramey
2025-05-14 12:49 ` Chet Ramey
2025-05-06 10:26 ` David Howells
2025-05-09 10:33 ` Christian Brauner
2025-05-12 13:02 ` [PATCH v2] " David Howells
2025-05-13 7:49 ` Christian Brauner
2025-05-13 8:30 ` David Howells
2025-05-13 15:44 ` Jeffrey E Altman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).