* [PATCH 2/2] smb/client: fix security flag calculation when setting security descriptors
From: Steve French @ 2026-06-24 15:51 UTC (permalink / raw)
To: linux-cifs; +Cc: Ralph Boehme, Steve French
In-Reply-To: <20260624155124.4121786-1-stfrench@microsoft.com>
From: Ralph Boehme <slow@samba.org>
In id_mode_to_cifs_acl(), aclflag was initialized to CIFS_ACL_DACL by default.
This forced the client to request setting the DACL even when only an ownership
(chown) or group (chgrp) change was being performed.
Let build_sec_desc() do the proper flag calculation by initializing aclflag
to 0. build_sec_desc() sets the appropriate bits (CIFS_ACL_OWNER, CIFS_ACL_GROUP,
or CIFS_ACL_DACL) depending on what actually changed. During ownership transfer,
CIFS_ACL_DACL is only set if replace_sids_and_copy_aces() actually replaces the
SIDs inside any of the DACL's ACEs.
If build_sec_desc() results in aclflag being 0 (meaning no changes were mapped),
exit early to avoid sending an empty security descriptor update to the server.
Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
---
fs/smb/client/cifsacl.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c
index 5bbf73736358..07cf0e578233 100644
--- a/fs/smb/client/cifsacl.c
+++ b/fs/smb/client/cifsacl.c
@@ -1185,7 +1185,8 @@ static void populate_new_aces(char *nacl_base,
static __u16 replace_sids_and_copy_aces(struct smb_acl *pdacl, struct smb_acl *pndacl,
struct smb_sid *pownersid, struct smb_sid *pgrpsid,
- struct smb_sid *pnownersid, struct smb_sid *pngrpsid)
+ struct smb_sid *pnownersid, struct smb_sid *pngrpsid,
+ int *aclflag)
{
int i;
u16 size = 0;
@@ -1209,12 +1210,15 @@ static __u16 replace_sids_and_copy_aces(struct smb_acl *pdacl, struct smb_acl *p
pntace = (struct smb_ace *) (acl_base + size);
pnntace = (struct smb_ace *) (nacl_base + nsize);
- if (pnownersid && compare_sids(&pntace->sid, pownersid) == 0)
+ if (pnownersid && compare_sids(&pntace->sid, pownersid) == 0) {
ace_size = cifs_copy_ace(pnntace, pntace, pnownersid);
- else if (pngrpsid && compare_sids(&pntace->sid, pgrpsid) == 0)
+ *aclflag |= CIFS_ACL_DACL;
+ } else if (pngrpsid && compare_sids(&pntace->sid, pgrpsid) == 0) {
ace_size = cifs_copy_ace(pnntace, pntace, pngrpsid);
- else
+ *aclflag |= CIFS_ACL_DACL;
+ } else {
ace_size = cifs_copy_ace(pnntace, pntace, NULL);
+ }
size += le16_to_cpu(pntace->size);
nsize += ace_size;
@@ -1521,7 +1525,8 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd,
/* Replace ACEs for old owner with new one */
size = replace_sids_and_copy_aces(dacl_ptr, ndacl_ptr,
owner_sid_ptr, group_sid_ptr,
- nowner_sid_ptr, ngroup_sid_ptr);
+ nowner_sid_ptr, ngroup_sid_ptr,
+ aclflag);
ndacl_ptr->size = cpu_to_le16(size);
}
@@ -1738,7 +1743,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode,
kuid_t uid, kgid_t gid)
{
int rc = 0;
- int aclflag = CIFS_ACL_DACL; /* default flag to set */
+ int aclflag = 0;
__u32 secdesclen = 0;
__u32 nsecdesclen = 0;
__u32 dacloffset = 0;
@@ -1837,6 +1842,11 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode,
if (rc != 0)
goto id_mode_to_cifs_acl_exit;
+ if (aclflag == 0) {
+ cifs_dbg(FYI, "set_cifs_acl aclflag=0, no change mapped\n");
+ goto id_mode_to_cifs_acl_exit;
+ }
+
if (ops->set_acl == NULL) {
rc = -EOPNOTSUPP;
goto id_mode_to_cifs_acl_exit;
--
2.53.0
^ permalink raw reply related
* [PATCH 1/2] smb: client: refactor ACL setting control flow in id_mode_to_cifs_acl()
From: Steve French @ 2026-06-24 15:51 UTC (permalink / raw)
To: linux-cifs; +Cc: Ralph Boehme, Steve French
From: Ralph Boehme <slow@samba.org>
Refactor the control flow in id_mode_to_cifs_acl() to reduce nesting and
prevent error code overwriting.
Instead of wrapping the call to ops->set_acl() in a conditional block,
introduce early exits (goto id_mode_to_cifs_acl_exit) when build_sec_desc()
fails or ops->set_acl is NULL. This ensures that any actual error returned
by build_sec_desc() is not overwritten with -EOPNOTSUPP.
Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
---
fs/smb/client/cifsacl.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c
index 42a3115359da..5bbf73736358 100644
--- a/fs/smb/client/cifsacl.c
+++ b/fs/smb/client/cifsacl.c
@@ -1834,14 +1834,18 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode,
cifs_dbg(NOISY, "build_sec_desc rc: %d\n", rc);
- if (ops->set_acl == NULL)
- rc = -EOPNOTSUPP;
+ if (rc != 0)
+ goto id_mode_to_cifs_acl_exit;
- if (!rc) {
- /* Set the security descriptor */
- rc = ops->set_acl(pnntsd, nsecdesclen, inode, path, aclflag);
- cifs_dbg(NOISY, "set_cifs_acl rc: %d\n", rc);
+ if (ops->set_acl == NULL) {
+ rc = -EOPNOTSUPP;
+ goto id_mode_to_cifs_acl_exit;
}
+
+ /* Set the security descriptor */
+ rc = ops->set_acl(pnntsd, nsecdesclen, inode, path, aclflag);
+ cifs_dbg(NOISY, "set_cifs_acl rc: %d\n", rc);
+
id_mode_to_cifs_acl_exit:
cifs_put_tlink(tlink);
--
2.53.0
^ permalink raw reply related
* [PATCH] smb/client: fix security flag calculation when setting security descriptors
From: Steve French @ 2026-06-24 15:40 UTC (permalink / raw)
To: linux-cifs; +Cc: Ralph Boehme
From: Ralph Boehme <slow@samba.org>
In id_mode_to_cifs_acl(), aclflag was initialized to CIFS_ACL_DACL by default.
This forced the client to request setting the DACL even when only an ownership
(chown) or group (chgrp) change was being performed.
Let build_sec_desc() do the proper flag calculation by initializing aclflag
to 0. build_sec_desc() sets the appropriate bits (CIFS_ACL_OWNER, CIFS_ACL_GROUP,
or CIFS_ACL_DACL) depending on what actually changed. During ownership transfer,
CIFS_ACL_DACL is only set if replace_sids_and_copy_aces() actually replaces the
SIDs inside any of the DACL's ACEs.
If build_sec_desc() results in aclflag being 0 (meaning no changes were mapped),
exit early to avoid sending an empty security descriptor update to the server.
Signed-off-by: Ralph Boehme <slow@samba.org>
---
fs/smb/client/cifsacl.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/fs/smb/client/cifsacl.c b/fs/smb/client/cifsacl.c
index d90b3ac45ce4..0856c7da2054 100644
--- a/fs/smb/client/cifsacl.c
+++ b/fs/smb/client/cifsacl.c
@@ -1112,7 +1112,8 @@ static void populate_new_aces(char *nacl_base,
static __u16 replace_sids_and_copy_aces(struct smb_acl *pdacl, struct smb_acl *pndacl,
struct smb_sid *pownersid, struct smb_sid *pgrpsid,
- struct smb_sid *pnownersid, struct smb_sid *pngrpsid)
+ struct smb_sid *pnownersid, struct smb_sid *pngrpsid,
+ int *aclflag)
{
int i;
u16 size = 0;
@@ -1136,12 +1137,15 @@ static __u16 replace_sids_and_copy_aces(struct smb_acl *pdacl, struct smb_acl *p
pntace = (struct smb_ace *) (acl_base + size);
pnntace = (struct smb_ace *) (nacl_base + nsize);
- if (pnownersid && compare_sids(&pntace->sid, pownersid) == 0)
+ if (pnownersid && compare_sids(&pntace->sid, pownersid) == 0) {
ace_size = cifs_copy_ace(pnntace, pntace, pnownersid);
- else if (pngrpsid && compare_sids(&pntace->sid, pgrpsid) == 0)
+ *aclflag |= CIFS_ACL_DACL;
+ } else if (pngrpsid && compare_sids(&pntace->sid, pgrpsid) == 0) {
ace_size = cifs_copy_ace(pnntace, pntace, pngrpsid);
- else
+ *aclflag |= CIFS_ACL_DACL;
+ } else {
ace_size = cifs_copy_ace(pnntace, pntace, NULL);
+ }
size += le16_to_cpu(pntace->size);
nsize += ace_size;
@@ -1461,7 +1465,8 @@ static int build_sec_desc(struct smb_ntsd *pntsd, struct smb_ntsd *pnntsd,
/* Replace ACEs for old owner with new one */
size = replace_sids_and_copy_aces(dacl_ptr, ndacl_ptr,
owner_sid_ptr, group_sid_ptr,
- nowner_sid_ptr, ngroup_sid_ptr);
+ nowner_sid_ptr, ngroup_sid_ptr,
+ aclflag);
ndacl_ptr->size = cpu_to_le16(size);
}
@@ -1676,7 +1681,7 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode,
kuid_t uid, kgid_t gid)
{
int rc = 0;
- int aclflag = CIFS_ACL_DACL; /* default flag to set */
+ int aclflag = 0;
__u32 secdesclen = 0;
__u32 nsecdesclen = 0;
__u32 dacloffset = 0;
@@ -1775,6 +1780,11 @@ id_mode_to_cifs_acl(struct inode *inode, const char *path, __u64 *pnmode,
if (rc != 0)
goto id_mode_to_cifs_acl_exit;
+ if (aclflag == 0) {
+ cifs_dbg(FYI, "set_cifs_acl aclflag=0, no change mapped\n");
+ goto id_mode_to_cifs_acl_exit;
+ }
+
if (ops->set_acl == NULL) {
rc = -EOPNOTSUPP;
goto id_mode_to_cifs_acl_exit;
--
2.54.0
^ permalink raw reply related
* Re: [PATCH v2 14/14] netfs: Fix DIO write retry for filesystems without a ->prepare_write()
From: ChenXiaoSong @ 2026-06-24 14:21 UTC (permalink / raw)
To: David Howells, Christian Brauner
Cc: Paulo Alcantara, netfs, linux-afs, linux-cifs, ceph-devel,
linux-fsdevel, linux-kernel, syzbot+3c74b1f0c372e98efc32, hongao,
Steve French, Namjae Jeon
In-Reply-To: <20260624115737.2964520-15-dhowells@redhat.com>
Looks good. Feel free to add:
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Tested-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
在 2026/6/24 19:57, David Howells 写道:
> Fix netfs_unbuffered_write() so that it doesn't re-issue a write twice when
> the filesystem doesn't have a ->prepare_write(). The bit of code that does
> the first issue should just be removed as everything it does is done again
> when the loop it's in goes back to the top.
>
> Reported-by: syzbot+3c74b1f0c372e98efc32@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=3c74b1f0c372e98efc32
> Tested-by: syzbot+3c74b1f0c372e98efc32@syzkaller.appspotmail.com
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Paulo Alcantara <pc@manguebit.org>
> cc: hongao <hongao@uniontech.com>
> cc: ChenXiaoSong <chenxiaosong@chenxiaosong.com>
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---
> fs/netfs/direct_write.c | 6 ------
> 1 file changed, 6 deletions(-)
>
> diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
> index 25f8ceb15fad..1ff1789016b8 100644
> --- a/fs/netfs/direct_write.c
> +++ b/fs/netfs/direct_write.c
> @@ -190,12 +190,6 @@ static int netfs_unbuffered_write(struct netfs_io_request *wreq)
> stream->prepare_write(subreq);
> __set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
> netfs_stat(&netfs_n_wh_retry_write_subreq);
> - } else {
> - struct iov_iter source;
> -
> - netfs_reset_iter(subreq);
> - source = subreq->io_iter;
> - netfs_reissue_write(stream, subreq, &source);
> }
> }
>
>
--
ChenXiaoSong <chenxiaosong@chenxiaosong.com>
Chinese Homepage: https://chenxiaosong.com
English Homepage: https://chenxiaosong.com/en
^ permalink raw reply
* Re: [PATCH v2 04/14] iov_iter: Fix potential underflow in iov_iter_extract_xarray_pages()
From: Christoph Hellwig @ 2026-06-24 13:41 UTC (permalink / raw)
To: David Howells
Cc: Christian Brauner, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Matthew Wilcox,
Christoph Hellwig, Jens Axboe, Mike Marshall
In-Reply-To: <20260624115737.2964520-5-dhowells@redhat.com>
FYI, I'm getting 5 patches out of apparently 14 patch series and no
cover letter the second time now. No idea who I'm supposed to review
this.
^ permalink raw reply
* [PATCH v2 14/14] netfs: Fix DIO write retry for filesystems without a ->prepare_write()
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel,
syzbot+3c74b1f0c372e98efc32, hongao, ChenXiaoSong
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix netfs_unbuffered_write() so that it doesn't re-issue a write twice when
the filesystem doesn't have a ->prepare_write(). The bit of code that does
the first issue should just be removed as everything it does is done again
when the loop it's in goes back to the top.
Reported-by: syzbot+3c74b1f0c372e98efc32@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=3c74b1f0c372e98efc32
Tested-by: syzbot+3c74b1f0c372e98efc32@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: hongao <hongao@uniontech.com>
cc: ChenXiaoSong <chenxiaosong@chenxiaosong.com>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
fs/netfs/direct_write.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index 25f8ceb15fad..1ff1789016b8 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -190,12 +190,6 @@ static int netfs_unbuffered_write(struct netfs_io_request *wreq)
stream->prepare_write(subreq);
__set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
netfs_stat(&netfs_n_wh_retry_write_subreq);
- } else {
- struct iov_iter source;
-
- netfs_reset_iter(subreq);
- source = subreq->io_iter;
- netfs_reissue_write(stream, subreq, &source);
}
}
^ permalink raw reply related
* [PATCH v2 13/14] netfs: Fix folio state after ENOMEM whilst under writeback iteration
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Matthew Wilcox
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix the state of the current folio when ENOMEM occurs during writeback
iteration. The folio needs to be redirtied and unlocked before the
terminal writeback_iter() is invoked.
Fixes: 06fa229ceb36 ("netfs: Abstract out a rolling folio buffer implementation")
Link: https://sashiko.dev/#/patchset/20260619140646.2633762-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
fs/netfs/write_issue.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index d1f98cca5d0f..6eb6cff12c17 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -334,8 +334,11 @@ static int netfs_write_folio(struct netfs_io_request *wreq,
_enter("");
- if (rolling_buffer_make_space(&wreq->buffer) < 0)
+ if (rolling_buffer_make_space(&wreq->buffer) < 0) {
+ folio_redirty_for_writepage(wbc, folio);
+ folio_unlock(folio);
return -ENOMEM;
+ }
/* netfs_perform_write() may shift i_size around the page or from out
* of the page to beyond it, but cannot move i_size into or through the
^ permalink raw reply related
* [PATCH v2 12/14] netfs: Fix writeback error handling
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Matthew Wilcox
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix the error handling in writeback_iter() loop. If an error occurs,
writeback_iter() needs to be called again with *error set to the error so
that it can clean up iteration state. Further, the current folio needs
unlocking and redirtying.
Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Link: https://sashiko.dev/#/patchset/20260619140646.2633762-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
fs/netfs/write_issue.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index e5d4f9ba3d6b..d1f98cca5d0f 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -582,8 +582,6 @@ int netfs_writepages(struct address_space *mapping,
}
error = netfs_write_folio(wreq, wbc, folio);
- if (error < 0)
- break;
} while ((folio = writeback_iter(mapping, wbc, folio, &error)));
netfs_end_issue_write(wreq);
@@ -594,6 +592,9 @@ int netfs_writepages(struct address_space *mapping,
return error;
couldnt_start:
+ folio_redirty_for_writepage(wbc, folio);
+ folio_unlock(folio);
+ writeback_iter(mapping, wbc, folio, &error);
netfs_kill_dirty_pages(mapping, wbc, folio);
out:
netfs_wb_end(ictx);
^ permalink raw reply related
* [PATCH v2 11/14] netfs: Fix writethrough to use collection offload
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix writethrough write to set NETFS_RREQ_OFFLOAD_COLLECTION on the request
so that collection is processed asynchronously rather than only right at
the end - and also so that asynchronous O_SYNC writes get collected at all.
Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Closes: https://sashiko.dev/#/patchset/20260616100821.2062304-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
fs/netfs/write_issue.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 437bb50ce175..e5d4f9ba3d6b 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -620,6 +620,7 @@ struct netfs_io_request *netfs_begin_writethrough(struct kiocb *iocb, size_t len
}
wreq->io_streams[0].avail = true;
+ __set_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &wreq->flags);
trace_netfs_write(wreq, netfs_write_trace_writethrough);
return wreq;
}
^ permalink raw reply related
* [PATCH v2 10/14] netfs: Replace wb_lock with a bit lock for asynchronicity
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
The netfs_inode::wb_lock mutex is used to prevent multiple simultaneous
writebacks from fighting each other (a writeback thread will write multiple
discontiguous regions within the same request). The mutex, however, only
serialises the issuing of subrequests; it doesn't serialise the collection
of results, and, in particular, the updating of file size information and
fscache populatedness data.
Unfortunately, the mutex cannot be held around the entire process as it has
to be unlocked in the same thread in which it is locked - and we don't want
to hold up the allocator whilst we complete the writeback.
Fix this by replacing the mutex with a bit flag and a list of lock waiters
so that the lock can be dropped in the collector thread after collection is
complete.
Link: https://sashiko.dev/#/patchset/20260608145432.681865-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
fs/afs/symlink.c | 4 +-
fs/netfs/locking.c | 95 ++++++++++++++++++++++++++++++++++++++++
fs/netfs/write_collect.c | 10 +++++
fs/netfs/write_issue.c | 37 +++++-----------
include/linux/netfs.h | 11 ++++-
5 files changed, 126 insertions(+), 31 deletions(-)
diff --git a/fs/afs/symlink.c b/fs/afs/symlink.c
index ed5868369f37..16b4823cb7b7 100644
--- a/fs/afs/symlink.c
+++ b/fs/afs/symlink.c
@@ -255,11 +255,11 @@ int afs_symlink_writepages(struct address_space *mapping,
}
if (ret == 0) {
- mutex_lock(&vnode->netfs.wb_lock);
+ netfs_wb_begin(&vnode->netfs, false);
netfs_free_folioq_buffer(vnode->directory);
vnode->directory = NULL;
vnode->directory_size = 0;
- mutex_unlock(&vnode->netfs.wb_lock);
+ netfs_wb_end(&vnode->netfs);
} else if (ret == 1) {
ret = 0; /* Skipped write due to lock conflict. */
}
diff --git a/fs/netfs/locking.c b/fs/netfs/locking.c
index 2249ecd09d0a..4e3be2b81504 100644
--- a/fs/netfs/locking.c
+++ b/fs/netfs/locking.c
@@ -9,6 +9,11 @@
#include <linux/netfs.h>
#include "internal.h"
+struct netfs_wb_waiter {
+ struct list_head link; /* Link in ictx->wb_queue */
+ struct task_struct *waiter; /* Waiter task; cleared when lock granted */
+};
+
/*
* inode_dio_wait_interruptible - wait for outstanding DIO requests to finish
* @inode: inode to wait for
@@ -203,3 +208,93 @@ void netfs_end_io_direct(struct inode *inode)
up_read(&inode->i_rwsem);
}
EXPORT_SYMBOL(netfs_end_io_direct);
+
+/*
+ * Wait to have exclusive access to writeback.
+ */
+static bool netfs_wb_begin_wait(struct netfs_inode *ictx)
+{
+ struct netfs_wb_waiter waiter = {};
+ struct task_struct *tsk = current;
+ bool got = false;
+
+ spin_lock(&ictx->lock);
+
+ if (test_and_set_bit_lock(NETFS_ICTX_WB_LOCK, &ictx->flags)) {
+ get_task_struct(tsk);
+ waiter.waiter = tsk;
+ list_add_tail(&waiter.link, &ictx->wb_queue);
+ } else {
+ got = true;
+ }
+ spin_unlock(&ictx->lock);
+
+ if (!got) {
+ for (;;) {
+ set_current_state(TASK_UNINTERRUPTIBLE);
+ /* Read waiter before accessing inode state. */
+ if (smp_load_acquire(&waiter.waiter) == NULL)
+ break;
+ schedule();
+ }
+ }
+ __set_current_state(TASK_RUNNING);
+ return true;
+}
+
+/**
+ * netfs_wb_begin - Begin writeback, waiting if need be
+ * @ictx: The inode to get writeback access on
+ * @nowait: Return failure immediately rather than waiting if true
+ *
+ * Begin writeback to an inode, waiting for exclusive access if @nowait is
+ * false. This prevents collection from being done out of order with respect
+ * to the issuance of write subrequests.
+ *
+ * Note that writeback may be ended in a different process (e.g. the collection
+ * function on a workqueue) than started it.
+ *
+ * Return: True if can proceed, false if denied.
+ */
+bool netfs_wb_begin(struct netfs_inode *ictx, bool nowait)
+{
+ if (!test_and_set_bit_lock(NETFS_ICTX_WB_LOCK, &ictx->flags))
+ return true;
+ if (nowait) {
+ netfs_stat(&netfs_n_wb_lock_skip);
+ return false;
+ }
+ netfs_stat(&netfs_n_wb_lock_wait);
+ return netfs_wb_begin_wait(ictx);
+}
+EXPORT_SYMBOL(netfs_wb_begin);
+
+/* netfs_wb_end - End writeback
+ * @ictx: The inode we have writeback access to
+ *
+ * End writeback access on an inode, waking up the next writeback request.
+ */
+void netfs_wb_end(struct netfs_inode *ictx)
+{
+ struct netfs_wb_waiter *waiter;
+ struct task_struct *tsk;
+
+ WARN_ON_ONCE(!test_bit(NETFS_ICTX_WB_LOCK, &ictx->flags));
+
+ spin_lock(&ictx->lock);
+
+ waiter = list_first_entry_or_null(&ictx->wb_queue, struct netfs_wb_waiter, link);
+ if (waiter) {
+ list_del(&waiter->link);
+ tsk = waiter->waiter;
+ /* Write inode state before clearing waiter. */
+ smp_store_release(&waiter->waiter, NULL);
+ wake_up_process(tsk);
+ put_task_struct(tsk);
+ } else {
+ clear_bit_unlock(NETFS_ICTX_WB_LOCK, &ictx->flags);
+ }
+
+ spin_unlock(&ictx->lock);
+}
+EXPORT_SYMBOL(netfs_wb_end);
diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c
index 24fc2bb2f8a4..210eb8f3958d 100644
--- a/fs/netfs/write_collect.c
+++ b/fs/netfs/write_collect.c
@@ -408,6 +408,16 @@ bool netfs_write_collection(struct netfs_io_request *wreq)
netfs_wake_rreq_flag(wreq, NETFS_RREQ_IN_PROGRESS, netfs_rreq_trace_wake_ip);
/* As we cleared NETFS_RREQ_IN_PROGRESS, we acquired its ref. */
+ switch (wreq->origin) {
+ case NETFS_WRITEBACK:
+ case NETFS_WRITEBACK_SINGLE:
+ case NETFS_WRITETHROUGH:
+ netfs_wb_end(ictx);
+ break;
+ default:
+ break;
+ }
+
if (wreq->iocb) {
size_t written = min(wreq->transferred, wreq->len);
wreq->iocb->ki_pos += written;
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index c03c7cc45e47..437bb50ce175 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -551,14 +551,8 @@ int netfs_writepages(struct address_space *mapping,
struct folio *folio;
int error = 0;
- if (!mutex_trylock(&ictx->wb_lock)) {
- if (wbc->sync_mode == WB_SYNC_NONE) {
- netfs_stat(&netfs_n_wb_lock_skip);
- return 0;
- }
- netfs_stat(&netfs_n_wb_lock_wait);
- mutex_lock(&ictx->wb_lock);
- }
+ if (!netfs_wb_begin(ictx, wbc->sync_mode == WB_SYNC_NONE))
+ return 0;
/* Need the first folio to be able to set up the op. */
folio = writeback_iter(mapping, wbc, NULL, &error);
@@ -593,8 +587,6 @@ int netfs_writepages(struct address_space *mapping,
} while ((folio = writeback_iter(mapping, wbc, folio, &error)));
netfs_end_issue_write(wreq);
-
- mutex_unlock(&ictx->wb_lock);
netfs_wake_collector(wreq);
netfs_put_request(wreq, netfs_rreq_trace_put_return);
@@ -604,7 +596,7 @@ int netfs_writepages(struct address_space *mapping,
couldnt_start:
netfs_kill_dirty_pages(mapping, wbc, folio);
out:
- mutex_unlock(&ictx->wb_lock);
+ netfs_wb_end(ictx);
_leave(" = %d", error);
return error;
}
@@ -618,12 +610,12 @@ struct netfs_io_request *netfs_begin_writethrough(struct kiocb *iocb, size_t len
struct netfs_io_request *wreq = NULL;
struct netfs_inode *ictx = netfs_inode(file_inode(iocb->ki_filp));
- mutex_lock(&ictx->wb_lock);
+ netfs_wb_begin(ictx, false);
wreq = netfs_create_write_req(iocb->ki_filp->f_mapping, iocb->ki_filp,
iocb->ki_pos, NETFS_WRITETHROUGH);
if (IS_ERR(wreq)) {
- mutex_unlock(&ictx->wb_lock);
+ netfs_wb_end(ictx);
return wreq;
}
@@ -685,7 +677,6 @@ int netfs_advance_writethrough(struct netfs_io_request *wreq, struct writeback_c
ssize_t netfs_end_writethrough(struct netfs_io_request *wreq, struct writeback_control *wbc,
struct folio *writethrough_cache)
{
- struct netfs_inode *ictx = netfs_inode(wreq->inode);
ssize_t ret;
_enter("R=%x", wreq->debug_id);
@@ -699,8 +690,6 @@ ssize_t netfs_end_writethrough(struct netfs_io_request *wreq, struct writeback_c
netfs_end_issue_write(wreq);
- mutex_unlock(&ictx->wb_lock);
-
if (wreq->iocb)
ret = -EIOCBQUEUED;
else
@@ -847,15 +836,10 @@ int netfs_writeback_single(struct address_space *mapping,
if (WARN_ON_ONCE(!iov_iter_is_folioq(iter)))
return -EIO;
- if (!mutex_trylock(&ictx->wb_lock)) {
- if (wbc->sync_mode == WB_SYNC_NONE) {
- /* The VFS will have undirtied the inode. */
- netfs_single_mark_inode_dirty(&ictx->inode);
- netfs_stat(&netfs_n_wb_lock_skip);
- return 1;
- }
- netfs_stat(&netfs_n_wb_lock_wait);
- mutex_lock(&ictx->wb_lock);
+ if (!netfs_wb_begin(ictx, wbc->sync_mode == WB_SYNC_NONE)) {
+ /* The VFS will have undirtied the inode. */
+ netfs_single_mark_inode_dirty(&ictx->inode);
+ return 1;
}
wreq = netfs_create_write_req(mapping, NULL, 0, NETFS_WRITEBACK_SINGLE);
@@ -893,7 +877,6 @@ int netfs_writeback_single(struct address_space *mapping,
smp_wmb(); /* Write lists before ALL_QUEUED. */
set_bit(NETFS_RREQ_ALL_QUEUED, &wreq->flags);
- mutex_unlock(&ictx->wb_lock);
netfs_wake_collector(wreq);
netfs_put_request(wreq, netfs_rreq_trace_put_return);
@@ -901,7 +884,7 @@ int netfs_writeback_single(struct address_space *mapping,
return ret;
couldnt_start:
- mutex_unlock(&ictx->wb_lock);
+ netfs_wb_end(ictx);
_leave(" = %d", ret);
return ret;
}
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index bdc270e84b30..1bc120d61c5b 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -61,14 +61,16 @@ struct netfs_inode {
#if IS_ENABLED(CONFIG_FSCACHE)
struct fscache_cookie *cache;
#endif
- struct mutex wb_lock; /* Writeback serialisation */
+ struct list_head wb_queue; /* Queue of processes wanting to do writeback */
loff_t _remote_i_size; /* Size of the remote file */
loff_t _zero_point; /* Size after which we assume there's no data
* on the server */
+ spinlock_t lock; /* Lock covering wb_queue */
atomic_t io_count; /* Number of outstanding reqs */
unsigned long flags;
#define NETFS_ICTX_ODIRECT 0 /* The file has DIO in progress */
#define NETFS_ICTX_UNBUFFERED 1 /* I/O should not use the pagecache */
+#define NETFS_ICTX_WB_LOCK 2 /* Writeback serialisation lock */
#define NETFS_ICTX_MODIFIED_ATTR 3 /* Indicate change in mtime/ctime */
#define NETFS_ICTX_SINGLE_NO_UPLOAD 4 /* Monolithic payload, cache but no upload */
};
@@ -462,6 +464,10 @@ int netfs_alloc_folioq_buffer(struct address_space *mapping,
size_t *_cur_size, ssize_t size, gfp_t gfp);
void netfs_free_folioq_buffer(struct folio_queue *fq);
+/* Writeback exclusion API. */
+bool netfs_wb_begin(struct netfs_inode *ictx, bool nowait);
+void netfs_wb_end(struct netfs_inode *ictx);
+
/**
* netfs_inode - Get the netfs inode context from the inode
* @inode: The inode to query
@@ -743,7 +749,8 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
#if IS_ENABLED(CONFIG_FSCACHE)
ctx->cache = NULL;
#endif
- mutex_init(&ctx->wb_lock);
+ INIT_LIST_HEAD(&ctx->wb_queue);
+ spin_lock_init(&ctx->lock);
/* ->releasepage() drives zero_point */
if (use_zero_point) {
ctx->_zero_point = ctx->_remote_i_size;
^ permalink raw reply related
* [PATCH v2 09/14] netfs: Fix kdoc warning
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Matthew Wilcox
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix a kdoc warning due to a misnamed parameter in the description.
Reported-by: Matthew Wilcox <willy@infradead.org>
cc: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
include/linux/netfs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 243c0f737938..bdc270e84b30 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -753,7 +753,7 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
/**
* netfs_resize_file - Note that a file got resized
- * @ctx: The netfs inode being resized
+ * @ictx: The netfs inode being resized
* @new_i_size: The new file size
* @changed_on_server: The change was applied to the server
*
^ permalink raw reply related
* [PATCH v2 08/14] scatterlist: Fix offset in folio calc in extract_xarray_to_sg()
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Matthew Wilcox,
Christoph Hellwig, Jens Axboe, Mike Marshall
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix the calculation of the offset in the folio being extracted in
extract_xarray_to_sg().
Note that in the near future, ITER_XARRAY should be removed.
Fixes: f5f82cd18732 ("Move netfs_extract_iter_to_sg() to lib/scatterlist.c")
Link: https://sashiko.dev/#/patchset/20260608145432.681865-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: Christoph Hellwig <hch@infradead.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Mike Marshall <hubcap@omnibond.com>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
lib/scatterlist.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index b7fe91ef35b8..6ea40d2e6247 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -1366,6 +1366,7 @@ static ssize_t extract_xarray_to_sg(struct iov_iter *iter,
sg_max--;
maxsize -= len;
+ start += len;
ret += len;
if (maxsize <= 0 || sg_max == 0)
break;
^ permalink raw reply related
* [PATCH v2 07/14] iov_iter: Remove unused variable in kunit_iov_iter.c
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Ming Lei, Matthew Wilcox,
Christoph Hellwig, Jens Axboe
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Remove the no longer used variable 'b' from iov_kunit_copy_to_bvec(). The
variable is initialised and incremented, but nothing now makes use of the
value.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ming Lei <ming.lei@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: Christoph Hellwig <hch@infradead.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
lib/tests/kunit_iov_iter.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/lib/tests/kunit_iov_iter.c b/lib/tests/kunit_iov_iter.c
index f02f7b7aa796..e7e154f94f66 100644
--- a/lib/tests/kunit_iov_iter.c
+++ b/lib/tests/kunit_iov_iter.c
@@ -275,7 +275,7 @@ static void __init iov_kunit_copy_to_bvec(struct kunit *test)
struct page **spages, **bpages;
u8 *scratch, *buffer;
size_t bufsize, npages, size, copied;
- int i, b, patt;
+ int i, patt;
bufsize = 0x100000;
npages = bufsize / PAGE_SIZE;
@@ -298,10 +298,9 @@ static void __init iov_kunit_copy_to_bvec(struct kunit *test)
KUNIT_EXPECT_EQ(test, iter.nr_segs, 0);
/* Build the expected image in the scratch buffer. */
- b = 0;
patt = 0;
memset(scratch, 0, bufsize);
- for (pr = bvec_test_ranges; pr->from >= 0; pr++, b++) {
+ for (pr = bvec_test_ranges; pr->from >= 0; pr++) {
u8 *p = scratch + pr->page * PAGE_SIZE;
for (i = pr->from; i < pr->to; i++)
^ permalink raw reply related
* [PATCH v2 06/14] iov_iter: Fix a memory leak in iov_iter_extract_user_pages()
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Matthew Wilcox,
Christoph Hellwig, Jens Axboe
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
There's a potential memory leak in callers of iov_iter_extract_user_pages()
whereby if a pages array is allocated in function, it isn't freed before
returning of an error or 0.
Now, it's not a leak per se in iov_iter_extract_user_pages() as, if an
array is allocated, it's returned through *pages, so it's incumbent on the
caller to free it. However, not all callers do.
Fix this by freeing the table and clearing *pages before returning an error
or 0. Note that iov_iter_extract_pages() and its subfunctions are allowed
to return 0 without returning an array (for instance if the iterator count
is 0).
Fixes: 7d58fe731028 ("iov_iter: Add a function to extract a page list from an iterator")
Closes: https://sashiko.dev/#/patchset/20260616100821.2062304-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: Christoph Hellwig <hch@infradead.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
lib/iov_iter.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 1661c237b643..5a6434709716 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1756,6 +1756,7 @@ static ssize_t iov_iter_extract_user_pages(struct iov_iter *i,
unsigned long addr;
unsigned int gup_flags = 0;
size_t offset;
+ bool will_alloc = !*pages;
int res;
if (i->data_source == ITER_DEST)
@@ -1772,8 +1773,14 @@ static ssize_t iov_iter_extract_user_pages(struct iov_iter *i,
if (!maxpages)
return -ENOMEM;
res = pin_user_pages_fast(addr, maxpages, gup_flags, *pages);
- if (unlikely(res <= 0))
+ if (unlikely(res <= 0)) {
+ if (will_alloc) {
+ kvfree(*pages);
+ *pages = NULL;
+ }
return res;
+ }
+
maxsize = min_t(size_t, maxsize, res * PAGE_SIZE - offset);
iov_iter_advance(i, maxsize);
return maxsize;
^ permalink raw reply related
* [PATCH v2 05/14] iov_iter: Fix missing alloc fail check in iov_iter_extract_bvec_pages()
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Ming Lei, Matthew Wilcox,
Christoph Hellwig, Jens Axboe
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix iov_iter_extract_bvec_pages() to check if want_pages_array() fails and,
if so, return -ENOMEM appropriately.
Fixes: e4e535bff2bc ("iov_iter: don't require contiguous pages in iov_iter_extract_bvec_pages")
Link: https://sashiko.dev/#/patchset/20260608145432.681865-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ming Lei <ming.lei@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: Christoph Hellwig <hch@infradead.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
lib/iov_iter.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 5f60733f3280..1661c237b643 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1637,6 +1637,8 @@ static ssize_t iov_iter_extract_bvec_pages(struct iov_iter *i,
bi.bi_bvec_done = skip;
maxpages = want_pages_array(pages, maxsize, skip, maxpages);
+ if (!maxpages)
+ return -ENOMEM;
while (bi.bi_size && bi.bi_idx < i->nr_segs) {
struct bio_vec bv = bvec_iter_bvec(i->bvec, bi);
^ permalink raw reply related
* [PATCH v2 04/14] iov_iter: Fix potential underflow in iov_iter_extract_xarray_pages()
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Matthew Wilcox,
Christoph Hellwig, Jens Axboe, Mike Marshall
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
In iov_iter_extract_xarray_pages(), if no pages are extracted because
there's a hole (or something otherwise unextractable) in the xarray, then
the calculation of maxsize at the end can go wrong if the starting offset
is not zero.
Fix this by returning 0 in such a case and freeing the page array if
allocated here rather than being passed in.
Note that in the near future, ITER_XARRAY should be removed.
Fixes: 7d58fe731028 ("iov_iter: Add a function to extract a page list from an iterator")
Link: https://sashiko.dev/#/patchset/20260608145432.681865-1-dhowells%40redhat.com
Link: https://sashiko.dev/#/patchset/20260616100821.2062304-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: Christoph Hellwig <hch@infradead.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Mike Marshall <hubcap@omnibond.com>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
lib/iov_iter.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 243662af1af7..5f60733f3280 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1568,6 +1568,7 @@ static ssize_t iov_iter_extract_xarray_pages(struct iov_iter *i,
struct folio *folio;
unsigned int nr = 0, offset;
loff_t pos = i->xarray_start + i->iov_offset;
+ bool will_alloc = !*pages;
XA_STATE(xas, i->xarray, pos >> PAGE_SHIFT);
offset = pos & ~PAGE_MASK;
@@ -1595,6 +1596,14 @@ static ssize_t iov_iter_extract_xarray_pages(struct iov_iter *i,
}
rcu_read_unlock();
+ if (!nr) {
+ if (will_alloc) {
+ kvfree(*pages);
+ *pages = NULL;
+ }
+ return 0;
+ }
+
maxsize = min_t(size_t, nr * PAGE_SIZE - offset, maxsize);
iov_iter_advance(i, maxsize);
return maxsize;
^ permalink raw reply related
* [PATCH v2 03/14] cachefiles: Fix file burial to take lock when unsetting S_KERNEL_FILE
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, NeilBrown
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix cachefiles_bury_object() to lock the inode of the file being buried
whilst it unsets the S_KERNEL_FILE flag.
Fixes: 07a90e97400c ("cachefiles: Implement culling daemon commands")
Closes: https://sashiko.dev/#/patchset/20260616100821.2062304-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: NeilBrown <neil@brown.name>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
fs/cachefiles/namei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index a464c4646c04..d4309dfb55df 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -375,7 +375,7 @@ int cachefiles_bury_object(struct cachefiles_cache *cache,
"Rename failed with error %d", ret);
}
- __cachefiles_unmark_inode_in_use(object, d_inode(rep));
+ cachefiles_do_unmark_inode_in_use(object, d_inode(rep));
end_renaming(&rd);
_leave(" = 0");
return 0;
^ permalink raw reply related
* [PATCH v2 02/14] cachefiles: Fix double fput
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
Fix a double fput() in error handling in cachefiles_create_tmpfile().
Link: https://sashiko.dev/#/patchset/20260608145432.681865-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
fs/cachefiles/namei.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index 2937db690b40..a464c4646c04 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -467,7 +467,6 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object)
ret = -EINVAL;
if (unlikely(!file->f_op->read_iter) ||
unlikely(!file->f_op->write_iter)) {
- fput(file);
pr_notice("Cache does not support read_iter and write_iter\n");
goto err_unuse;
}
^ permalink raw reply related
* [PATCH v2 01/14] netfs: Fix decision whether to disallow write-streaming due to fscache use
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel, Marc Dionne
In-Reply-To: <20260624115737.2964520-1-dhowells@redhat.com>
netfs_perform_write() buffers data by writing it into the pagecache for
later writeback. If the folio it wants to write to isn't present, it uses
"write streaming" in which is will store partial data in a non-uptodate,
but dirty folio.
However, when fscache is in use, this is a potential problem as writes to
the cache have to be aligned to the cache backend's DIO granularity, and so
netfs_perform_write() attempts to suppress write-streaming in such a case,
requiring the folio content to be fetched first unless the entire folio is
going to be overwritten. This allows the content to be written to the
cache too.
Unfortunately, the test netfs_perform_write() uses isn't correct because it
doesn't take into account the fact that the object lookup is asynchronous
and farmed off to a work queue, so there's a short window in which the
cache is doing a lookup but the test fails because the answer is undefined.
This can be triggered by the generic/464 xfstest, and causes a warning to
be emitted in cachefiles (in code not yet upstream) because it sees a write
that doesn't have its bounds rounded out to DIO alignment.
Fix this by changing the condition to whether FSCACHE_COOKIE_IS_CACHING is
set on a cookie rather than whether the cookie is marked enabled. Note
that this is really just a hint as to whether we allow write streaming or
not and no other aspects of the cookie or cache object are accessed.
Reported-by: Marc Dionne <marc.dionne@auristor.com>
cc: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
fs/netfs/buffered_write.c | 2 +-
fs/netfs/internal.h | 12 ++++++++++++
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 6bde3320bcec..2cdb68e6b16f 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -277,7 +277,7 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
* caching service temporarily because the backing store got
* culled.
*/
- if (netfs_is_cache_enabled(ctx)) {
+ if (netfs_is_cache_maybe_enabled(ctx)) {
if (finfo) {
netfs_stat(&netfs_n_wh_wstream_conflict);
goto flush_content;
diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index 645996ecfc80..d889caa401dc 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -239,6 +239,18 @@ static inline bool netfs_is_cache_enabled(struct netfs_inode *ctx)
#endif
}
+static inline bool netfs_is_cache_maybe_enabled(struct netfs_inode *ctx)
+{
+#if IS_ENABLED(CONFIG_FSCACHE)
+ struct fscache_cookie *cookie = ctx->cache;
+
+ return fscache_cookie_valid(cookie) &&
+ test_bit(FSCACHE_COOKIE_IS_CACHING, &cookie->flags);
+#else
+ return false;
+#endif
+}
+
/*
* Get a ref on a netfs group attached to a dirty page (e.g. a ceph snap).
*/
^ permalink raw reply related
* [PATCH v2 00/14] netfs: Miscellaneous fixes
From: David Howells @ 2026-06-24 11:57 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs, linux-afs, linux-cifs,
ceph-devel, linux-fsdevel, linux-kernel
Hi Christian,
Here are some miscellaneous fixes for netfslib. I separated them from my
netfs-next branch. Various Sashiko review comments[1][2][3] are addressed:
(1) Fix the decision whether to disallow write-streaming due to fscache
use.
(2) Fix a double fput in cachefiles_create_tmpfile().
(3) Fix alteration of S_KERNEL_FILE inode flag without holding inode lock.
(4) Fix a potential mathematical underflow in
iov_iter_extract_xarray_pages() and make it return 0 and free the
array if no pages could be extracted.
(5) Fix a missing alloc failure check in iov_iter_extract_bvec_pages().
(6) Fix iov_iter_extract_user_pages() so that it doesn't leak the pages
array if it returns an error or 0 (inasmuch as the leak is really in
the callers).
(7) Remove an unused variable in kunit_iov_iter.c.
(8) Fix extract_xarray_to_sg() to calculate folio offset correctly.
(9) Fix a kdoc comment.
(10) Replace the netfs_inode::wb_lock mutex with a bit lock so that the
lock can be passed to the collector so that multiple asynchronous
writebacks won't interfere with each other.
(11) Fix writeback error handling to go through writeback_iter() so that it
can clean up its state.
(12) Fix ENOMEM handling in writeback to clean up the current folio if we
can't allocate a rolling buffer segment.
(13) Fix unbuffered/DIO write retry for filesystems that don't have a
->prepare_write() method.
The patches can also be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes
Thanks,
David
[1] https://sashiko.dev/#/patchset/20260608145432.681865-1-dhowells%40redhat.com
[2] https://sashiko.dev/#/patchset/20260616100821.2062304-1-dhowells%40redhat.com
[3] https://sashiko.dev/#/patchset/20260619140646.2633762-1-dhowells%40redhat.com
Changes
=======
ver #2)
- Addressed various Sashiko comments[3]:
- Changed some kfree() to kvfree().
- Added some writeback error handling patches.
- Fixed DIO write retry.
ver #1)
- Added a fix for S_KERNEL_FILE in cachefiles_bury_object().
- Fixed how iov_iter_extract_*_pages() deals with *pages when returning 0
or error.
- Fixed the replacement of wb_lock with bit lock:
- Only release the bit lock for writeback, writethrough and write-single,
not for PG_private_2-based I/O, which doesn't take the lock.
- In netfs_writeback_single(), need to redirty the inode if can't get the
lock.
David Howells (14):
netfs: Fix decision whether to disallow write-streaming due to fscache
use
cachefiles: Fix double fput
cachefiles: Fix file burial to take lock when unsetting S_KERNEL_FILE
iov_iter: Fix potential underflow in iov_iter_extract_xarray_pages()
iov_iter: Fix missing alloc fail check in
iov_iter_extract_bvec_pages()
iov_iter: Fix a memory leak in iov_iter_extract_user_pages()
iov_iter: Remove unused variable in kunit_iov_iter.c
scatterlist: Fix offset in folio calc in extract_xarray_to_sg()
netfs: Fix kdoc warning
netfs: Replace wb_lock with a bit lock for asynchronicity
netfs: Fix writethrough to use collection offload
netfs: Fix writeback error handling
netfs: Fix folio state after ENOMEM whilst under writeback iteration
netfs: Fix DIO write retry for filesystems without a ->prepare_write()
fs/afs/symlink.c | 4 +-
fs/cachefiles/namei.c | 3 +-
fs/netfs/buffered_write.c | 2 +-
fs/netfs/direct_write.c | 6 ---
fs/netfs/internal.h | 12 +++++
fs/netfs/locking.c | 95 ++++++++++++++++++++++++++++++++++++++
fs/netfs/write_collect.c | 10 ++++
fs/netfs/write_issue.c | 48 ++++++++-----------
include/linux/netfs.h | 13 ++++--
lib/iov_iter.c | 20 +++++++-
lib/scatterlist.c | 1 +
lib/tests/kunit_iov_iter.c | 5 +-
12 files changed, 171 insertions(+), 48 deletions(-)
^ permalink raw reply
* Re: [PATCH] netfs: Fix UAF in netfs_unbuffered_write() on failed preparation
From: David Howells @ 2026-06-24 11:02 UTC (permalink / raw)
To: ChenXiaoSong
Cc: dhowells, hongao, Paulo Alcantara, netfs, linux-fsdevel,
linux-kernel, syzbot+3c74b1f0c372e98efc32, Steve French,
Namjae Jeon, linux-cifs@vger.kernel.org
In-Reply-To: <498534d3-e82e-40ce-bc7d-230580b2fcae@chenxiaosong.com>
I suspect the issue is this bit in netfs_unbuffered_write():
for (;;) {
...
netfs_get_subrequest(subreq, netfs_sreq_trace_get_resubmit);
if (stream->prepare_write) {
stream->prepare_write(subreq);
__set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
netfs_stat(&netfs_n_wh_retry_write_subreq);
} else {
struct iov_iter source;
netfs_reset_iter(subreq);
source = subreq->io_iter;
netfs_reissue_write(stream, subreq, &source); <----
}
}
This doesn't happen with AFS because it has a ->prepare_write() method. Does
this change fix the problem for you?
diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
index 25f8ceb15fad..9f6258da45d6 100644
--- a/fs/netfs/direct_write.c
+++ b/fs/netfs/direct_write.c
@@ -190,12 +190,6 @@ static int netfs_unbuffered_write(struct netfs_io_request *wreq)
stream->prepare_write(subreq);
__set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags);
netfs_stat(&netfs_n_wh_retry_write_subreq);
- } else {
- struct iov_iter source;
-
- netfs_reset_iter(subreq);
- source = subreq->io_iter;
- netfs_reissue_write(stream, subreq, &source);
}
}
David
^ permalink raw reply related
* Re: [PATCH] netfs: Fix UAF in netfs_unbuffered_write() on failed preparation
From: David Howells @ 2026-06-24 10:59 UTC (permalink / raw)
To: ChenXiaoSong
Cc: dhowells, hongao, Paulo Alcantara, netfs, linux-fsdevel,
linux-kernel, syzbot+3c74b1f0c372e98efc32, Steve French,
Namjae Jeon, linux-cifs@vger.kernel.org
In-Reply-To: <498534d3-e82e-40ce-bc7d-230580b2fcae@chenxiaosong.com>
Actually, this shows what happened:
a.out-717 : netfs_sreq: R=00000001[13] UPLD RETRY f=210 s=240000 1fac0/20000 s=160 e=-11
a.out-717 : netfs_sreq: R=00000001[13] UPLD SUBMT f=100 s=240000 0/dbfac0 s=192 e=0
a.out-717 : netfs_failure: R=00000001[13] UPLD f=500 s=240000 0/dbfac0 write e=-5
a.out-717 : netfs_rreq: R=00000001 UW PAUSE f=1801
a.out-717 : netfs_sreq: R=00000001[13] UPLD TERM f=500 s=240000 0/dbfac0 s=192 e=-5
a.out-717 : netfs_rreq: R=00000001 UW WAKE-Q f=1805
a.out-717 : netfs_failure: R=00000001[13] UPLD f=400 s=240000 0/dbfac0 write e=-5
a.out-717 : netfs_rreq: R=00000001 UW PAUSE f=1805
a.out-717 : netfs_sreq: R=00000001[13] UPLD TERM f=400 s=240000 0/dbfac0 s=192 e=-5
a.out-717 : netfs_rreq: R=00000001 UW WAKE-Q f=1805
a.out-717 : netfs_sreq: R=00000001[13] UPLD FREE f=400 s=240000 0/dbfac0 s=192 e=-5
There are two calls to netfs_write_subrequest_terminated(). You can see the
IN_PROGRESS bit has been cleared by the second one (f=500 -> f=400).
David
^ permalink raw reply
* Re: Request CVE for ksmbd authenticated permission bypass in FSCTL_SET_SPARSE handle
From: Greg KH @ 2026-06-24 9:57 UTC (permalink / raw)
To: grayhat@foxmail.com; +Cc: cve, sfrench, namjae.jeon, linux-cifs
In-Reply-To: <tencent_32075C41A32AB2B941A3641C72794B4B6408@qq.com>
On Wed, Jun 24, 2026 at 05:53:48PM +0800, grayhat@foxmail.com wrote:
> Hi CNA team,
>
> I am writing to request a CVE ID for a permission bypass vulnerability in the ksmbd FSCTL_SET_SPARSE handler. A fix for this issue has been merged into the mainline Linux kernel and backported to stable kernels, but no corresponding CVE has yet been assigned.
>
> Vulnerability classification: CWE-274 (Improper Handling of Insufficient Privileges)
> Commit references:
> - Mainline fix commit: cc57232cae23c0df91b4a59d0f519141ce9b5b02 ("ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE")
> - Stable backport releases: v7.0.12, v6.18.35, v6.6.143 LTS
Now assigned CVE-2026-52944
Hope this helps,
greg k-h
^ permalink raw reply
* Request CVE for ksmbd authenticated permission bypass in FSCTL_SET_SPARSE handle
From: grayhat @ 2026-06-24 9:53 UTC (permalink / raw)
To: cve; +Cc: sfrench, namjae.jeon, linux-cifs
Hi CNA team,
I am writing to request a CVE ID for a permission bypass vulnerability in the ksmbd FSCTL_SET_SPARSE handler. A fix for this issue has been merged into the mainline Linux kernel and backported to stable kernels, but no corresponding CVE has yet been assigned.
Vulnerability classification: CWE-274 (Improper Handling of Insufficient Privileges)
Commit references:
- Mainline fix commit: cc57232cae23c0df91b4a59d0f519141ce9b5b02 ("ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE")
- Stable backport releases: v7.0.12, v6.18.35, v6.6.143 LTS
Note: ksmbd features are developed in a standalone Samba repository and merged into mainline Linux in periodic bulk pull requests. The FSCTL_SET_SPARSE handler landed via one such batch merge, so there is no dedicated mainline commit solely introducing this handler. Correspondingly, the fix commit does not carry a precise Fixes: tag referencing a mainline introduction commit.
The ksmbd FSCTL_SET_SPARSE handler is missing both share-level and handle-level write permission checks that are enforced for other write-sensitive FSCTL operations.
Within smb2_ioctl() (fs/smb/server/smb2pdu.c), the branch handling FSCTL_SET_SPARSE invokes fsctl_set_sparse() without validating two critical access conditions:
1. Share writable flag: test_tree_conn_flag(work->tcon, KSMBD_TREE_CONN_FLAG_WRITABLE)
2. Per-handle write access: fp->daccess & (FILE_WRITE_DATA_LE | FILE_WRITE_ATTRIBUTES_LE)
For comparison, related destructive operations such as FSCTL_COPYCHUNK and FSCTL_SET_ZERO_DATA both explicitly check KSMBD_TREE_CONN_FLAG_WRITABLE, while FSCTL_SET_SPARSE omits this protection entirely.
Internally, fsctl_set_sparse() modifies the in-memory file attribute flag fp->f_ci->m_fattr. When the Samba share option `store dos attributes = yes` is enabled, it also persists DOS attribute state via vfs_setxattr(). Unlike vfs_write() or vfs_fallocate(), vfs_setxattr() does not validate FMODE_WRITE at the VFS layer, meaning there is no underlying kernel enforcement to compensate for missing permission checks inside ksmbd.
As a result, an authenticated SMB client with read-only access to a share is able to alter the sparse attribute of arbitrary files. This bypasses both share-level read-only restrictions and file-handle write access requirements.
This behavior violates the MS-FSA specification section §2.1.5.10.38:
"If Open.GrantedAccess.FILE_WRITE_DATA is FALSE and Open.GrantedAccess.FILE_WRITE_ATTRIBUTES is FALSE, the operation MUST be failed with STATUS_ACCESS_DENIED."
ksmbd does not implement this mandatory access control check.
Thanks,
Sean Shen
grayhat@foxmail.com
^ permalink raw reply
* Re: [f2fs-dev] [PATCH v8 00/17] Subject: Exposing case folding behavior
From: patchwork-bot+f2fs @ 2026-06-24 8:59 UTC (permalink / raw)
To: Chuck Lever
Cc: viro, brauner, jack, pc, yuezhang.mo, cem, almaz.alexandrovich,
adilger.kernel, linux-cifs, sfrench, slava, linux-ext4,
linkinjeon, sprasad, frank.li, ronniesahlberg, glaubitz, jaegeuk,
hirofumi, linux-nfs, tytso, linux-api, linux-f2fs-devel,
linux-xfs, senozhatsky, chuck.lever, hansg, anna, linux-fsdevel,
sj1557.seo, trondmy
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>
Hello:
This series was applied to jaegeuk/f2fs.git (dev)
by Christian Brauner <brauner@kernel.org>:
On Tue, 17 Feb 2026 16:47:24 -0500 you wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Following on from
>
> https://lore.kernel.org/linux-nfs/20251021-zypressen-bazillus-545a44af57fd@brauner/T/#m0ba197d75b7921d994cf284f3cef3a62abb11aaa
>
> I'm attempting to implement enough support in the Linux VFS to
> enable file services like NFSD and ksmbd (and user space
> equivalents) to provide the actual status of case folding support
> in local file systems. The default behavior for local file systems
> not explicitly supported in this series is to reflect the usual
> POSIX behaviors:
>
> [...]
Here is the summary with links:
- [f2fs-dev,v8,01/17] fs: Move file_kattr initialization to callers
(no matching commit)
- [f2fs-dev,v8,02/17] fs: Add case sensitivity flags to file_kattr
https://git.kernel.org/jaegeuk/f2fs/c/3035e4454142
- [f2fs-dev,v8,03/17] fat: Implement fileattr_get for case sensitivity
(no matching commit)
- [f2fs-dev,v8,04/17] exfat: Implement fileattr_get for case sensitivity
(no matching commit)
- [f2fs-dev,v8,05/17] ntfs3: Implement fileattr_get for case sensitivity
(no matching commit)
- [f2fs-dev,v8,06/17] hfs: Implement fileattr_get for case sensitivity
(no matching commit)
- [f2fs-dev,v8,07/17] hfsplus: Report case sensitivity in fileattr_get
(no matching commit)
- [f2fs-dev,v8,08/17] ext4: Report case sensitivity in fileattr_get
(no matching commit)
- [f2fs-dev,v8,09/17] xfs: Report case sensitivity in fileattr_get
(no matching commit)
- [f2fs-dev,v8,10/17] cifs: Implement fileattr_get for case sensitivity
(no matching commit)
- [f2fs-dev,v8,11/17] nfs: Implement fileattr_get for case sensitivity
(no matching commit)
- [f2fs-dev,v8,12/17] f2fs: Add case sensitivity reporting to fileattr_get
(no matching commit)
- [f2fs-dev,v8,13/17] vboxsf: Implement fileattr_get for case sensitivity
(no matching commit)
- [f2fs-dev,v8,14/17] isofs: Implement fileattr_get for case sensitivity
(no matching commit)
- [f2fs-dev,v8,15/17] nfsd: Report export case-folding via NFSv3 PATHCONF
(no matching commit)
- [f2fs-dev,v8,16/17] nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
(no matching commit)
- [f2fs-dev,v8,17/17] ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
page: next (older)
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox