* Request to backport data corruption fix to stable
@ 2025-11-14 11:12 Shyam Prasad N
2025-11-14 15:00 ` Sasha Levin
0 siblings, 1 reply; 4+ messages in thread
From: Shyam Prasad N @ 2025-11-14 11:12 UTC (permalink / raw)
To: Greg KH, sashal
Cc: David Howells, Bharath SM, Steve French, CIFS, Enzo Matsumiya
[-- Attachment #1: Type: text/plain, Size: 825 bytes --]
Hi Greg/Sasha,
Over the last few months, a few users have reported data corruption
with the Linux SMB kernel client filesystem. This is one such report:
https://lore.kernel.org/linux-cifs/36fb31bf2c854cdc930a3415f5551dcd@izw-berlin.de/
The issue is now well understood. Attached is a fix for this issue.
I've verified that the fix stops the data corruption and does not
regress other write patterns.
The regression appears to have been introduced during a refactoring of
this code path in v6.3 and persisted until v6.9, before the code was
refactored again as part of the netfs helper library integration in v6.10.
Please include this change in all stable trees for v6.3..v6.9. I've
done my testing on stable-6.6. Please let me know if you want this
tested on any other kernels.
--
Regards,
Shyam
[-- Attachment #2: 0001-cifs-stop-writeback-extension-when-change-of-size-is.patch --]
[-- Type: application/octet-stream, Size: 3440 bytes --]
From c4494f7ad7c0dd46b67dc67058507278e56c9311 Mon Sep 17 00:00:00 2001
From: Shyam Prasad N <sprasad@microsoft.com>
Date: Fri, 14 Nov 2025 14:30:55 +0530
Subject: [PATCH] cifs: stop writeback extension when change of size is
detected
cifs_extend_writeback() can pick up a folio that has been dirtied by an
extending write, but the writeback is clamped to an i_size value cached
in a local variable. This can cause short writes while still marking the
page clean, resulting in data corruption.
As an example, consider this scenario:
1. First write to the file happens at offset 0, length 5k.
2. Writeback starts for the range (0-5k).
3. Writeback locks page 1 in cifs_writepages_begin. But does not lock
page 2 yet.
4. Page 2 is now written to by the next write, which extends the file
by another 5k. Page 2 and 3 are now marked dirty.
5. Now we reach cifs_extend_writeback, where we extend the writeback
to include the next folio (even though it should only be partially
written). We mark page 2 for writeback.
6. But after exiting cifs_extend_writeback, we clamp the writeback to
i_size, which was 5k when writeback started. So we write only 1k bytes
of page 2.
7. We nevertheless mark page 2 as flushed and clean, so the remaining
contents of page 2 are never written to the server (leaving a hole in
that range unless it is later overwritten).
With this patch, we make sure not to extend the writeback when a
change in the file size is detected.
This fix also changes the error handling of cifs_extend_writeback when
folio_try_get() fails: we now stop the extension instead of continuing.
Cc: stable@kernel.org # v6.3~v6.9
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Acked-by: David Howells <dhowells@redhat.com>
Reported-by: Mark A Whiting <whitingm@opentext.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
---
fs/smb/client/file.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index 7a2b81fbd9cfd..cc2ba2b18c8d4 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -2747,8 +2747,10 @@ static void cifs_extend_writeback(struct address_space *mapping,
loff_t start,
int max_pages,
loff_t max_len,
- size_t *_len)
+ size_t *_len,
+ unsigned long long i_size)
{
+ struct inode *inode = mapping->host;
struct folio_batch batch;
struct folio *folio;
unsigned int nr_pages;
@@ -2779,7 +2781,7 @@ static void cifs_extend_writeback(struct address_space *mapping,
if (!folio_try_get(folio)) {
xas_reset(xas);
- continue;
+ break;
}
nr_pages = folio_nr_pages(folio);
if (nr_pages > max_pages) {
@@ -2799,6 +2801,15 @@ static void cifs_extend_writeback(struct address_space *mapping,
xas_reset(xas);
break;
}
+
+ /* if file size is changing, stop extending */
+ if (i_size_read(inode) != i_size) {
+ folio_unlock(folio);
+ folio_put(folio);
+ xas_reset(xas);
+ break;
+ }
+
if (!folio_test_dirty(folio) ||
folio_test_writeback(folio)) {
folio_unlock(folio);
@@ -2930,7 +2941,8 @@ static ssize_t cifs_write_back_from_locked_folio(struct address_space *mapping,
if (max_pages > 0)
cifs_extend_writeback(mapping, xas, &count, start,
- max_pages, max_len, &len);
+ max_pages, max_len, &len,
+ i_size);
}
}
len = min_t(unsigned long long, len, i_size - start);
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: Request to backport data corruption fix to stable
2025-11-14 11:12 Request to backport data corruption fix to stable Shyam Prasad N
@ 2025-11-14 15:00 ` Sasha Levin
2025-11-14 18:35 ` Steve French
0 siblings, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2025-11-14 15:00 UTC (permalink / raw)
To: Shyam Prasad N
Cc: Greg KH, David Howells, Bharath SM, Steve French, CIFS,
Enzo Matsumiya
On Fri, Nov 14, 2025 at 04:42:57PM +0530, Shyam Prasad N wrote:
>Hi Greg/Sasha,
>
>Over the last few months, a few users have reported data corruption
>with the Linux SMB kernel client filesystem. This is one such report:
>https://lore.kernel.org/linux-cifs/36fb31bf2c854cdc930a3415f5551dcd@izw-berlin.de/
>
>The issue is now well understood. Attached is a fix for this issue.
>I've verified that the fix stops the data corruption and does not
>regress other write patterns.
>
>The regression appears to have been introduced during a refactoring of
>this code path in v6.3 and persisted until v6.9, before the code was
>refactored again as part of the netfs helper library integration in v6.10.
>
>Please include this change in all stable trees for v6.3..v6.9. I've
>done my testing on stable-6.6. Please let me know if you want this
>tested on any other kernels.
I'll queue it up for 6.6, thanks!
--
Thanks,
Sasha
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Request to backport data corruption fix to stable
2025-11-14 15:00 ` Sasha Levin
@ 2025-11-14 18:35 ` Steve French
2025-11-15 13:34 ` Sasha Levin
0 siblings, 1 reply; 4+ messages in thread
From: Steve French @ 2025-11-14 18:35 UTC (permalink / raw)
To: Sasha Levin
Cc: Shyam Prasad N, Greg KH, David Howells, Bharath SM, CIFS,
Enzo Matsumiya, Stable
[-- Attachment #1: Type: text/plain, Size: 1425 bytes --]
Sasha,
I also wanted to make sure stable gets this patch from David Howells,
which fixes a regression mentioned in some of the same email threads.
It fixes an important stable regression in cifs_readv() when mounted
with cache=none. Bharath has also reviewed and tested it with 6.6
stable. See attached.
On Fri, Nov 14, 2025 at 9:00 AM Sasha Levin <sashal@kernel.org> wrote:
>
> On Fri, Nov 14, 2025 at 04:42:57PM +0530, Shyam Prasad N wrote:
> >Hi Greg/Sasha,
> >
> >Over the last few months, a few users have reported data corruption
> >with the Linux SMB kernel client filesystem. This is one such report:
> >https://lore.kernel.org/linux-cifs/36fb31bf2c854cdc930a3415f5551dcd@izw-berlin.de/
> >
> >The issue is now well understood. Attached is a fix for this issue.
> >I've verified that the fix stops the data corruption and does not
> >regress other write patterns.
> >
> >The regression appears to have been introduced during a refactoring of
> >this code path in v6.3 and persisted until v6.9, before the code was
> >refactored again as part of the netfs helper library integration in v6.10.
> >
> >Please include this change in all stable trees for v6.3..v6.9. I've
> >done my testing on stable-6.6. Please let me know if you want this
> >tested on any other kernels.
>
> I'll queue it up for 6.6, thanks!
>
> --
> Thanks,
> Sasha
--
Thanks,
Steve
[-- Attachment #2: 0001-cifs-Fix-uncached-read-into-ITER_KVEC-iterator.patch --]
[-- Type: text/x-patch, Size: 4589 bytes --]
From 5c026335eef10805a3b5d3624c464f2393f859e2 Mon Sep 17 00:00:00 2001
From: David Howells <dhowells@redhat.com>
Date: Fri, 14 Nov 2025 06:13:35 +0000
Subject: [PATCH] cifs: Fix uncached read into ITER_KVEC iterator
If a cifs share is mounted cache=none, internal reads (such as by exec)
will pass a KVEC iterator down from __cifs_readv() to
cifs_send_async_read() which will then call cifs_limit_bvec_subset() upon
it to limit the number of contiguous elements for RDMA purposes. This
doesn't work on non-BVEC iterators, however.
Fix this by extracting a KVEC iterator into a BVEC iterator in
__cifs_readv() (it would be dup'd anyway if async).
This caused the following warning:
WARNING: CPU: 0 PID: 6290 at fs/smb/client/file.c:3549 cifs_limit_bvec_subset+0xe/0xc0
...
Call Trace:
<TASK>
cifs_send_async_read+0x146/0x2e0
__cifs_readv+0x207/0x2d0
__kernel_read+0xf6/0x160
search_binary_handler+0x49/0x210
exec_binprm+0x4a/0x140
bprm_execve.part.0+0xe4/0x170
do_execveat_common.isra.0+0x196/0x1c0
do_execve+0x1f/0x30
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Acked-by: Bharath SM <bharathsm@microsoft.com>
Tested-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: stable@kernel.org # v6.6~v6.9
---
fs/smb/client/file.c | 97 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 94 insertions(+), 3 deletions(-)
diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
index 1058066913dd..4d38011413a0 100644
--- a/fs/smb/client/file.c
+++ b/fs/smb/client/file.c
@@ -37,6 +37,81 @@
#include "cifs_ioctl.h"
#include "cached_dir.h"
+/*
+ * Allocate a bio_vec array and extract up to bc_max pages from a KVEC-type
+ * iterator and add them to the array. This can deal with vmalloc'd buffers as
+ * well as kmalloc'd or static buffers. The pages are not pinned.
+ */
+static ssize_t extract_kvec_to_bvec(struct iov_iter *iter, ssize_t maxsize,
+ unsigned int bc_max,
+ struct bio_vec **_bv, unsigned int *_bc)
+{
+ const struct kvec *kv = iter->kvec;
+ struct bio_vec *bv;
+ unsigned long start = iter->iov_offset;
+ unsigned int i, bc = 0;
+ ssize_t ret = 0;
+
+ bc_max = iov_iter_npages(iter, bc_max);
+ if (bc_max == 0) {
+ *_bv = NULL;
+ *_bc = 0;
+ return 0;
+ }
+
+ bv = kvmalloc(array_size(bc_max, sizeof(*bv)), GFP_NOFS);
+ if (!bv) {
+ *_bv = NULL;
+ *_bc = 0;
+ return -ENOMEM;
+ }
+ *_bv = bv;
+
+ for (i = 0; i < iter->nr_segs; i++) {
+ struct page *page;
+ unsigned long kaddr;
+ size_t off, len, seg;
+
+ len = kv[i].iov_len;
+ if (start >= len) {
+ start -= len;
+ continue;
+ }
+
+ kaddr = (unsigned long)kv[i].iov_base + start;
+ off = kaddr & ~PAGE_MASK;
+ len = min_t(size_t, maxsize, len - start);
+ kaddr &= PAGE_MASK;
+
+ maxsize -= len;
+ ret += len;
+ do {
+ seg = umin(len, PAGE_SIZE - off);
+ if (is_vmalloc_or_module_addr((void *)kaddr))
+ page = vmalloc_to_page((void *)kaddr);
+ else
+ page = virt_to_page((void *)kaddr);
+
+ bvec_set_page(bv, page, seg, off);
+ bv++;
+ bc++;
+
+ len -= seg;
+ kaddr += PAGE_SIZE;
+ off = 0;
+ } while (len > 0 && bc < bc_max);
+
+ if (maxsize <= 0 || bc >= bc_max)
+ break;
+ start = 0;
+ }
+
+ if (ret > 0)
+ iov_iter_advance(iter, ret);
+ *_bc = bc;
+ return ret;
+}
+
/*
* Remove the dirty flags from a span of pages.
*/
@@ -4318,11 +4393,27 @@ static ssize_t __cifs_readv(
ctx->bv = (void *)ctx->iter.bvec;
ctx->bv_need_unpin = iov_iter_extract_will_pin(to);
ctx->should_dirty = true;
- } else if ((iov_iter_is_bvec(to) || iov_iter_is_kvec(to)) &&
- !is_sync_kiocb(iocb)) {
+ } else if (iov_iter_is_kvec(to)) {
+ /*
+ * Extract a KVEC-type iterator into a BVEC-type iterator. We
+ * assume that the storage will be retained by the caller; in
+ * any case, we may or may not be able to pin the pages, so we
+ * don't try.
+ */
+ unsigned int bc;
+
+ rc = extract_kvec_to_bvec(to, iov_iter_count(to), INT_MAX,
+ &ctx->bv, &bc);
+ if (rc < 0) {
+ kref_put(&ctx->refcount, cifs_aio_ctx_release);
+ return rc;
+ }
+
+ iov_iter_bvec(&ctx->iter, ITER_DEST, ctx->bv, bc, rc);
+ } else if (iov_iter_is_bvec(to) && !is_sync_kiocb(iocb)) {
/*
* If the op is asynchronous, we need to copy the list attached
- * to a BVEC/KVEC-type iterator, but we assume that the storage
+ * to a BVEC-type iterator, but we assume that the storage
* will be retained by the caller; in any case, we may or may
* not be able to pin the pages, so we don't try.
*/
--
2.45.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: Request to backport data corruption fix to stable
2025-11-14 18:35 ` Steve French
@ 2025-11-15 13:34 ` Sasha Levin
0 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2025-11-15 13:34 UTC (permalink / raw)
To: Steve French
Cc: Shyam Prasad N, Greg KH, David Howells, Bharath SM, CIFS,
Enzo Matsumiya, Stable
On Fri, Nov 14, 2025 at 12:35:39PM -0600, Steve French wrote:
>Sasha,
>I also wanted to make sure stable gets this patch from David Howells,
>which fixes a regression mentioned in some of the same email threads.
>It fixes an important stable regression in cifs_readv() when mounted
>with cache=none. Bharath has also reviewed and tested it with 6.6
>stable. See attached.
Sure, queued up. Thanks!
--
Thanks,
Sasha
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-11-15 13:34 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-14 11:12 Request to backport data corruption fix to stable Shyam Prasad N
2025-11-14 15:00 ` Sasha Levin
2025-11-14 18:35 ` Steve French
2025-11-15 13:34 ` Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox