From: Christian Schoenebeck <linux_oss@crudebyte.com>
To: Pierre Barre <pierre@barre.sh>, asmadeus <asmadeus@codewreck.org>
Cc: ericvh@kernel.org, lucho@ionkov.net,
David Howells <dhowells@redhat.com>,
v9fs@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] 9p: fix i_size update race in getattr with writeback caching
Date: Wed, 04 Feb 2026 12:48:55 +0100 [thread overview]
Message-ID: <1946294.tdWV9SEqCh@weasel> (raw)
In-Reply-To: <64008E3D-7EEE-4E6C-9392-986C1D8E8858@barre.sh>
On Wednesday, 7 January 2026 01:27:50 CET Pierre Barre wrote:
> > On 5 Jan 2026, at 12:35, Christian Schoenebeck <linux_oss@crudebyte.com>
> > wrote:>
> > On Saturday, 27 December 2025 19:01:37 CET Pierre Barre wrote:
> >> With writeback caching (cache=mmap), v9fs_stat2inode() and
> >> v9fs_stat2inode_dotl() unconditionally overwrite i_size from the server
> >> response, even when dirty pages may exist locally. This causes processes
> >> using lseek(SEEK_END) to see incorrect file sizes, leading to data
> >> corruption when extending files while concurrent stat() calls occur.
> >>
> >> Fix by passing V9FS_STAT2INODE_KEEP_ISIZE when CACHE_WRITEBACK is
> >> enabled to preserve the client's authoritative i_size.
> >>
> >> Signed-off-by: Pierre Barre <pierre@barre.sh>
> >> ---
> >
> > Adding David Howells to this discussion.
> >
> > David, I read today that you suggested [1] to go for a wait approach
> > instead? So somewhat similar to what Pierre suggested in v1 [2].
>
> FWIW, I tested the initial approach I described and this seems to result in
> no real performance improvements compared to not using any writeback
> caching at all.
Right, surprisingly I don't measure a performance penalty either.
Dominique, suggestions how to proceed on this issue?
/Christian
> > [1] https://lore.kernel.org/all/1696785.1767599663@warthog.procyon.org.uk/
> > [2] https://lore.kernel.org/all/20251227083751.715152-1-pierre@barre.sh/
> >
> > I understand the point about cifs and nfs, where you must expect foreign
> > changes on server side by another client.
> >
> > For 9pfs though IMO it would make sense to distinguish:
> >
> > for cache=loose (and cache=fscache?) we are not expecting host side
> > changes, so we could safely go for the approach suggested by this patch
> > here and not wait for flushing dirty data and just use the locally cached
> > file size (for performance reasons).
> >
> > And for all other cache modes going for the cifs/nfs approach and wait for
> > flushing? Does this make sense?
> >
> > I just tested this reported issue with the following reproducer:
> >
> > --------------------------
> >
> > #!/bin/bash
> >
> > rm -f foo.txt
> >
> > for i in {1..50}
> > do
> >
> > echo $i >> foo.txt &
> > ls -lha foo.txt &
> >
> > done
> >
> > sync
> >
> > cat foo.txt
> >
> > echo
> >
> > wc -l foo.txt
> >
> > --------------------------
> >
> > For cache=loose I couldn't reproduce, but with cache=mmap I could
> > reproduce
> > data corruption, folio writeback hangup and the following error message:
> >
> > [ 60.847671] [append] R=134d: No submit
> >
> > ...
> >
> > [ 243.153292] INFO: task sync:986 blocked for more than 120 seconds.
> > [ 243.161516] Tainted: G E 6.19.0-rc1+ #107
> > [ 243.168847] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> > this message. [ 243.178702] task:sync state:D stack:0
> > pid:986 tgid:986 ppid:884 task_flags:0x400000 flags:0x00080002 [
> > 243.190379] Call Trace:
> > [ 243.192285] <TASK>
> >
> > [ 243.193891] __schedule (kernel/sched/core.c:5256
> > kernel/sched/core.c:6863) [ 243.195999] schedule
> > (kernel/sched/core.c:6946 kernel/sched/core.c:6960) [ 243.197730]
> > io_schedule (kernel/sched/core.c:7764 (discriminator 1)
> > kernel/sched/core.c:7790 (discriminator 1)) [ 243.199418]
> > folio_wait_bit_common (mm/filemap.c:1312)
> > [ 243.201451] ? __pfx_wake_page_function (mm/filemap.c:1134)
> > [ 243.203399] folio_wait_writeback (./arch/x86/include/asm/bitops.h:202
> > ./arch/x86/include/asm/bitops.h:232
> > ./include/asm-generic/bitops/instrumented-non-atomic.h:142
> > ./include/linux/page-flags.h:597 mm/page-writeback.c:3087) [ 243.205371]
> > __filemap_fdatawait_range (mm/filemap.c:529 (discriminator 1)) [
> > 243.207377] ? _raw_spin_unlock (./include/linux/spinlock_api_smp.h:143
> > (discriminator 3) kernel/locking/spinlock.c:186 (discriminator 3)) [
> > 243.209305] ? finish_task_switch.isra.0
> > (./arch/x86/include/asm/paravirt.h:671 kernel/sched/sched.h:1570
> > kernel/sched/core.c:4995 kernel/sched/core.c:5112) [ 243.212030] ?
> > __schedule (kernel/sched/core.c:5259)
> > [ 243.214083] ? wb_wait_for_completion (fs/fs-writeback.c:227)
> > [ 243.215925] filemap_fdatawait_keep_errors
> > (./arch/x86/include/asm/bitops.h:202 ./arch/x86/include/asm/bitops.h:232
> > ./include/asm-generic/bitops/instrumented-non-atomic.h:142
> > mm/filemap.c:363 mm/filemap.c:627) [ 243.217628] sync_inodes_sb
> > (./include/linux/sched.h:2062 fs/fs-writeback.c:2777
> > fs/fs-writeback.c:2897) [ 243.218914] ? __pfx_sync_inodes_one_sb
> > (fs/sync.c:75)
> > [ 243.220186] __iterate_supers (fs/super.c:64 fs/super.c:925)
> > [ 243.221601] ksys_sync (fs/sync.c:103)
> > [ 243.222746] __do_sys_sync (fs/sync.c:115)
> > [ 243.223950] do_syscall_64 (arch/x86/entry/syscall_64.c:63
> > (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1)) [
> > 243.226140] entry_SYSCALL_64_after_hwframe
> > (arch/x86/entry/entry_64.S:131) [ 243.228375] RIP: 0033:0x7f51e4613707
> > [ 243.230260] RSP: 002b:00007ffdb2fccad8 EFLAGS: 00000246 ORIG_RAX:
> > 00000000000000a2 [ 243.234092] RAX: ffffffffffffffda RBX:
> > 00007ffdb2fccc38 RCX: 00007f51e4613707 [ 243.238153] RDX:
> > 00007f51e46f3501 RSI: 00007ffdb2fcef3e RDI: 00007f51e46ae557 [
> > 243.241063] RBP: 0000000000000001 R08: 0000000000000000 R09:
> > 0000000000000000 [ 243.244028] R10: 00007f51e45be0d0 R11:
> > 0000000000000246 R12: 00005568f2d350fb [ 243.247586] R13:
> > 0000000000000000 R14: 0000000000000000 R15: 00005568f2d37b00 [
> > 243.251622] </TASK>
> >
> > BTW running this script with cache=loose made me realize that Linux 9p
> > client is always requesting server for every "ls foo.txt", which makes it
> > extremely slow unnecessarily.
> >
> >> fs/9p/vfs_inode.c | 7 ++++++-
> >> fs/9p/vfs_inode_dotl.c | 7 ++++++-
> >> 2 files changed, 12 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
> >> index 97abe65bf7c1..bfba21f5d8a9 100644
> >> --- a/fs/9p/vfs_inode.c
> >> +++ b/fs/9p/vfs_inode.c
> >> @@ -993,7 +993,12 @@ v9fs_vfs_getattr(struct mnt_idmap *idmap, const
> >> struct
> >> path *path, if (IS_ERR(st))
> >> return PTR_ERR(st);
> >>
> >> - v9fs_stat2inode(st, d_inode(dentry), dentry->d_sb, 0);
> >> + /*
> >> + * With writeback caching, the client is authoritative for i_size.
> >> + * Don't let the server overwrite it with a potentially stale value.
> >> + */
> >> + v9fs_stat2inode(st, d_inode(dentry), dentry->d_sb,
> >> + (v9ses->cache & CACHE_WRITEBACK) ? V9FS_STAT2INODE_KEEP_ISIZE : 0);
> >> generic_fillattr(&nop_mnt_idmap, request_mask, d_inode(dentry), stat);
> >>
> >> p9stat_free(st);
> >> diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
> >> index 643e759eacb2..67a0ded2e223 100644
> >> --- a/fs/9p/vfs_inode_dotl.c
> >> +++ b/fs/9p/vfs_inode_dotl.c
> >> @@ -453,7 +453,12 @@ v9fs_vfs_getattr_dotl(struct mnt_idmap *idmap,
> >> if (IS_ERR(st))
> >> return PTR_ERR(st);
> >>
> >> - v9fs_stat2inode_dotl(st, d_inode(dentry), 0);
> >> + /*
> >> + * With writeback caching, the client is authoritative for i_size.
> >> + * Don't let the server overwrite it with a potentially stale value.
> >> + */
> >> + v9fs_stat2inode_dotl(st, d_inode(dentry),
> >> + (v9ses->cache & CACHE_WRITEBACK) ? V9FS_STAT2INODE_KEEP_ISIZE : 0);
> >> generic_fillattr(&nop_mnt_idmap, request_mask, d_inode(dentry), stat);
> >> /* Change block size to what the server returned */
> >> stat->blksize = st->st_blksize;
next prev parent reply other threads:[~2026-02-04 11:49 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-27 18:01 [PATCH v2] 9p: fix i_size update race in getattr with writeback caching Pierre Barre
2026-01-05 11:35 ` Christian Schoenebeck
2026-01-07 0:27 ` Pierre Barre
2026-02-04 11:48 ` Christian Schoenebeck [this message]
2026-02-15 9:21 ` Dominique Martinet
2026-02-17 14:48 ` David Howells
2026-02-17 15:05 ` David Howells
2026-02-17 16:54 ` David Howells
2026-02-18 13:38 ` David Howells
2026-02-18 13:41 ` David Howells
2026-02-18 14:11 ` Dominique Martinet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1946294.tdWV9SEqCh@weasel \
--to=linux_oss@crudebyte.com \
--cc=asmadeus@codewreck.org \
--cc=dhowells@redhat.com \
--cc=ericvh@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lucho@ionkov.net \
--cc=pierre@barre.sh \
--cc=v9fs@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox