From: Christian Schoenebeck <linux_oss@crudebyte.com>
To: Pierre Barre <pierre@barre.sh>,
Dominique Martinet <asmadeus@codewreck.org>
Cc: ericvh@kernel.org, lucho@ionkov.net, v9fs@lists.linux.dev,
linux-kernel@vger.kernel.org, David Howells <dhowells@redhat.com>
Subject: Re: [BUG] 9p: data corruption with cache=mmap under concurrent stat/write
Date: Thu, 25 Dec 2025 11:23:20 +0100
Message-ID: <5951517.DvuYhMxLoT@weasel>
In-Reply-To: <aUxqVn0DO7ee9K2_@codewreck.org>

On Wednesday, 24 December 2025 23:33:58 CET Dominique Martinet wrote:
> Hi Pierre,
>
> Pierre Barre wrote on Wed, Dec 24, 2025 at 03:29:01PM +0100:
> > I'm hitting data corruption using 9p with cache=mmap when stat() is called
> > concurrently with writes.
> Thanks for the report
>
> > Environment:
> > - Kernel: v6.18.1-061801
> > - Mount options: cache=mmap
> > - Transport: unix
> >
> > Reproducer:
> > 1. Mount 9p filesystem with cache=mmap
> > 2. Run PostgreSQL with data directory on 9p mount
> > 3. Run pgbench workload
> > 4. Simultaneously run `watch -n 0.1 tree -ah` on the data directory
> >
> > PostgreSQL reports:
> > ERROR: unexpected data beyond EOF in block N of relation "..."
>
> unexpected data beyond EOF looks a lot like
> https://lkml.kernel.org/r/938162.1766233900@warthog.procyon.org.uk
>
> could you try with this patch?

Pierre, I am also confident that this patch will fix the EOF data issue you
encountered with PostgreSQL. However ...

> > HINT: This has been seen to occur with buggy kernels
> >
> > Analysis:
> >
> > The issue appears to be race conditions in getattr/setattr when using
> > writeback caching:
> >
> > 1. v9fs_vfs_getattr_dotl() condition checks `v9ses->cache` instead of
> >    `v9ses->cache & CACHE_WRITEBACK`, triggering writeback flush for
> >    any cache mode
> >
> > 2. Both getattr and setattr call filemap_fdatawrite() which initiates
> >    writeback but doesn't wait for completion. The subsequent server
> >    stat/wstat sees stale file size.
> >
> > Would using filemap_write_and_wait() instead be the correct fix?

... you are seeing a 2nd issue? getattr() output should not be related to
mmap() access.

/Christian
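
Note on point 1 of the quoted analysis: the difference between testing
`v9ses->cache` for truth and testing `v9ses->cache & CACHE_WRITEBACK` can be
sketched in plain userspace C. This is only an illustration, not kernel code.
The flag values and the two helper names (flush_if_any_cache,
flush_if_writeback) are assumptions made up for this sketch; the real
definitions live in fs/9p/v9fs.h and may differ.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration only; the kernel's real
 * definitions are in fs/9p/v9fs.h and may differ. */
enum cache_bits {
    CACHE_NONE      = 0,
    CACHE_FILE      = 1 << 0,
    CACHE_META      = 1 << 1,
    CACHE_WRITEBACK = 1 << 2,
};

/* Truthiness test: true for any non-zero cache mode. */
static bool flush_if_any_cache(unsigned int cache)
{
    return cache != 0;
}

/* Masked test: true only when the writeback bit is set. */
static bool flush_if_writeback(unsigned int cache)
{
    return (cache & CACHE_WRITEBACK) != 0;
}
```

For a mode that sets only a non-writeback bit (CACHE_FILE in this sketch),
flush_if_any_cache() returns true while flush_if_writeback() returns false.
For a mode that includes the writeback bit, both return true, so the two
conditions only disagree for cache modes without writeback.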

Thread overview:
2025-12-24 14:29 [BUG] 9p: data corruption with cache=mmap under concurrent stat/write Pierre Barre
2025-12-24 22:33 ` Dominique Martinet
2025-12-25 10:23 ` Christian Schoenebeck [this message]
2025-12-25 14:52 ` Pierre Barre
2025-12-26 13:13 ` Pierre Barre
2026-01-05 7:54 ` David Howells