Re: [PATCH] netfs: Fix early read unlock of page with EOF in middle

From: Christian Schoenebeck <linux_oss@crudebyte.com>
To: Christian Brauner <brauner@kernel.org>,
	David Howells <dhowells@redhat.com>,
	Dominique Martinet <asmadeus@codewreck.org>
Cc: Eric Van Hensbergen <ericvh@kernel.org>,
	Latchesar Ionkov <lucho@ionkov.net>,
	Chris Arges <carges@cloudflare.com>,
	Matthew Wilcox <willy@infradead.org>,
	Steve French <sfrench@samba.org>,
	v9fs@lists.linux.dev, netfs@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] netfs: Fix early read unlock of page with EOF in middle
Date: Sat, 20 Dec 2025 15:55:09 +0100
Message-ID: <8618918.T7Z3S40VBb@weasel>
In-Reply-To: <938162.1766233900@warthog.procyon.org.uk>

On Saturday, 20 December 2025 13:31:40 CET David Howells wrote:
> The read result collection for buffered reads seems to run ahead of the
> completion of subrequests under some circumstances, as can be seen in the
> following log snippet:
> 
>     9p_client_res: client 18446612686390831168 response P9_TREAD tag  0 err 0 ...
>     netfs_sreq: R=00001b55[1] DOWN TERM  f=192 s=0 5fb2/5fb2 s=5 e=0
>     ...
>     netfs_collect_folio: R=00001b55 ix=00004 r=4000-5000 t=4000/5fb2
>     netfs_folio: i=157f3 ix=00004-00004 read-done
>     netfs_folio: i=157f3 ix=00004-00004 read-unlock
>     netfs_collect_folio: R=00001b55 ix=00005 r=5000-5fb2 t=5000/5fb2
>     netfs_folio: i=157f3 ix=00005-00005 read-done
>     netfs_folio: i=157f3 ix=00005-00005 read-unlock
>     ...
>     netfs_collect_stream: R=00001b55[0:] cto=5fb2 frn=ffffffff
>     netfs_collect_state: R=00001b55 col=5fb2 cln=6000 n=c
>     netfs_collect_stream: R=00001b55[0:] cto=5fb2 frn=ffffffff
>     netfs_collect_state: R=00001b55 col=5fb2 cln=6000 n=8
>     ...
>     netfs_sreq: R=00001b55[2] ZERO SUBMT f=000 s=5fb2 0/4e s=0 e=0
>     netfs_sreq: R=00001b55[2] ZERO TERM  f=102 s=5fb2 4e/4e s=5 e=0
> 
> The 'cto=5fb2' indicates the collected file pos we've collected results to
> so far - but we still have 0x4e more bytes to go - so we shouldn't have
> collected folio ix=00005 yet.  The 'ZERO' subreq that clears the tail
> happens after we unlock the folio, allowing the application to see the
> uncleared tail through mmap.
> 
> The problem is that netfs_read_unlock_folios() will unlock a folio in which
> the amount of read results collected hits EOF position - but the ZERO
> subreq lies beyond that and so happens after.
> 
> Fix this by changing the end check to always be the end of the folio and
> never the end of the file.
> 
> In the future, I should look at clearing to the end of the folio here rather
> than adding a ZERO subreq to do this.  On the other hand, the ZERO subreq
> can run in parallel with an async READ subreq.  Further, the ZERO subreq
> may still be necessary to, say, handle extents in a ceph file that don't
> have any backing store and are thus implicitly all zeros.
> 
> This can be reproduced by creating a file, the size of which doesn't align
> to a page boundary, e.g. 24498 (0x5fb2) bytes and then doing something
> like:
> 
>     xfs_io -c "mmap -r 0 0x6000" -c "madvise -d 0 0x6000" \
>            -c "mread -v 0 0x6000" /xfstest.test/x
> 
> The last 0x4e bytes should all be 00, but if the tail hasn't been cleared
> yet, you may see rubbish there.  This can be reproduced with kafs by
> modifying the kernel to disable the call to netfs_read_subreq_progress()
> and to stop afs_issue_read() from doing the async call for NETFS_READAHEAD.
> Reproduction can be made easier by inserting an mdelay(100) in
> netfs_issue_read() for the ZERO-subreq case.
> 
> AFS and CIFS are normally unlikely to show this as they dispatch READ ops
> asynchronously, which allows the ZERO-subreq to finish first.  9P's READ op
> is completely synchronous, so the ZERO-subreq will always happen after.  It
> isn't seen all the time, though, because the collection may be done in a
> worker thread.
> 
> Reported-by: Christian Schoenebeck <linux_oss@crudebyte.com>
> Link: https://lore.kernel.org/r/8622834.T7Z3S40VBb@weasel/
> Signed-off-by: David Howells <dhowells@redhat.com>
> Suggested-by: Dominique Martinet <asmadeus@codewreck.org>
> cc: Dominique Martinet <asmadeus@codewreck.org>
> cc: Christian Schoenebeck <linux_oss@crudebyte.com>
> cc: v9fs@lists.linux.dev
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---

I bisected this mmap() data corruption to e2d46f2ec332 ("netfs: Change the
read result collector to only use one work item"). So maybe add a Fixes:
tag for that commit, as Dominique suggested?

With the patch applied, the issue disappeared. Give me a few hours for more
thorough testing, since the corruption only shows up intermittently.

>  fs/netfs/read_collect.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
> index a95e7aadafd0..7a0ffa675fb1 100644
> --- a/fs/netfs/read_collect.c
> +++ b/fs/netfs/read_collect.c
> @@ -137,7 +137,7 @@ static void netfs_read_unlock_folios(struct netfs_io_request *rreq,
>  		rreq->front_folio_order = order;
>  		fsize = PAGE_SIZE << order;
>  		fpos = folio_pos(folio);
> -		fend = umin(fpos + fsize, rreq->i_size);
> +		fend = fpos + fsize;
> 
>  		trace_netfs_collect_folio(rreq, folio, fend, collected_to);
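
To make the effect of the one-line change concrete, here is a stand-alone
sketch of the end check using the numbers from the trace above (folio ix=5
at 0x5000, i_size = 0x5fb2, collected to 0x5fb2). The helper names and the
fixed 4 KiB page size are assumptions for illustration only, and
folio_done() is a simplified stand-in for the comparison against
collected_to, not the kernel function itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for the sketch: 4 KiB pages and an order-0 folio. */
#define SKETCH_PAGE_SIZE 0x1000ULL

static uint64_t sketch_umin(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Old end check: the folio end is clamped to the file size. */
static uint64_t fend_old(uint64_t fpos, uint64_t fsize, uint64_t i_size)
{
	return sketch_umin(fpos + fsize, i_size);
}

/* New end check: always the end of the folio, never the end of the file. */
static uint64_t fend_new(uint64_t fpos, uint64_t fsize)
{
	return fpos + fsize;
}

/* Simplified: a folio may be unlocked once collection has reached fend. */
static bool folio_done(uint64_t collected_to, uint64_t fend)
{
	return collected_to >= fend;
}
```

With fpos = 0x5000 and collected_to = 0x5fb2, the old check yields
fend = 0x5fb2, so the folio counts as done before the ZERO subreq has
covered the last 0x4e bytes. The new check yields fend = 0x6000, so the
folio stays locked until collection reaches the end of the folio.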

What about the write_collect.c side, is it safe as is?

/Christian
