Re: [PATCH] netfs: Fix early read unlock of page with EOF in middle

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Christian Schoenebeck <linux_oss@crudebyte.com>
To: Christian Brauner <brauner@kernel.org>,
	David Howells <dhowells@redhat.com>,
	Dominique Martinet <asmadeus@codewreck.org>
Cc: Eric Van Hensbergen <ericvh@kernel.org>,
	Latchesar Ionkov <lucho@ionkov.net>,
	Chris Arges <carges@cloudflare.com>,
	Matthew Wilcox <willy@infradead.org>,
	Steve French <sfrench@samba.org>,
	v9fs@lists.linux.dev, netfs@lists.linux.dev,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] netfs: Fix early read unlock of page with EOF in middle
Date: Sat, 20 Dec 2025 15:55:09 +0100	[thread overview]
Message-ID: <8618918.T7Z3S40VBb@weasel> (raw)
In-Reply-To: <938162.1766233900@warthog.procyon.org.uk>

On Saturday, 20 December 2025 13:31:40 CET David Howells wrote:
> The read result collection for buffered reads seems to run ahead of the
> completion of subrequests under some circumstances, as can be seen in the
> following log snippet:
> 
>     9p_client_res: client 18446612686390831168 response P9_TREAD tag  0 err
> 0 ...
>     netfs_sreq: R=00001b55[1] DOWN TERM  f=192 s=0 5fb2/5fb2 s=5 e=0
>     ...
>     netfs_collect_folio: R=00001b55 ix=00004 r=4000-5000 t=4000/5fb2
>     netfs_folio: i=157f3 ix=00004-00004 read-done
>     netfs_folio: i=157f3 ix=00004-00004 read-unlock
>     netfs_collect_folio: R=00001b55 ix=00005 r=5000-5fb2 t=5000/5fb2
>     netfs_folio: i=157f3 ix=00005-00005 read-done
>     netfs_folio: i=157f3 ix=00005-00005 read-unlock
>     ...
>     netfs_collect_stream: R=00001b55[0:] cto=5fb2 frn=ffffffff
>     netfs_collect_state: R=00001b55 col=5fb2 cln=6000 n=c
>     netfs_collect_stream: R=00001b55[0:] cto=5fb2 frn=ffffffff
>     netfs_collect_state: R=00001b55 col=5fb2 cln=6000 n=8
>     ...
>     netfs_sreq: R=00001b55[2] ZERO SUBMT f=000 s=5fb2 0/4e s=0 e=0
>     netfs_sreq: R=00001b55[2] ZERO TERM  f=102 s=5fb2 4e/4e s=5 e=0
> 
> The 'cto=5fb2' indicates the collected file pos we've collected results to
> so far - but we still have 0x4e more bytes to go - so we shouldn't have
> collected folio ix=00005 yet.  The 'ZERO' subreq that clears the tail
> happens after we unlock the folio, allowing the application to see the
> uncleared tail through mmap.
> 
> The problem is that netfs_read_unlock_folios() will unlock a folio in which
> the amount of read results collected hits EOF position - but the ZERO
> subreq lies beyond that and so happens after.
> 
> Fix this by changing the end check to always be the end of the folio and
> never the end of the file.
> 
> In the future, I should look at clearing to the end of the folio here rather
> than adding a ZERO subreq to do this.  On the other hand, the ZERO subreq
> can run in parallel with an async READ subreq.  Further, the ZERO subreq
> may still be necessary to, say, handle extents in a ceph file that don't
> have any backing store and are thus implicitly all zeros.
> 
> This can be reproduced by creating a file, the size of which doesn't align
> to a page boundary, e.g. 24998 (0x5fb2) bytes and then doing something
> like:
> 
>     xfs_io -c "mmap -r 0 0x6000" -c "madvise -d 0 0x6000" \
>            -c "mread -v 0 0x6000" /xfstest.test/x
> 
> The last 0x4e bytes should all be 00, but if the tail hasn't been cleared
> yet, you may see rubbish there.  This can be reproduced with kafs by
> modifying the kernel to disable the call to netfs_read_subreq_progress()
> and to stop afs_issue_read() from doing the async call for NETFS_READAHEAD.
> Reproduction can be made easier by inserting an mdelay(100) in
> netfs_issue_read() for the ZERO-subreq case.
> 
> AFS and CIFS are normally unlikely to show this as they dispatch READ ops
> asynchronously, which allows the ZERO-subreq to finish first.  9P's READ op
> is completely synchronous, so the ZERO-subreq will always happen after.  It
> isn't seen all the time, though, because the collection may be done in a
> worker thread.
> 
> Reported-by: Christian Schoenebeck <linux_oss@crudebyte.com>
> Link: https://lore.kernel.org/r/8622834.T7Z3S40VBb@weasel/
> Signed-off-by: David Howells <dhowells@redhat.com>
> Suggested-by: Dominique Martinet <asmadeus@codewreck.org>
> cc: Dominique Martinet <asmadeus@codewreck.org>
> cc: Christian Schoenebeck <linux_oss@crudebyte.com>
> cc: v9fs@lists.linux.dev
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---

I had bisected this mmap() data corruption to e2d46f2ec332 ("netfs: Change the 
read result collector to only use one work item"). So maybe adding a Fixes: 
tag for this as suggested by Dominique?

With the patch applied, this issue disappeared. Give me some hours for more 
thorough tests, due to the random factor involved.

>  fs/netfs/read_collect.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
> index a95e7aadafd0..7a0ffa675fb1 100644
> --- a/fs/netfs/read_collect.c
> +++ b/fs/netfs/read_collect.c
> @@ -137,7 +137,7 @@ static void netfs_read_unlock_folios(struct
> netfs_io_request *rreq, rreq->front_folio_order = order;
>  		fsize = PAGE_SIZE << order;
>  		fpos = folio_pos(folio);
> -		fend = umin(fpos + fsize, rreq->i_size);
> +		fend = fpos + fsize;
> 
>  		trace_netfs_collect_folio(rreq, folio, fend, 
collected_to);

What about write_collect.c side, is it safe as is?

/Christian

next prev parent reply	other threads:[~2025-12-20 14:55 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-20 12:31 [PATCH] netfs: Fix early read unlock of page with EOF in middle David Howells
2025-12-20 14:55 ` Christian Schoenebeck [this message]
2025-12-20 15:17   ` David Howells
2025-12-20 23:50   ` Christian Schoenebeck
2025-12-20 23:54   ` Dominique Martinet
2025-12-24 12:31 ` Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8618918.T7Z3S40VBb@weasel \
    --to=linux_oss@crudebyte.com \
    --cc=asmadeus@codewreck.org \
    --cc=brauner@kernel.org \
    --cc=carges@cloudflare.com \
    --cc=dhowells@redhat.com \
    --cc=ericvh@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lucho@ionkov.net \
    --cc=netfs@lists.linux.dev \
    --cc=sfrench@samba.org \
    --cc=v9fs@lists.linux.dev \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.