From: Christian Schoenebeck
To: Christian Brauner, David Howells, Dominique Martinet
Cc: Eric Van Hensbergen, Latchesar Ionkov, Chris Arges, Matthew Wilcox,
    Steve French, v9fs@lists.linux.dev, netfs@lists.linux.dev,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] netfs: Fix early read unlock of page with EOF in middle
Date: Sat, 20 Dec 2025 15:55:09 +0100
Message-ID: <8618918.T7Z3S40VBb@weasel>
In-Reply-To: <938162.1766233900@warthog.procyon.org.uk>
References: <938162.1766233900@warthog.procyon.org.uk>

On Saturday, 20 December 2025 13:31:40 CET David Howells wrote:
> The read result collection for buffered reads seems to run ahead of the
> completion of subrequests under some circumstances, as can be seen in the
> following log snippet:
>
>     9p_client_res: client 18446612686390831168 response P9_TREAD tag 0 err 0
>     ...
>     netfs_sreq: R=00001b55[1] DOWN TERM f=192 s=0 5fb2/5fb2 s=5 e=0
>     ...
>     netfs_collect_folio: R=00001b55 ix=00004 r=4000-5000 t=4000/5fb2
>     netfs_folio: i=157f3 ix=00004-00004 read-done
>     netfs_folio: i=157f3 ix=00004-00004 read-unlock
>     netfs_collect_folio: R=00001b55 ix=00005 r=5000-5fb2 t=5000/5fb2
>     netfs_folio: i=157f3 ix=00005-00005 read-done
>     netfs_folio: i=157f3 ix=00005-00005 read-unlock
>     ...
>     netfs_collect_stream: R=00001b55[0:] cto=5fb2 frn=ffffffff
>     netfs_collect_state: R=00001b55 col=5fb2 cln=6000 n=c
>     netfs_collect_stream: R=00001b55[0:] cto=5fb2 frn=ffffffff
>     netfs_collect_state: R=00001b55 col=5fb2 cln=6000 n=8
>     ...
>     netfs_sreq: R=00001b55[2] ZERO SUBMT f=000 s=5fb2 0/4e s=0 e=0
>     netfs_sreq: R=00001b55[2] ZERO TERM f=102 s=5fb2 4e/4e s=5 e=0
>
> The 'cto=5fb2' indicates the file position we've collected results to
> so far - but we still have 0x4e more bytes to go - so we shouldn't have
> collected folio ix=00005 yet. The 'ZERO' subreq that clears the tail
> happens after we unlock the folio, allowing the application to see the
> uncleared tail through mmap.
>
> The problem is that netfs_read_unlock_folios() will unlock a folio in which
> the amount of read results collected hits the EOF position - but the ZERO
> subreq lies beyond that and so happens after.
>
> Fix this by changing the end check to always be the end of the folio and
> never the end of the file.
>
> In the future, I should look at clearing to the end of the folio here rather
> than adding a ZERO subreq to do this. On the other hand, the ZERO subreq
> can run in parallel with an async READ subreq. Further, the ZERO subreq
> may still be necessary to, say, handle extents in a ceph file that don't
> have any backing store and are thus implicitly all zeros.
>
> This can be reproduced by creating a file, the size of which doesn't align
> to a page boundary, e.g. 24498 (0x5fb2) bytes and then doing something
> like:
>
>     xfs_io -c "mmap -r 0 0x6000" -c "madvise -d 0 0x6000" \
>            -c "mread -v 0 0x6000" /xfstest.test/x
>
> The last 0x4e bytes should all be 00, but if the tail hasn't been cleared
> yet, you may see rubbish there. This can be reproduced with kafs by
> modifying the kernel to disable the call to netfs_read_subreq_progress()
> and to stop afs_issue_read() from doing the async call for NETFS_READAHEAD.
> Reproduction can be made easier by inserting an mdelay(100) in
> netfs_issue_read() for the ZERO-subreq case.
>
> AFS and CIFS are normally unlikely to show this as they dispatch READ ops
> asynchronously, which allows the ZERO-subreq to finish first. 9P's READ op
> is completely synchronous, so the ZERO-subreq will always happen after. It
> isn't seen all the time, though, because the collection may be done in a
> worker thread.
>
> Reported-by: Christian Schoenebeck
> Link: https://lore.kernel.org/r/8622834.T7Z3S40VBb@weasel/
> Signed-off-by: David Howells
> Suggested-by: Dominique Martinet
> cc: Dominique Martinet
> cc: Christian Schoenebeck
> cc: v9fs@lists.linux.dev
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---

I had bisected this mmap() data corruption to commit e2d46f2ec332 ("netfs:
Change the read result collector to only use one work item"), so maybe add
a Fixes: tag for it, as suggested by Dominique?

With the patch applied, the issue disappeared. Give me a few hours for more
thorough tests, though, given the random factor involved.
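For anyone without xfs_io at hand, the tail check from the quoted reproduction could be sketched in C roughly as below. This is only an illustration: count_nonzero_tail() is a hypothetical helper name, and it assumes map_len does not extend past the page that holds EOF (e.g. 0x6000 for a 0x5fb2-byte file with 4 KiB pages).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of xfs_io or the kernel): map the first
 * map_len bytes of path read-only, drop the mapping's pages so the next
 * access faults them in again, and count the non-zero bytes between EOF
 * and map_len.
 *
 * Assumption: map_len does not reach past the page containing EOF,
 * otherwise touching the mapping would raise SIGBUS.
 *
 * Returns the number of non-zero tail bytes, or -1 on error.
 */
long count_nonzero_tail(const char *path, size_t map_len)
{
	struct stat st;
	unsigned char *p;
	long bad = 0;
	size_t i;
	int fd;

	assert(map_len > 0);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || (size_t)st.st_size > map_len) {
		close(fd);
		return -1;
	}

	p = mmap(NULL, map_len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping keeps the file referenced */
	if (p == MAP_FAILED)
		return -1;

	/* Counterpart of "madvise -d" in the xfs_io command line. */
	madvise(p, map_len, MADV_DONTNEED);

	/* Everything between EOF and the end of the mapping should be 00. */
	for (i = (size_t)st.st_size; i < map_len; i++)
		if (p[i] != 0)
			bad++;

	munmap(p, map_len);
	return bad;
}
```

Called as count_nonzero_tail("/xfstest.test/x", 0x6000), a result of 0 means the tail read back as zeros; anything greater means stale bytes were visible through the mapping.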
>  fs/netfs/read_collect.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
> index a95e7aadafd0..7a0ffa675fb1 100644
> --- a/fs/netfs/read_collect.c
> +++ b/fs/netfs/read_collect.c
> @@ -137,7 +137,7 @@ static void netfs_read_unlock_folios(struct netfs_io_request *rreq,
>  		rreq->front_folio_order = order;
>  		fsize = PAGE_SIZE << order;
>  		fpos = folio_pos(folio);
> -		fend = umin(fpos + fsize, rreq->i_size);
> +		fend = fpos + fsize;
>
>  		trace_netfs_collect_folio(rreq, folio, fend, collected_to);

What about the write_collect.c side, is it safe as is?

/Christian
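
To make the effect of the one-line change concrete, the end check can be modelled in userspace C with the numbers from the log (i_size=0x5fb2, folio ix=00005 starting at 0x5000). This is a simplified sketch, not the kernel code: SKETCH_PAGE_SIZE, sketch_folio_size(), fend_old(), fend_new() and may_unlock() are made-up names, 4 KiB pages are assumed, and may_unlock() is only a stand-in for the collector's "have we collected up to fend" test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000ULL	/* assumption: 4 KiB pages, as in the log */

/* Size of a folio of the given order (PAGE_SIZE << order in the patch). */
uint64_t sketch_folio_size(unsigned int order)
{
	assert(order < 32);	/* keep the shift well inside 64 bits */
	return SKETCH_PAGE_SIZE << order;
}

/* Before the patch: end of folio, clamped to the file size. */
uint64_t fend_old(uint64_t fpos, unsigned int order, uint64_t i_size)
{
	uint64_t end = fpos + sketch_folio_size(order);

	return end < i_size ? end : i_size;	/* umin(fpos + fsize, i_size) */
}

/* After the patch: always the end of the folio. */
uint64_t fend_new(uint64_t fpos, unsigned int order)
{
	return fpos + sketch_folio_size(order);
}

/* Stand-in for the unlock decision: unlock once results reach fend. */
bool may_unlock(uint64_t collected_to, uint64_t fend)
{
	return collected_to >= fend;
}
```

With collected_to=0x5fb2, may_unlock() is true for fend_old(0x5000, 0, 0x5fb2), which is 0x5fb2, but false for fend_new(0x5000, 0), which is 0x6000. In this model the folio therefore stays locked until the ZERO subreq has covered the last 0x4e bytes.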