public inbox for linux-kernel@vger.kernel.org
From: David Howells <dhowells@redhat.com>
To: Christian Brauner <christian@brauner.io>
Cc: David Howells <dhowells@redhat.com>,
	Paulo Alcantara <pc@manguebit.org>,
	netfs@lists.linux.dev, linux-afs@lists.infradead.org,
	linux-cifs@vger.kernel.org, ceph-devel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Matthew Wilcox <willy@infradead.org>
Subject: [PATCH v5 13/24] netfs: Fix streaming write being overwritten
Date: Tue, 28 Apr 2026 14:17:43 +0100
Message-ID: <20260428131756.922303-14-dhowells@redhat.com>
In-Reply-To: <20260428131756.922303-1-dhowells@redhat.com>

In order to avoid reading whilst writing, netfslib will allow "streaming
writes" in which dirty data is stored directly into folios without reading
them first.  Such folios are marked dirty but may not be marked uptodate.
If a folio is entirely written by a streaming write, uptodate will be set,
otherwise it will have a netfs_folio struct attached to ->private recording
the dirty region.
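
As an illustration only, the folio states described above can be modelled in
user space roughly as follows.  This is a sketch, not the kernel code: the
model_* names are hypothetical, and only the dirty_offset and dirty_len
field names are borrowed from the netfs_folio usage in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for the per-folio dirty-region record. */
struct model_finfo {
	size_t dirty_offset;	/* Start of the dirty region within the folio */
	size_t dirty_len;	/* Length of the dirty region */
};

/* Hypothetical stand-in for a folio; ->finfo models folio->private. */
struct model_folio {
	size_t size;		/* Folio length (flen in the patch) */
	bool dirty;
	bool uptodate;
	struct model_finfo *finfo;
};

/* Model the first streaming write into a folio that was never read.
 * Returns 0 on success or -1 if the record could not be allocated.
 */
static int model_streaming_write(struct model_folio *folio,
				 size_t offset, size_t len)
{
	assert(offset + len <= folio->size);

	if (offset == 0 && len == folio->size) {
		/* The whole folio was written, so its content is fully known. */
		folio->dirty = true;
		folio->uptodate = true;
		return 0;
	}

	/* Only part was written: record which part is valid dirty data. */
	folio->finfo = malloc(sizeof(*folio->finfo));
	if (!folio->finfo)
		return -1;
	folio->finfo->dirty_offset = offset;
	folio->finfo->dirty_len = len;
	folio->dirty = true;
	return 0;
}
```

In this model, a folio is either uptodate with no record attached or dirty
with a record describing the one valid region; the bug described below is
about a later write leaving those two pieces of state inconsistent.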

In the event that a folio partially written by a streaming write is to be
overwritten entirely by a single write(), netfs_perform_write() will try to
copy over it, but doesn't discard the netfs_folio struct if it succeeds;
further, it doesn't correctly handle a partial copy that overwrites some of
the dirty data.

Fix this by doing the following:

 (1) If the folio is successfully overwritten, detach and free the
     netfs_folio struct and mark the folio uptodate.

 (2) If the copy to the folio partially fails, but stops short of the dirty
     data, just ignore the copy and retry.

 (3) If the copy partially fails and overwrites some of the dirty data,
     accept the copy and update the netfs_folio struct to record the new
     data.  If the folio is now filled, free the netfs_folio and set
     uptodate, otherwise return a partial write.
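
The case analysis above can be sketched as a small user-space decision
function.  This is a simplified model, not the code in the patch: the
model_* names are hypothetical, the attempted copy is assumed to cover the
whole folio (part == flen), and the dirty region is merged as the span from 0
to the larger of the copied length and the old end of the dirty data.

```c
#include <stdbool.h>
#include <stddef.h>

enum model_outcome {
	MODEL_FILLED,		/* Folio fully valid: drop record, set uptodate */
	MODEL_COPY_FAILED,	/* Nothing was copied */
	MODEL_RETRY,		/* Case (2): ignore the copy and retry */
	MODEL_PARTIAL,		/* Case (3): accept the copy, region updated */
};

/* Hypothetical stand-in for the dirty-region record on a folio. */
struct model_region {
	bool present;		/* Is a record attached to the folio? */
	size_t offset;		/* Start of existing dirty data */
	size_t len;		/* Length of existing dirty data */
};

/* Classify the result of trying to copy flen bytes to offset 0 of a folio
 * when only 'copied' bytes actually made it.
 */
static enum model_outcome model_whole_folio_copy(struct model_region *r,
						 size_t flen, size_t copied)
{
	size_t end;

	if (copied == flen) {
		/* Case (1): everything was overwritten; the record is stale. */
		r->present = false;
		return MODEL_FILLED;
	}
	if (copied == 0)
		return MODEL_COPY_FAILED;
	if (!r->present || copied <= r->offset)
		return MODEL_RETRY;	/* Old dirty data was not touched */

	/* Case (3): some old dirty data was overwritten, so keep the copy. */
	end = r->offset + r->len;
	if (copied > end)
		end = copied;
	if (end == flen) {
		r->present = false;
		return MODEL_FILLED;
	}
	r->offset = 0;
	r->len = end;
	return MODEL_PARTIAL;
}
```

For example, with existing dirty data at offset 1000 of length 500 in a
4096-byte folio, a copy that stops after 800 bytes is retried, whereas one
that stops after 1200 bytes is accepted and the recorded region becomes 0 to
1500.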

Found with:

	fsx -q -N 1000000 -p 10000 -o 128000 -l 600000 \
	  /xfstest.test/junk --replay-ops=junk.fsxops

using the following as junk.fsxops:

	truncate 0x0 0 0x927c0
	write 0x63fb8 0x53c8 0
	copy_range 0xb704 0x19b9 0x24429 0x79380
	write 0x2402b 0x144a2 0x90660 *
	write 0x204d5 0x140a0 0x927c0 *
	copy_range 0x1f72c 0x137d0 0x7a906 0x927c0 *
	read 0x00000 0x20000 0x9157c
	read 0x20000 0x20000 0x9157c
	read 0x40000 0x20000 0x9157c
	read 0x60000 0x20000 0x9157c
	read 0x7e1a0 0xcfb9 0x9157c

on cifs with the default cache option.

It shows folio 0x24 misbehaving if the FMODE_READ check is commented out in
netfs_perform_write():

		if (//(file->f_mode & FMODE_READ) ||
		    netfs_is_cache_enabled(ctx)) {

and fscache is not in use.  This was initially found with the generic/522
xfstest.

Fixes: 8f52de0077ba ("netfs: Reduce number of conditional branches in netfs_perform_write()")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/buffered_write.c    | 47 ++++++++++++++++++++++++++----------
 include/trace/events/netfs.h |  3 +++
 2 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 7ac128d0b4e5..25571a570ac9 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -246,18 +246,38 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
 		/* See if we can write a whole folio in one go. */
 		if (!maybe_trouble && offset == 0 && part >= flen) {
 			copied = copy_folio_from_iter_atomic(folio, offset, part, iter);
-			if (unlikely(copied == 0))
+			if (likely(copied == part)) {
+				if (finfo) {
+					trace = netfs_whole_folio_modify_filled;
+					goto folio_now_filled;
+				}
+				__netfs_set_group(folio, netfs_group);
+				folio_mark_uptodate(folio);
+				trace = netfs_whole_folio_modify;
+				goto copied;
+			}
+			if (copied == 0)
 				goto copy_failed;
-			if (unlikely(copied < part)) {
+			if (!finfo || copied <= finfo->dirty_offset) {
 				maybe_trouble = true;
 				iov_iter_revert(iter, copied);
 				copied = 0;
 				folio_unlock(folio);
 				goto retry;
 			}
-			__netfs_set_group(folio, netfs_group);
-			folio_mark_uptodate(folio);
-			trace = netfs_whole_folio_modify;
+
+			/* We overwrote some existing dirty data, so we have to
+			 * accept the partial write.
+			 */
+			finfo->dirty_len += finfo->dirty_offset;
+			if (finfo->dirty_len == flen) {
+				trace = netfs_whole_folio_modify_filled_efault;
+				goto folio_now_filled;
+			}
+			if (copied > finfo->dirty_len)
+				finfo->dirty_len = copied;
+			finfo->dirty_offset = 0;
+			trace = netfs_whole_folio_modify_efault;
 			goto copied;
 		}
 
@@ -327,16 +347,10 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
 				goto copy_failed;
 			finfo->dirty_len += copied;
 			if (finfo->dirty_offset == 0 && finfo->dirty_len == flen) {
-				if (finfo->netfs_group)
-					folio_change_private(folio, finfo->netfs_group);
-				else
-					folio_detach_private(folio);
-				folio_mark_uptodate(folio);
-				kfree(finfo);
 				trace = netfs_streaming_cont_filled_page;
-			} else {
-				trace = netfs_streaming_write_cont;
+				goto folio_now_filled;
 			}
+			trace = netfs_streaming_write_cont;
 			goto copied;
 		}
 
@@ -350,6 +364,13 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
 			goto out;
 		continue;
 
+	folio_now_filled:
+		if (finfo->netfs_group)
+			folio_change_private(folio, finfo->netfs_group);
+		else
+			folio_detach_private(folio);
+		folio_mark_uptodate(folio);
+		kfree(finfo);
 	copied:
 		trace_netfs_folio(folio, trace);
 		flush_dcache_folio(folio);
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index 0b702f74aefe..aa9940ba307b 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -177,6 +177,9 @@
 	EM(netfs_folio_is_uptodate,		"mod-uptodate")	\
 	EM(netfs_just_prefetch,			"mod-prefetch")	\
 	EM(netfs_whole_folio_modify,		"mod-whole-f")	\
+	EM(netfs_whole_folio_modify_efault,	"mod-whole-f!")	\
+	EM(netfs_whole_folio_modify_filled,	"mod-whole-f+")	\
+	EM(netfs_whole_folio_modify_filled_efault, "mod-whole-f+!") \
 	EM(netfs_modify_and_clear,		"mod-n-clear")	\
 	EM(netfs_streaming_write,		"mod-streamw")	\
 	EM(netfs_streaming_write_cont,		"mod-streamw+")	\

