From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Howells
To: Christian Brauner
Cc: David Howells, Paulo Alcantara, netfs@lists.linux.dev,
	linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org,
	ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Paulo Alcantara, Matthew Wilcox
Subject: [PATCH v2 4/7] netfs: Fix streaming write being overwritten
Date: Tue, 14 Apr 2026 09:20:00 +0100
Message-ID: <20260414082004.3756080-5-dhowells@redhat.com>
In-Reply-To: <20260414082004.3756080-1-dhowells@redhat.com>
References: <20260414082004.3756080-1-dhowells@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

In order to avoid reading whilst writing, netfslib allows "streaming
writes" in which dirty data is stored directly into folios without
reading them first.  Such folios are marked dirty, but may not be marked
uptodate.  If a folio is entirely filled by a streaming write, uptodate
will be set; otherwise it will have a netfs_folio struct attached to
->private recording the dirty region.

In the event that a partially-written streaming-write folio is to be
overwritten entirely by a single write(), netfs_perform_write() will try
to copy over it, but doesn't discard the netfs_folio struct if the copy
succeeds; further, it doesn't correctly handle a partial copy that
overwrites some of the dirty data.

Fix this by the following:

 (1) If the folio is successfully overwritten in its entirety, free the
     netfs_folio struct before marking the folio uptodate.

 (2) If the copy to the folio partially fails, stopping short of the
     dirty data, just discard the copy and retry.

 (3) If the copy partially fails, but overwrites some of the dirty data,
     accept the copy and update the netfs_folio struct to record the new
     dirty region.  If the folio is now completely filled, free the
     netfs_folio struct and set uptodate; otherwise return a partial
     write.

Found with:

	fsx -q -N 1000000 -p 10000 -o 128000 -l 600000 \
		/xfstest.test/junk --replay-ops=junk.fsxops

using the following as junk.fsxops:

	truncate 0x0 0 0x927c0
	write 0x63fb8 0x53c8 0
	copy_range 0xb704 0x19b9 0x24429 0x79380
	write 0x2402b 0x144a2 0x90660 *
	write 0x204d5 0x140a0 0x927c0 *
	copy_range 0x1f72c 0x137d0 0x7a906 0x927c0 *
	read 0x00000 0x20000 0x9157c
	read 0x20000 0x20000 0x9157c
	read 0x40000 0x20000 0x9157c
	read 0x60000 0x20000 0x9157c
	read 0x7e1a0 0xcfb9 0x9157c

on cifs with the default cache option and no fscache.  It shows folio
0x24 misbehaving if the FMODE_READ check is commented out in
netfs_perform_write():

	if (//(file->f_mode & FMODE_READ) ||
	    netfs_is_cache_enabled(ctx)) {
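For illustration only (not kernel code): the dirty-region bookkeeping in
case (3) can be modelled as a standalone sketch.  The struct and
function names below are hypothetical stand-ins mirroring the
netfs_folio fields; the merge of the newly-copied bytes [0, copied)
with the existing dirty region [dirty_offset, dirty_offset + dirty_len)
is the same arithmetic as in the patch, assuming copied < flen:

```c
#include <assert.h>
#include <stddef.h>

/* Model of a streaming-write folio's dirty region: the bytes
 * [dirty_offset, dirty_offset + dirty_len) within a folio of
 * length flen are dirty but the rest of the folio is not uptodate.
 */
struct dirty_region {
	size_t dirty_offset;
	size_t dirty_len;
};

/* Handle a partial copy of 'copied' bytes written from offset 0 of the
 * folio.  Returns -1 if the copy stopped short of the dirty data (the
 * copy should be discarded and retried), 1 if the folio is now
 * completely filled (the region record can be freed and the folio
 * marked uptodate), or 0 if a partial write must be accepted.
 */
static int apply_partial_overwrite(struct dirty_region *r, size_t copied,
				   size_t flen)
{
	if (copied <= r->dirty_offset)
		return -1;	/* new data didn't reach the dirty data */

	/* [0, copied) overlaps [dirty_offset, dirty_offset + dirty_len),
	 * so the two merge into one region starting at offset 0.
	 */
	r->dirty_len += r->dirty_offset;
	r->dirty_offset = 0;
	if (copied > r->dirty_len)
		r->dirty_len = copied;
	return r->dirty_len == flen;
}
```

E.g. with a dirty region [0x100, 0x200) in a 0x1000-byte folio, a short
copy of 0x80 bytes is discarded, while a copy of 0x180 bytes yields a
merged dirty region [0, 0x200) and a partial write.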
This was initially found with the generic/522 xfstest.

Signed-off-by: David Howells
cc: Paulo Alcantara
cc: Matthew Wilcox
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/buffered_write.c    | 48 ++++++++++++++++++++++++------------
 include/trace/events/netfs.h |  1 +
 2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 22a4d61631c9..830f1603478e 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -246,18 +246,34 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
 		/* See if we can write a whole folio in one go. */
 		if (!maybe_trouble && offset == 0 && part >= flen) {
 			copied = copy_folio_from_iter_atomic(folio, offset, part, iter);
-			if (unlikely(copied == 0))
+			if (likely(copied == part)) {
+				if (finfo)
+					goto folio_now_filled;
+				__netfs_set_group(folio, netfs_group);
+				folio_mark_uptodate(folio);
+				trace_netfs_folio(folio, netfs_whole_folio_modify);
+				goto copied;
+			}
+			if (copied == 0)
 				goto copy_failed;
-			if (unlikely(copied < part)) {
+			if (!finfo || copied <= finfo->dirty_offset) {
 				maybe_trouble = true;
 				iov_iter_revert(iter, copied);
 				copied = 0;
 				folio_unlock(folio);
 				goto retry;
 			}
-			__netfs_set_group(folio, netfs_group);
-			folio_mark_uptodate(folio);
-			trace_netfs_folio(folio, netfs_whole_folio_modify);
+
+			/* We overwrote some existing dirty data, so we have to
+			 * accept the partial write.
+			 */
+			finfo->dirty_len += finfo->dirty_offset;
+			if (finfo->dirty_len == flen)
+				goto folio_now_filled;
+			if (copied > finfo->dirty_len)
+				finfo->dirty_len = copied;
+			finfo->dirty_offset = 0;
+			trace_netfs_folio(folio, netfs_whole_folio_modify_efault);
 			goto copied;
 		}
 
@@ -326,17 +342,9 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
 			if (unlikely(copied == 0))
 				goto copy_failed;
 			finfo->dirty_len += copied;
-			if (finfo->dirty_offset == 0 && finfo->dirty_len == flen) {
-				if (finfo->netfs_group)
-					folio_change_private(folio, finfo->netfs_group);
-				else
-					folio_detach_private(folio);
-				folio_mark_uptodate(folio);
-				kfree(finfo);
-				trace_netfs_folio(folio, netfs_streaming_cont_filled_page);
-			} else {
-				trace_netfs_folio(folio, netfs_streaming_write_cont);
-			}
+			if (finfo->dirty_offset == 0 && finfo->dirty_len == flen)
+				goto folio_now_filled;
+			trace_netfs_folio(folio, netfs_streaming_write_cont);
 			goto copied;
 		}
 
@@ -350,6 +358,14 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
 				goto out;
 			continue;
 
+	folio_now_filled:
+		if (finfo->netfs_group)
+			folio_change_private(folio, finfo->netfs_group);
+		else
+			folio_detach_private(folio);
+		folio_mark_uptodate(folio);
+		kfree(finfo);
+		trace_netfs_folio(folio, netfs_streaming_cont_filled_page);
 	copied:
 		flush_dcache_folio(folio);
 
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index 88d814ba1e69..1dc1688edc19 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -177,6 +177,7 @@
 	EM(netfs_folio_is_uptodate,		"mod-uptodate")	\
 	EM(netfs_just_prefetch,			"mod-prefetch")	\
 	EM(netfs_whole_folio_modify,		"mod-whole-f")	\
+	EM(netfs_whole_folio_modify_efault,	"mod-whole-f!")	\
 	EM(netfs_modify_and_clear,		"mod-n-clear")	\
 	EM(netfs_streaming_write,		"mod-streamw")	\
 	EM(netfs_streaming_write_cont,		"mod-streamw+")	\