Date: Mon, 12 May 2025 16:00:05 +0200
From: Greg KH
To: Qu Wenruo
Cc: linux-btrfs@vger.kernel.org, stable@vger.kernel.org, Christoph Hellwig, Filipe Manana, David Sterba
Subject: Re: [PATCH 6.12.y] btrfs: always fallback to buffered write if the inode requires checksum
Message-ID: <2025051245-engine-overthrow-8820@gregkh>
References: <968f19c5b1b7d5595423b0ac0020cc18dfed8cb5.1746665263.git.wqu@suse.com>
In-Reply-To: <968f19c5b1b7d5595423b0ac0020cc18dfed8cb5.1746665263.git.wqu@suse.com>
X-Mailing-List: stable@vger.kernel.org

On Thu, May 08, 2025 at 10:31:25AM +0930, Qu Wenruo wrote:
> commit 968f19c5b1b7d5595423b0ac0020cc18dfed8cb5 upstream.
>
> [BUG]
> It is a long known bug that VM image on btrfs can lead to data csum
> mismatch, if the qemu is using direct-io for the image (this is commonly
> known as cache mode 'none').
>
> [CAUSE]
> Inside the VM, if the fs is EXT4 or XFS, or even NTFS from Windows, the
> fs is allowed to dirty/modify the folio even if the folio is under
> writeback (as long as the address space doesn't have AS_STABLE_WRITES
> flag inherited from the block device).
>
> This is a valid optimization to improve the concurrency, and since these
> filesystems have no extra checksum on data, the content change is not a
> problem at all.
>
> But the final write into the image file is handled by btrfs, which needs
> the content not to be modified during writeback, or the checksum will
> not match the data (checksum is calculated before submitting the bio).
>
> So EXT4/XFS/NTFS assume they can modify the folio under writeback, but
> btrfs requires no modification, this leads to the false csum mismatch.
>
> This is only a controlled example, there are even cases where
> multi-thread programs can submit a direct IO write, then another thread
> modifies the direct IO buffer for whatever reason.
>
> For such cases, btrfs has no sane way to detect such cases and leads to
> false data csum mismatch.
>
> [FIX]
> I have considered the following ideas to solve the problem:
>
> - Make direct IO to always skip data checksum
>   This not only requires a new incompatible flag, as it breaks the
>   current per-inode NODATASUM flag.
>   But also requires extra handling for no csum found cases.
>
>   And this also reduces our checksum protection.
>
> - Let hardware handle all the checksum
>   AKA, just nodatasum mount option.
>   That requires trust for hardware (which is not that trustful in a lot
>   of cases), and it's not generic at all.
>
> - Always fallback to buffered write if the inode requires checksum
>   This was suggested by Christoph, and is the solution utilized by this
>   patch.
>
>   The cost is obvious, the extra buffer copying into page cache, thus it
>   reduces the performance.
>   But at least it's still user configurable, if the end user still wants
>   the zero-copy performance, just set NODATASUM flag for the inode
>   (which is a common practice for VM images on btrfs).
>
>   Since we cannot trust user space programs to keep the buffer
>   consistent during direct IO, we have no choice but always falling back
>   to buffered IO.  At least by this, we avoid the more deadly false data
>   checksum mismatch error.
>
> CC: stable@vger.kernel.org # 6.12+
> Suggested-by: Christoph Hellwig
> Reviewed-by: Filipe Manana
> Signed-off-by: Qu Wenruo
> Reviewed-by: David Sterba
> Signed-off-by: David Sterba
> ---
>  fs/btrfs/direct-io.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)

What about 6.14.y?  It is needed there too, right?  We can't take a
patch only for an older branch and not a newer one.

thanks,

greg k-h
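
For readers following the thread, the failure mode in the quoted [CAUSE]
section and the decision in the [FIX] section can be sketched in plain
userspace C.  This is only an illustration under stated assumptions, not
the kernel patch: every name below (sketch_csum, SKETCH_INODE_NODATASUM,
sketch_must_fallback_to_buffered, sketch_write_with_racing_modify) is
hypothetical, and the toy checksum merely stands in for the real data
checksum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-inode "no data checksum" flag. */
#define SKETCH_INODE_NODATASUM (1u << 0)

/* Toy polynomial checksum; NOT the checksum btrfs actually uses. */
static uint32_t sketch_csum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < len; i++)
		sum = sum * 31u + buf[i];
	return sum;
}

/*
 * The rule described in [FIX]: if the inode requires a data checksum,
 * a direct IO write must fall back to buffered IO.  Only inodes flagged
 * as NODATASUM keep the zero-copy direct IO path.
 */
static bool sketch_must_fallback_to_buffered(uint32_t inode_flags)
{
	return !(inode_flags & SKETCH_INODE_NODATASUM);
}

/*
 * Simulates the race from [CAUSE]: the checksum is computed when the
 * write is submitted, then the buffer owner may change the buffer
 * before the bytes are persisted.  Returns true if the stored checksum
 * still matches the bytes that ended up "on disk".
 */
static bool sketch_write_with_racing_modify(uint8_t *buf, size_t len,
					    bool modify_during_writeback)
{
	/* Checksum taken before "submitting the bio". */
	uint32_t stored = sketch_csum(buf, len);

	/* Guest fs or another thread rewrites the buffer mid-writeback. */
	if (modify_during_writeback && len > 0)
		buf[0] ^= 0xff;

	/* A later read verifies the persisted bytes against 'stored'. */
	return sketch_csum(buf, len) == stored;
}
```

Calling sketch_write_with_racing_modify() with the last argument set to
true reports a mismatch even though nothing is wrong with the storage,
which is the "false" csum error the commit message describes.  The
buffered path avoids it because the filesystem checksums its own page
cache copy rather than a buffer user space can still modify.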