[PATCH AUTOSEL 6.14 099/642] btrfs: prevent inline data extents read from touching blocks beyond its range

From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Qu Wenruo <wqu@suse.com>, Filipe Manana <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>,
	clm@fb.com, josef@toxicpanda.com, linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.14 099/642] btrfs: prevent inline data extents read from touching blocks beyond its range
Date: Mon,  5 May 2025 18:05:15 -0400
Message-ID: <20250505221419.2672473-99-sashal@kernel.org>
In-Reply-To: <20250505221419.2672473-1-sashal@kernel.org>

From: Qu Wenruo <wqu@suse.com>

[ Upstream commit 1a5b5668d711d3d1ef447446beab920826decec3 ]

Currently reading an inline data extent will zero out the remaining
range in the page.

This is not yet causing problems even for block size < page size
(subpage) cases because:

1) An inline data extent always starts at file offset 0
   So when reading a page, we always read the inline extent first, before
   any other block in the page. Later blocks are then read properly and
   refill the zeroed out ranges.

2) Currently btrfs will read out the whole page if a buffered write is
   not page aligned
   So at buffered write time a page is either fully uptodate (the write
   covers the whole page), or we read out the whole page first.
   Either way, such an inline extent read has nothing to overwrite.

But it's still not ideal:

- We're zeroing out the page twice
  Once by read_inline_extent()/uncompress_inline(), and once by
  btrfs_do_readpage() for ranges beyond i_size.

- We're touching blocks that don't belong to the inline extent
  With the upcoming patches, a folio can be partially uptodate: some
  blocks can be dirty while the page as a whole is not fully uptodate.

  For example, with a 16K page size and a 4K block size:

  0         4K        8K        12K        16K
  |         |         |/////////|          |

  Here range [8K, 12K) has been dirtied by a buffered write, and the
  remaining blocks are not uptodate.

  If range [0, 4K) contains an inline data extent, and we try to read
  the whole page, the current behavior will overwrite range [8K, 12K)
  with zeros and cause data loss.

So to make the behavior more consistent and in preparation for future
changes, limit inline data extent reads to zeroing only the range
inside the first block, not the whole page.

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/inode.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3dcf9a428b2b4..c11d0a8e5b06d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6768,6 +6768,7 @@ static noinline int uncompress_inline(struct btrfs_path *path,
 {
 	int ret;
 	struct extent_buffer *leaf = path->nodes[0];
+	const u32 blocksize = leaf->fs_info->sectorsize;
 	char *tmp;
 	size_t max_size;
 	unsigned long inline_size;
@@ -6784,7 +6785,7 @@ static noinline int uncompress_inline(struct btrfs_path *path,
 
 	read_extent_buffer(leaf, tmp, ptr, inline_size);
 
-	max_size = min_t(unsigned long, PAGE_SIZE, max_size);
+	max_size = min_t(unsigned long, blocksize, max_size);
 	ret = btrfs_decompress(compress_type, tmp, folio, 0, inline_size,
 			       max_size);
 
@@ -6796,14 +6797,15 @@ static noinline int uncompress_inline(struct btrfs_path *path,
 	 * cover that region here.
 	 */
 
-	if (max_size < PAGE_SIZE)
-		folio_zero_range(folio, max_size, PAGE_SIZE - max_size);
+	if (max_size < blocksize)
+		folio_zero_range(folio, max_size, blocksize - max_size);
 	kfree(tmp);
 	return ret;
 }
 
 static int read_inline_extent(struct btrfs_path *path, struct folio *folio)
 {
+	const u32 blocksize = path->nodes[0]->fs_info->sectorsize;
 	struct btrfs_file_extent_item *fi;
 	void *kaddr;
 	size_t copy_size;
@@ -6818,14 +6820,14 @@ static int read_inline_extent(struct btrfs_path *path, struct folio *folio)
 	if (btrfs_file_extent_compression(path->nodes[0], fi) != BTRFS_COMPRESS_NONE)
 		return uncompress_inline(path, folio, fi);
 
-	copy_size = min_t(u64, PAGE_SIZE,
+	copy_size = min_t(u64, blocksize,
 			  btrfs_file_extent_ram_bytes(path->nodes[0], fi));
 	kaddr = kmap_local_folio(folio, 0);
 	read_extent_buffer(path->nodes[0], kaddr,
 			   btrfs_file_extent_inline_start(fi), copy_size);
 	kunmap_local(kaddr);
-	if (copy_size < PAGE_SIZE)
-		folio_zero_range(folio, copy_size, PAGE_SIZE - copy_size);
+	if (copy_size < blocksize)
+		folio_zero_range(folio, copy_size, blocksize - copy_size);
 	return 0;
 }
 
-- 
2.39.5

