All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] btrfs: populate extent_map::generation when reading from disk
@ 2022-02-08  5:31 Qu Wenruo
  2022-02-08 16:39 ` Filipe Manana
  2022-02-08 18:42 ` David Sterba
  0 siblings, 2 replies; 3+ messages in thread
From: Qu Wenruo @ 2022-02-08  5:31 UTC (permalink / raw)
  To: linux-btrfs

[WEIRD BEHAVIOR]

When btrfs_get_extent() tries to get some file extent from disk, it
never populates extent_map::generation , leaving the value to be 0.

On the other hand, for extent map generated by IO, it will get its
generation properly set at finish_ordered_io()

 finish_ordered_io()
 |- unpin_extent_cache(gen = trans->transid)
    |- em->generation = gen;

[CAUSE]
Since extent_map::generation is mostly used by fsync code, and for fsync
they only care about modified extents, which all have their em::generation > 0.

Thus it's fine to not populate em read from disk for fsync.

[CORNER CASE]
However autodefrag also relies on em::geneartion to determine if one extent
needs to be defragged.

This unpopulated extent_map::geneartion can prevent the following autodefrag
case from working:

	mkfs.btrfs -f $dev
	mount $dev $mnt -o autodefrag

	# initial write to queue the inode for autodefrag
	xfs_io -f -c "pwrite 0 4k" $mnt/file
	sync

	# Real fragmented write
	xfs_io -f -s -c "pwrite -b 4096 0 32k" $mnt/file
	sync
	echo "=== before autodefrag ==="
	xfs_io -c "fiemap -v" $mnt/file

	# Drop cache to force em to be read from disk
	echo 3 > /proc/sys/vm/drop_caches
	mount -o remount,commit=1 $mnt
	sleep 3
	sync

	echo "=== After autodefrag ==="
	xfs_io -c "fiemap -v" $mnt/file
	umount $mnt

The result looks like this:

  === before autodefrag ===
  /mnt/btrfs/file:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..15]:         26672..26687        16   0x0
     1: [16..31]:        26656..26671        16   0x0
     2: [32..47]:        26640..26655        16   0x0
     3: [48..63]:        26624..26639        16   0x1
  === After autodefrag ===
  /mnt/btrfs/file:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..15]:         26672..26687        16   0x0
     1: [16..31]:        26656..26671        16   0x0
     2: [32..47]:        26640..26655        16   0x0
     3: [48..63]:        26624..26639        16   0x1

This fragmented 32K will not be defragged by autodefrag.

[FIX]
To make things less weird, just populate extent_map::generation when
reading file extents from disk.

This would make above fragmented extents to be properly defragged:

  == before autodefrag ===
  /mnt/btrfs/file:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..15]:         26672..26687        16   0x0
     1: [16..31]:        26656..26671        16   0x0
     2: [32..47]:        26640..26655        16   0x0
     3: [48..63]:        26624..26639        16   0x1
  === After autodefrag ===
  /mnt/btrfs/file:
   EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
     0: [0..63]:         26688..26751        64   0x1

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Update the commit message to include a reproducer
  Although this is not what we want (to reduce autodefrag IO),
  the behavior still worthy fixing anyway.
---
 fs/btrfs/file-item.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 90c5c38836ab..9a3de652ada8 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -1211,6 +1211,7 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode,
 	extent_start = key.offset;
 	extent_end = btrfs_file_extent_end(path);
 	em->ram_bytes = btrfs_file_extent_ram_bytes(leaf, fi);
+	em->generation = btrfs_file_extent_generation(leaf, fi);
 	if (type == BTRFS_FILE_EXTENT_REG ||
 	    type == BTRFS_FILE_EXTENT_PREALLOC) {
 		em->start = extent_start;
-- 
2.35.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-02-08 18:46 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-02-08  5:31 [PATCH v2] btrfs: populate extent_map::generation when reading from disk Qu Wenruo
2022-02-08 16:39 ` Filipe Manana
2022-02-08 18:42 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.