From: David Sterba <dsterba@suse.com>
To: torvalds@linux-foundation.org
Cc: David Sterba <dsterba@suse.com>,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [GIT PULL] Btrfs updates for 6.1
Date: Tue, 4 Oct 2022 08:31:21 +0200 [thread overview]
Message-ID: <cover.1664798047.git.dsterba@suse.com> (raw)
Hi,
please pull the following updates for btrfs. There's a bunch of
performance improvements, most notably the FIEMAP speedup, the new block
group tree to speed up mount on large filesystems, more io_uring
integration, some sysfs exports and the usual fixes and core updates.
Thanks.
---
Performance:
- outstanding FIEMAP speed improvement
- algorithmic change how extents are enumerated leads to orders of
magnitude speed boost (uncached and cached)
- extent sharing check speedup (2.2x uncached, 3x cached)
- add more cancellation points, allowing to interrupt seeking in files
with large number of extents
- more efficient hole and data seeking (4x uncached, 1.3x cached)
- sample results:
256M, 32K extents: 4s -> 29ms (~150x)
512M, 64K extents: 30s -> 59ms (~550x)
1G, 128K extents: 225s -> 120ms (~1800x)
- improved inode logging, especially for directories (on dbench workload
throughput +25%, max latency -21%)
- improved buffered IO, remove redundant extent state tracking, lowering
memory consumption and avoiding rb tree traversal
- add sysfs tunable to let qgroup temporarily skip exact accounting when
deleting snapshot, leading to a speedup but requiring a rescan after
that, will be used by snapper
- support io_uring and buffered writes, until now it was just for direct
IO, with the no-wait semantics implemented in the buffered write path
it now works and leads to speed improvement in IOPS (2x), throughput
(2.2x), latency (depends, 2x to 150x)
- small performance improvements when dropping and searching for extent
maps as well as when flushing delalloc in COW mode (throughput +5MB/s)
User visible changes:
- new incompatible feature block-group-tree adding a dedicated tree for
tracking block groups, this allows a much faster load during mount and
avoids seeking unlike when it's scattered in the extent tree items
- this reduces mount time for many-terabyte sized filesystems
- conversion tool will be provided so existing filesystem can also be
updated in place
- to reduce test matrix and feature combinations requires no-holes
and free-space-tree (mkfs defaults since 5.15)
- improved reporting of super block corruption detected by scrub
- scrub also tries to repair super block and does not wait until next
commit
- discard stats and tunables are exported in sysfs
(/sys/fs/btrfs/FSID/discard)
- qgroup status is exported in sysfs (/sys/sys/fs/btrfs/FSID/qgroups/)
- verify that super block was not modified when thawing filesystem
Fixes:
- FIEMAP fixes
- fix extent sharing status, does not depend on the cached status where
merged
- flush delalloc so compressed extents are reported correctly
- fix alignment of VMA for memory mapped files on THP
- send: fix failures when processing inodes with no links (orphan files
and directories)
- fix race between quota enable and quota rescan ioctl
- handle more corner cases for read-only compat feature verification
- fix missed extent on fsync after dropping extent maps
Core:
- lockdep annotations to validate various transactions states and state
transitions
- preliminary support for fs-verity in send
- more effective memory use in scrub for subpage where sector is smaller
than page
- block group caching progress logic has been removed, load is now
synchronous
- simplify end IO callbacks and bio handling, use chained bios instead
of own tracking
- add no-wait semantics to several functions (tree search, nocow,
flushing, buffered write
- cleanups and refactoring
MM changes:
- export balance_dirty_pages_ratelimited_flags
----------------------------------------------------------------
The following changes since commit f76349cf41451c5c42a99f18a9163377e4b364ff:
Linux 6.0-rc7 (2022-09-25 14:01:02 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-6.1-tag
for you to fetch changes up to cbddcc4fa3443fe8cfb2ff8e210deb1f6a0eea38:
btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer (2022-09-29 17:08:31 +0200)
----------------------------------------------------------------
Alexander Zhu (1):
btrfs: fix alignment of VMA for memory mapped files on THP
BingJing Chang (2):
btrfs: send: refactor arguments of get_inode_info()
btrfs: send: fix failures when processing inodes with no links
Boris Burkov (1):
btrfs: send: add support for fs-verity
Christoph Hellwig (13):
btrfs: don't create integrity bioset for btrfs_bioset
btrfs: move btrfs_bio allocation to volumes.c
btrfs: pass the operation to btrfs_bio_alloc
btrfs: don't take a bio_counter reference for cloned bios
btrfs: use chained bios when cloning
btrfs: properly abstract the parity raid bio handling
btrfs: give struct btrfs_bio a real end_io handler
btrfs: factor out low-level bio setup from submit_stripe_bio
btrfs: decide bio cloning inside submit_stripe_bio
btrfs: add fast path for single device io in __btrfs_map_block
btrfs: stop allocation a btrfs_io_context for simple I/O
btrfs: zoned: refactor device checks in btrfs_check_zoned_mode
btrfs: stop tracking failed reads in the I/O tree
Christophe JAILLET (1):
btrfs: qgroup: fix a typo in a comment
David Sterba (3):
btrfs: sysfs: use sysfs_streq for string matching
btrfs: sysfs: show discard stats and tunables in non-debug build
btrfs: add KCSAN annotations for unlocked access to block_rsv->full
Ethan Lien (1):
btrfs: remove unnecessary EXTENT_UPTODATE state in buffered I/O path
Filipe Manana (42):
btrfs: don't drop dir index range items when logging a directory
btrfs: remove the root argument from log_new_dir_dentries()
btrfs: update stale comment for log_new_dir_dentries()
btrfs: free list element sooner at log_new_dir_dentries()
btrfs: avoid memory allocation at log_new_dir_dentries() for common case
btrfs: remove root argument from btrfs_delayed_item_reserve_metadata()
btrfs: store index number instead of key in struct btrfs_delayed_item
btrfs: remove unused logic when looking up delayed items
btrfs: shrink the size of struct btrfs_delayed_item
btrfs: search for last logged dir index if it's not cached in the inode
btrfs: move need_log_inode() to above log_conflicting_inodes()
btrfs: move log_new_dir_dentries() above btrfs_log_inode()
btrfs: log conflicting inodes without holding log mutex of the initial inode
btrfs: skip logging parent dir when conflicting inode is not a dir
btrfs: use delayed items when logging a directory
btrfs: simplify adding and replacing references during log replay
btrfs: simplify error handling at btrfs_del_root_ref()
btrfs: fix race between quota enable and quota rescan ioctl
btrfs: allow hole and data seeking to be interruptible
btrfs: make hole and data seeking a lot more efficient
btrfs: remove check for impossible block start for an extent map at fiemap
btrfs: remove zero length check when entering fiemap
btrfs: properly flush delalloc when entering fiemap
btrfs: allow fiemap to be interruptible
btrfs: rename btrfs_check_shared() to a more descriptive name
btrfs: speedup checking for extent sharedness during fiemap
btrfs: skip unnecessary extent buffer sharedness checks during fiemap
btrfs: make fiemap more efficient and accurate reporting extent sharedness
btrfs: remove useless used space increment during space reservation
btrfs: fix missed extent on fsync after dropping extent maps
btrfs: move btrfs_drop_extent_cache() to extent_map.c
btrfs: use extent_map_end() at btrfs_drop_extent_map_range()
btrfs: use cond_resched_rwlock_write() during inode eviction
btrfs: move open coded extent map tree deletion out of inode eviction
btrfs: add helper to replace extent map range with a new extent map
btrfs: remove the refcount warning/check at free_extent_map()
btrfs: remove unnecessary extent map initializations
btrfs: assert tree is locked when clearing extent map from logging
btrfs: remove unnecessary NULL pointer checks when searching extent maps
btrfs: remove unnecessary next extent map search
btrfs: avoid pointless extent map tree search when flushing delalloc
btrfs: drop extent map range more efficiently
Gaosheng Cui (1):
btrfs: remove btrfs_bit_radix_cachep declaration
Ioannis Angelakopoulos (7):
btrfs: add macros for annotating wait events with lockdep
btrfs: add lockdep annotations for num_writers wait event
btrfs: add lockdep annotations for num_extwriters wait event
btrfs: add lockdep annotations for transaction states wait events
btrfs: add lockdep annotations for pending_ordered wait event
btrfs: change the lockdep class of free space inode's invalidate_lock
btrfs: add lockdep annotations for the ordered extents wait event
Jeff Layton (1):
btrfs: remove stale prototype of btrfs_write_inode
Josef Bacik (65):
btrfs: use btrfs_fs_closing for background bg work
btrfs: simplify arguments of btrfs_update_space_info and rename
btrfs: handle space_info setting of bg in btrfs_add_bg_to_space_info
btrfs: convert block group bit field to use bit helpers
btrfs: remove lock protection for BLOCK_GROUP_FLAG_TO_COPY
btrfs: simplify block group traversal in btrfs_put_block_group_cache
btrfs: remove BLOCK_GROUP_FLAG_HAS_CACHING_CTL
btrfs: remove lock protection for BLOCK_GROUP_FLAG_RELOCATING_REPAIR
btrfs: delete btrfs_wait_space_cache_v1_finished
btrfs: call __btrfs_remove_free_space_cache_locked on cache load failure
btrfs: remove use btrfs_remove_free_space_cache instead of variant
btrfs: rename clean_io_failure and remove extraneous args
btrfs: unexport internal failrec functions
btrfs: convert the io_failure_tree to a plain rb_tree
btrfs: use find_first_extent_bit in btrfs_clean_io_failure
btrfs: separate out the extent state and extent buffer init code
btrfs: separate out the eb and extent state leak helpers
btrfs: temporarily export alloc_extent_state helpers
btrfs: move extent state init and alloc functions to their own file
btrfs: convert BUG_ON(EXTENT_BIT_LOCKED) checks to ASSERT's
btrfs: move simple extent bit helpers out of extent_io.c
btrfs: export wait_extent_bit
btrfs: move btrfs_debug_check_extent_io_range into extent-io-tree.c
btrfs: temporarily export and move core extent_io_tree tree functions
btrfs: temporarily export and then move extent state helpers
btrfs: move a few exported extent_io_tree helpers to extent-io-tree.c
btrfs: move core extent_io_tree functions to extent-io-tree.c
btrfs: unexport btrfs_debug_check_extent_io_range
btrfs: unexport all the temporary exports for extent-io-tree.c
btrfs: remove struct tree_entry in extent-io-tree.c
btrfs: use next_state instead of rb_next where we can
btrfs: make tree_search return struct extent_state
btrfs: make tree_search_for_insert return extent_state
btrfs: make tree_search_prev_next return extent_state's
btrfs: use next_state/prev_state in merge_state
btrfs: move extent io tree unrelated prototypes to their appropriate header
btrfs: drop exclusive_bits from set_extent_bit
btrfs: remove the wake argument from clear_extent_bits
btrfs: remove failed_start argument from set_extent_bit
btrfs: drop extent_changeset from set_extent_bit
btrfs: unify the lock/unlock extent variants
btrfs: remove extent_io_tree::track_uptodate
btrfs: get rid of extent_io_tree::dirty_bytes
btrfs: don't clear CTL bits when trying to release extent state
btrfs: replace delete argument with EXTENT_CLEAR_ALL_BITS
btrfs: don't init io tree with private data for non-inodes
btrfs: remove is_data_inode() checks in extent-io-tree.c
btrfs: move btrfs_caching_type to block-group.h
btrfs: move btrfs_full_stripe_locks_tree into block-group.h
btrfs: move btrfs_init_async_reclaim_work prototype to space-info.h
btrfs: move btrfs_pinned_by_swapfile prototype into volumes.h
btrfs: move btrfs_swapfile_pin into volumes.h
btrfs: move fs_info forward declarations to the top of ctree.h
btrfs: move btrfs_csum_ptr to inode.c
btrfs: move the fs_info related helpers closer to fs_info in ctree.h
btrfs: move btrfs_ordered_sum_size into file-item.c
btrfs: open code and remove btrfs_inode_sectorsize helper
btrfs: open code and remove btrfs_insert_inode_hash helper
btrfs: use a runtime flag to indicate an inode is a free space inode
btrfs: add struct declarations in dev-replace.h
btrfs: implement a nowait option for tree searches
btrfs: make can_nocow_extent nowait compatible
btrfs: add the ability to use NO_FLUSH for data reservations
btrfs: add btrfs_try_lock_ordered_range
btrfs: make btrfs_check_nocow_lock nowait compatible
Maciej S. Szmigiero (1):
btrfs: don't print information about space cache or tree every remount
Omar Sandoval (2):
btrfs: rename btrfs_insert_file_extent() to btrfs_insert_hole_extent()
btrfs: get rid of block group caching progress logic
Qu Wenruo (26):
btrfs: dump extra info if one free space cache has more bitmaps than it should
btrfs: scrub: properly report super block errors in system log
btrfs: scrub: try to fix super block errors
btrfs: scrub: remove impossible sanity checks
btrfs: scrub: use pointer array to replace sblocks_for_recheck
btrfs: scrub: factor out initialization of scrub_block into helper
btrfs: scrub: factor out allocation and initialization of scrub_sector into helper
btrfs: scrub: introduce scrub_block::pages for more efficient memory usage for subpage
btrfs: scrub: remove scrub_sector::page and use scrub_block::pages instead
btrfs: scrub: move logical/physical/dev/mirror_num from scrub_sector to scrub_block
btrfs: scrub: use larger block size for data extent scrub
btrfs: check superblock to ensure the fs was not modified at thaw time
btrfs: output human readable space info flag
btrfs: dump all space infos if we abort transaction due to ENOSPC
btrfs: enhance unsupported compat RO flags handling
btrfs: don't save block group root into super block
btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2
btrfs: sysfs: introduce global qgroup attribute group
btrfs: introduce BTRFS_QGROUP_STATUS_FLAGS_MASK for later expansion
btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN
btrfs: introduce BTRFS_QGROUP_RUNTIME_FLAG_NO_ACCOUNTING to skip qgroup accounting
btrfs: skip subtree scan if it's too high to avoid low stall in btrfs_commit_transaction()
btrfs: update the comment for submit_extent_page()
btrfs: switch page and disk_bytenr argument position for submit_extent_page()
btrfs: move end_io_func argument to btrfs_bio_ctrl structure
btrfs: relax block-group-tree feature dependency checks
Stefan Roesch (7):
mm: export balance_dirty_pages_ratelimited_flags()
btrfs: make prepare_pages nowait compatible
btrfs: make lock_and_cleanup_extent_if_need nowait compatible
btrfs: plumb NOWAIT through the write path
btrfs: make btrfs_buffered_write nowait compatible
btrfs: assert nowait mode is not used for some btree search functions
btrfs: enable nowait async buffered writes
Tetsuo Handa (1):
btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer
Uros Bizjak (1):
btrfs: use atomic_try_cmpxchg in free_extent_buffer
zhang songyi (1):
btrfs: remove the unnecessary result variables
fs/btrfs/Makefile | 2 +-
fs/btrfs/backref.c | 155 +-
fs/btrfs/backref.h | 20 +-
fs/btrfs/block-group.c | 182 +--
fs/btrfs/block-group.h | 39 +-
fs/btrfs/block-rsv.c | 3 +-
fs/btrfs/block-rsv.h | 9 +
fs/btrfs/btrfs_inode.h | 25 +-
fs/btrfs/compression.c | 54 +-
fs/btrfs/ctree.c | 43 +-
fs/btrfs/ctree.h | 370 ++---
fs/btrfs/delalloc-space.c | 13 +-
fs/btrfs/delalloc-space.h | 3 +-
fs/btrfs/delayed-inode.c | 292 ++--
fs/btrfs/delayed-inode.h | 34 +-
fs/btrfs/dev-replace.c | 16 +-
fs/btrfs/dev-replace.h | 4 +
fs/btrfs/disk-io.c | 303 ++--
fs/btrfs/disk-io.h | 7 +-
fs/btrfs/extent-io-tree.c | 1673 +++++++++++++++++++++
fs/btrfs/extent-io-tree.h | 126 +-
fs/btrfs/extent-tree.c | 33 +-
fs/btrfs/extent_io.c | 2923 +++++++++----------------------------
fs/btrfs/extent_io.h | 17 +-
fs/btrfs/extent_map.c | 347 ++++-
fs/btrfs/extent_map.h | 8 +
fs/btrfs/file-item.c | 38 +-
fs/btrfs/file.c | 805 ++++++----
fs/btrfs/free-space-cache.c | 115 +-
fs/btrfs/free-space-cache.h | 1 -
fs/btrfs/free-space-tree.c | 8 -
fs/btrfs/inode.c | 516 +++----
fs/btrfs/ioctl.c | 24 +-
fs/btrfs/locking.c | 25 +
fs/btrfs/locking.h | 1 +
fs/btrfs/misc.h | 35 +
fs/btrfs/ordered-data.c | 50 +-
fs/btrfs/ordered-data.h | 13 +-
fs/btrfs/props.c | 5 +-
fs/btrfs/qgroup.c | 96 +-
fs/btrfs/qgroup.h | 3 +
fs/btrfs/raid56.c | 45 +-
fs/btrfs/raid56.h | 4 +-
fs/btrfs/reflink.c | 10 +-
fs/btrfs/relocation.c | 40 +-
fs/btrfs/root-tree.c | 16 +-
fs/btrfs/scrub.c | 668 +++++----
fs/btrfs/send.c | 461 +++---
fs/btrfs/send.h | 15 +-
fs/btrfs/space-info.c | 96 +-
fs/btrfs/space-info.h | 9 +-
fs/btrfs/super.c | 112 +-
fs/btrfs/sysfs.c | 172 ++-
fs/btrfs/tests/btrfs-tests.c | 2 +-
fs/btrfs/tests/extent-io-tests.c | 7 +-
fs/btrfs/tests/free-space-tests.c | 22 +-
fs/btrfs/tests/inode-tests.c | 10 +-
fs/btrfs/transaction.c | 162 +-
fs/btrfs/tree-log.c | 1593 ++++++++++++--------
fs/btrfs/tree-log.h | 8 +
fs/btrfs/verity.c | 3 +-
fs/btrfs/volumes.c | 353 +++--
fs/btrfs/volumes.h | 50 +-
fs/btrfs/zoned.c | 142 +-
fs/verity/fsverity_private.h | 2 -
include/linux/fsverity.h | 3 +
include/trace/events/btrfs.h | 2 -
include/uapi/linux/btrfs.h | 6 +
include/uapi/linux/btrfs_tree.h | 4 +
mm/page-writeback.c | 1 +
70 files changed, 7212 insertions(+), 5242 deletions(-)
create mode 100644 fs/btrfs/extent-io-tree.c
next reply other threads:[~2022-10-04 6:37 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-04 6:31 David Sterba [this message]
2022-10-07 1:01 ` [GIT PULL] Btrfs updates for 6.1 pr-tracker-bot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1664798047.git.dsterba@suse.com \
--to=dsterba@suse.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.