public inbox for linux-btrfs@vger.kernel.org
* [PATCH 00/18] btrfs: make send scale and perform better with shared extents
@ 2022-11-01 16:15 fdmanana
From: fdmanana @ 2022-11-01 16:15 UTC
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There are two problems with send regarding cloned extents:

1) Sometimes it ends up not cloning whole extents, but only a section of
   the extents, resulting in less extent sharing at the receiver and extra
   IO on the send side (reading data, issuing write commands) and on the
   receiver side too (writing more data). Besides being suboptimal, this
   surprises users and often gets reported (such as in the thread
   referenced in patch 09/18);

2) When we find that a data extent is directly shared more than 64 times,
   we don't attempt to clone it, because that requires backref walking to
   determine which inode and range we should clone from, and for extents
   with many backreferences that can be too slow, especially if we have
   many thousands of extents, each with a huge amount of sharing.
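
To make the second problem concrete, here is a minimal userspace sketch of
the kind of gate described above. It is not the code in fs/btrfs/send.c: the
struct, the function and the macro name are all hypothetical, and only the
value 64 comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; 64 is the limit quoted in the cover letter. */
#define SKETCH_MAX_EXTENT_REFS 64

/* Hypothetical, simplified view of a data extent as send might see it. */
struct sketch_extent {
	uint64_t bytenr; /* logical start address of the extent */
	uint64_t refs;   /* number of direct references to the extent */
};

/*
 * Return true when it is worth walking backrefs to look for a clone
 * source, false when the data should simply be sent as write commands.
 */
static bool sketch_should_try_clone(const struct sketch_extent *e,
				    uint64_t max_refs)
{
	assert(e != NULL);
	/* "Shared more than max_refs times" means cloning is not attempted. */
	return e->refs <= max_refs;
}
```

With this shape, raising the limit only changes the max_refs argument. The
cost that matters is the backref walk that follows a true result.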

This patchset solves the first problem completely (patch 09/18). The
second problem is not fully eliminated, but it's significantly improved.
In a test scenario with 50 000 files where each file is reflinked 50 times,
there's a performance improvement of ~70% to ~75% for both full and
incremental send operations. This test and results are in the changelog
of patch 17/18.

After this we can now bump the limit from 64 max references to 1024, which
is still a conservative value, but the goal is to get rid of that limit in
the future (some more work is required for that, but we're getting there).

There's also a nice and simple performance optimization when processing
extents that are not shared and we are using only one clone source (the
send root itself, very common), with gains varying between ~9% and ~18%
in some small scale tests where there are no shared extents or the majority
of the extents are not shared. That's patch 08/18.
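
As an illustration only, the condition in the paragraph above could look
like the sketch below. The function name and parameters are hypothetical,
and the exact check in patch 08/18 may differ. The reasoning it assumes is
that an extent with a single reference, seen by a send that uses a single
clone source, has no other place to clone from, so the backref lookup can
be skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether the backref lookup can be skipped.
 *
 * extent_refs:       number of references to the data extent
 * num_clone_sources: number of clone sources used by this send operation
 */
static bool sketch_can_skip_backref_lookup(uint64_t extent_refs,
					   unsigned int num_clone_sources)
{
	/* Assumption: the send root itself always counts as a clone source. */
	assert(num_clone_sources >= 1);
	return extent_refs == 1 && num_clone_sources == 1;
}
```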

The rest is just refactoring and cleanups in preparation for the optimization
work for send, and a few bug fixes for error paths in the backref walking
code and qgroup self tests. In particular the error paths for backref walking
are important because with the latest patches they are triggered not just in
case an error happens but also when the backref walking callbacks tell the
backref walking code to stop early.
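
The sketch below shows, in plain userspace C, why an early stop exercises
the same exit path as an error. It is not the btrfs backref walking code:
every name in it is hypothetical, and malloc()/free() stand in for whatever
the real code allocates. The point is that a non-zero return from the
callback, whether an error or a request to stop, leaves the loop before the
end, so the cleanup must be reachable from that exit too.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical value a callback returns to request an early stop. */
#define SKETCH_ITER_STOP 1

typedef int (*sketch_ref_cb)(unsigned long inum, void *ctx);

/*
 * Visit 'count' hypothetical references, calling 'cb' for each one.
 * Returns 0 if all were visited, SKETCH_ITER_STOP if the callback asked
 * to stop, or a negative errno value on error.
 */
static int sketch_iterate_refs(const unsigned long *inums, size_t count,
			       sketch_ref_cb cb, void *ctx)
{
	unsigned long *scratch;
	int ret = 0;

	assert(cb != NULL);
	/* Temporary state that must be freed on every exit path. */
	scratch = malloc(sizeof(*scratch) * (count ? count : 1));
	if (!scratch)
		return -ENOMEM;

	for (size_t i = 0; i < count; i++) {
		scratch[i] = inums[i];
		ret = cb(scratch[i], ctx);
		if (ret != 0)
			goto out; /* error or early stop: same cleanup */
	}
out:
	free(scratch);
	return ret;
}

/* Example callback state: stop at the first matching inode number. */
struct sketch_find_ctx {
	unsigned long target;
	unsigned long visited;
};

static int sketch_find_cb(unsigned long inum, void *ctx)
{
	struct sketch_find_ctx *f = ctx;

	f->visited++;
	return inum == f->target ? SKETCH_ITER_STOP : 0;
}
```

A caller looking for inode 258 in the list { 257, 258, 259, 260 } gets
SKETCH_ITER_STOP back after two callback invocations, and the scratch
buffer is still released.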

More details in the changelogs of the patches.

I've also left this in a git tree at:

  https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=send_clone_performance_scalability

Filipe Manana (18):
  btrfs: fix inode list leak during backref walking at resolve_indirect_refs()
  btrfs: fix inode list leak during backref walking at find_parent_nodes()
  btrfs: fix ulist leaks in error paths of qgroup self tests
  btrfs: remove pointless and double ulist frees in error paths of qgroup tests
  btrfs: send: avoid unnecessary path allocations when finding extent clone
  btrfs: send: update comment at find_extent_clone()
  btrfs: send: drop unnecessary backref context field initializations
  btrfs: send: avoid unnecessary backref lookups when finding clone source
  btrfs: send: optimize clone detection to increase extent sharing
  btrfs: use a single argument for extent offset in backref walking functions
  btrfs: use a structure to pass arguments to backref walking functions
  btrfs: reuse roots ulist on each leaf iteration for iterate_extent_inodes()
  btrfs: constify ulist parameter of ulist_next()
  btrfs: send: cache leaf to roots mapping during backref walking
  btrfs: send: skip unnecessary backref iterations
  btrfs: send: avoid double extent tree search when finding clone source
  btrfs: send: skip resolution of our own backref when finding clone source
  btrfs: send: bump the extent reference count limit for backref walking

 fs/btrfs/backref.c            | 596 ++++++++++++++++++++--------------
 fs/btrfs/backref.h            | 137 +++++++-
 fs/btrfs/qgroup.c             |  38 ++-
 fs/btrfs/relocation.c         |  19 +-
 fs/btrfs/scrub.c              |  18 +-
 fs/btrfs/send.c               | 467 +++++++++++++++++++-------
 fs/btrfs/tests/qgroup-tests.c |  86 +++--
 fs/btrfs/ulist.c              |   2 +-
 fs/btrfs/ulist.h              |   2 +-
 9 files changed, 928 insertions(+), 437 deletions(-)

-- 
2.35.1



Thread overview: 20+ messages
2022-11-01 16:15 [PATCH 00/18] btrfs: make send scale and perform better with shared extents fdmanana
2022-11-01 16:15 ` [PATCH 01/18] btrfs: fix inode list leak during backref walking at resolve_indirect_refs() fdmanana
2022-11-01 16:15 ` [PATCH 02/18] btrfs: fix inode list leak during backref walking at find_parent_nodes() fdmanana
2022-11-01 16:15 ` [PATCH 03/18] btrfs: fix ulist leaks in error paths of qgroup self tests fdmanana
2022-11-01 16:15 ` [PATCH 04/18] btrfs: remove pointless and double ulist frees in error paths of qgroup tests fdmanana
2022-11-01 16:15 ` [PATCH 05/18] btrfs: send: avoid unnecessary path allocations when finding extent clone fdmanana
2022-11-01 16:15 ` [PATCH 06/18] btrfs: send: update comment at find_extent_clone() fdmanana
2022-11-01 16:15 ` [PATCH 07/18] btrfs: send: drop unnecessary backref context field initializations fdmanana
2022-11-01 16:15 ` [PATCH 08/18] btrfs: send: avoid unnecessary backref lookups when finding clone source fdmanana
2022-11-01 16:15 ` [PATCH 09/18] btrfs: send: optimize clone detection to increase extent sharing fdmanana
2022-11-01 16:15 ` [PATCH 10/18] btrfs: use a single argument for extent offset in backref walking functions fdmanana
2022-11-01 16:15 ` [PATCH 11/18] btrfs: use a structure to pass arguments to " fdmanana
2022-11-01 16:15 ` [PATCH 12/18] btrfs: reuse roots ulist on each leaf iteration for iterate_extent_inodes() fdmanana
2022-11-01 16:15 ` [PATCH 13/18] btrfs: constify ulist parameter of ulist_next() fdmanana
2022-11-01 16:15 ` [PATCH 14/18] btrfs: send: cache leaf to roots mapping during backref walking fdmanana
2022-11-01 16:15 ` [PATCH 15/18] btrfs: send: skip unnecessary backref iterations fdmanana
2022-11-01 16:15 ` [PATCH 16/18] btrfs: send: avoid double extent tree search when finding clone source fdmanana
2022-11-01 16:15 ` [PATCH 17/18] btrfs: send: skip resolution of our own backref " fdmanana
2022-11-01 16:15 ` [PATCH 18/18] btrfs: send: bump the extent reference count limit for backref walking fdmanana
2022-11-02 16:01 ` [PATCH 00/18] btrfs: make send scale and perform better with shared extents David Sterba
