public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] btrfs: reduce the memory usage for btrfs_io_context, and reduce its variable sized members
@ 2023-02-07  4:26 Qu Wenruo
  2023-02-07  4:26 ` [PATCH v2 1/4] btrfs: simplify the @bioc argument for handle_ops_on_dev_replace() Qu Wenruo
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Qu Wenruo @ 2023-02-07  4:26 UTC (permalink / raw)
  To: linux-btrfs

Changelog:
v2:
- Address the comments on various grammar errors

- Further reduce the memory used for replace
  Now instead of two s16, it's just one s16 now.

- Replace raid_map with a single u64
  This not only reduce the memory usage, but also makes btrfs_io_context
  to only have one variable sized member (stripes), simplify the
  allocation function

  This also removes the need to bubble sort the stripes.


In btrfs_io_context, we have two members dedicated to replace, and one
extra array for raid56

- num_tgtdevs
  This is straight-forward, just the number of extra stripes for replace
  usage.

- tgtdev_map[]
  This is a little complex, it represents the mapping between the
  original stripes and dev-replace stripes.

  This is mostly for RAID56, as only in RAID56 the stripes contain
  different contents, thus it's important to know the mapping.

  It goes like this:

    num_stripes = 4 (3 + 1 for replace)
    stripes[0]:		dev = devid 1, physical = X
    stripes[1]:		dev = devid 2, physical = Y
    stripes[2]:		dev = devid 3, physical = Z
    stripes[3]:		dev = devid 0, physical = Y

    num_tgtdevs = 1
    tgtdev_map[0] = 0	<- Means stripes[0] is not involved in replace.
    tgtdev_map[1] = 3	<- Means stripes[1] is involved in replace,
			   and it's duplicated to stripes[3].
    tgtdev_map[2] = 0	<- Means stripes[2] is not involved in replace.

  Thus most space is wasted, and the more devices in the array, the more
  space wasted.

- raid_map[]
  A sorted array where the first one is always the logical bytenr of
  the full stripe

  But the truth is, since it's always sorted, we don't need it at all.
  We can use a single u64 to indicate the full stripe start.

  Currently we're reusing the array mostly to re-order our stripes for
  RAID56, which is not ideal, because we can get it down right just in
  one go.

All these tgdev_map[] and raid_map[] designs are  wasting quite some
space, and making alloc_btrfs_io_context() to do very complex and
dangerous pointer juggling.

This patch will replace those members by:

- num_tgtdevs -> replace_nr_stripes
  Just a rename

- tgtdev_map[] -> replace_stripe_src
  It's changed to a single s16 to indicate where the source stripe is.
  This single s16 is enough for RAID56. For DUP, they don't care the
  source as all stripes share the same content.

- raid_map[] -> full_stripe_logical
  We remove the sort_parity_stripes(), and get the stripes selection
  done correctly in RAID56 routines.

  So we only need to record the logical bytenr of the full stripe start.
  Existing call sites checking the type of stripe can compare with
  their data stripes number to do the same work.

This not only saved some space for btrfs_io_context structure, but also
allows the following cleanups:

- Streamline handle_ops_on_dev_replace()
  We go a common path for both WRITE and GET_READ_MIRRORS, and only
  for DUP and GET_READ_MIRRORS, we shrink the bioc to keep the same
  old behavior.

- Remove some unnecessary variables

- Remove variable sized members
  Now there is only one variable sized member, stripes.

Although the series still increases the number of lines, the net
increase mostly comes from comments, in fact around 100 lines of comments
are added around the replace related members.

Qu Wenruo (4):
  btrfs: simplify the @bioc argument for handle_ops_on_dev_replace()
  btrfs: small improvement for btrfs_io_context structure
  btrfs: use a more space efficient way to represent the source of
    duplicated stripes
  btrfs: replace btrfs_io_context::raid_map[] with a fixed u64 value

 fs/btrfs/raid56.c            |  71 ++++++----
 fs/btrfs/scrub.c             |  30 ++--
 fs/btrfs/volumes.c           | 267 ++++++++++++++++-------------------
 fs/btrfs/volumes.h           |  73 ++++++++--
 include/trace/events/btrfs.h |   2 +-
 5 files changed, 246 insertions(+), 197 deletions(-)

-- 
2.39.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-02-23 21:01 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-07  4:26 [PATCH v2 0/4] btrfs: reduce the memory usage for btrfs_io_context, and reduce its variable sized members Qu Wenruo
2023-02-07  4:26 ` [PATCH v2 1/4] btrfs: simplify the @bioc argument for handle_ops_on_dev_replace() Qu Wenruo
2023-02-07  4:26 ` [PATCH v2 2/4] btrfs: small improvement for btrfs_io_context structure Qu Wenruo
2023-02-15 20:02   ` David Sterba
2023-02-17  5:03     ` Qu Wenruo
2023-02-23 20:55       ` David Sterba
2023-02-07  4:26 ` [PATCH v2 3/4] btrfs: use a more space efficient way to represent the source of duplicated stripes Qu Wenruo
2023-02-07  4:26 ` [PATCH v2 4/4] btrfs: replace btrfs_io_context::raid_map[] with a fixed u64 value Qu Wenruo
2023-02-15 20:07   ` David Sterba
2023-02-16  0:23     ` Qu Wenruo
2023-02-20  8:53   ` Geert Uytterhoeven
2023-02-20 11:50     ` Qu Wenruo
2023-02-20 12:14       ` Geert Uytterhoeven
2023-02-21  0:09         ` Qu Wenruo
2023-02-15 20:07 ` [PATCH v2 0/4] btrfs: reduce the memory usage for btrfs_io_context, and reduce its variable sized members David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox