From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 0/4] btrfs: reduce the memory usage for btrfs_io_context, and reduce its variable sized members
Date: Tue, 7 Feb 2023 12:26:11 +0800 [thread overview]
Message-ID: <cover.1675743217.git.wqu@suse.com> (raw)
Changelog:
v2:
- Address the comments on various grammar errors
- Further reduce the memory used for replace
Now instead of two s16, it's just one s16 now.
- Replace raid_map with a single u64
This not only reduce the memory usage, but also makes btrfs_io_context
to only have one variable sized member (stripes), simplify the
allocation function
This also removes the need to bubble sort the stripes.
In btrfs_io_context, we have two members dedicated to replace, and one
extra array for raid56
- num_tgtdevs
This is straight-forward, just the number of extra stripes for replace
usage.
- tgtdev_map[]
This is a little complex, it represents the mapping between the
original stripes and dev-replace stripes.
This is mostly for RAID56, as only in RAID56 the stripes contain
different contents, thus it's important to know the mapping.
It goes like this:
num_stripes = 4 (3 + 1 for replace)
stripes[0]: dev = devid 1, physical = X
stripes[1]: dev = devid 2, physical = Y
stripes[2]: dev = devid 3, physical = Z
stripes[3]: dev = devid 0, physical = Y
num_tgtdevs = 1
tgtdev_map[0] = 0 <- Means stripes[0] is not involved in replace.
tgtdev_map[1] = 3 <- Means stripes[1] is involved in replace,
and it's duplicated to stripes[3].
tgtdev_map[2] = 0 <- Means stripes[2] is not involved in replace.
Thus most space is wasted, and the more devices in the array, the more
space wasted.
- raid_map[]
A sorted array where the first one is always the logical bytenr of
the full stripe
But the truth is, since it's always sorted, we don't need it at all.
We can use a single u64 to indicate the full stripe start.
Currently we're reusing the array mostly to re-order our stripes for
RAID56, which is not ideal, because we can get it down right just in
one go.
All these tgdev_map[] and raid_map[] designs are wasting quite some
space, and making alloc_btrfs_io_context() to do very complex and
dangerous pointer juggling.
This patch will replace those members by:
- num_tgtdevs -> replace_nr_stripes
Just a rename
- tgtdev_map[] -> replace_stripe_src
It's changed to a single s16 to indicate where the source stripe is.
This single s16 is enough for RAID56. For DUP, they don't care the
source as all stripes share the same content.
- raid_map[] -> full_stripe_logical
We remove the sort_parity_stripes(), and get the stripes selection
done correctly in RAID56 routines.
So we only need to record the logical bytenr of the full stripe start.
Existing call sites checking the type of stripe can compare with
their data stripes number to do the same work.
This not only saved some space for btrfs_io_context structure, but also
allows the following cleanups:
- Streamline handle_ops_on_dev_replace()
We go a common path for both WRITE and GET_READ_MIRRORS, and only
for DUP and GET_READ_MIRRORS, we shrink the bioc to keep the same
old behavior.
- Remove some unnecessary variables
- Remove variable sized members
Now there is only one variable sized member, stripes.
Although the series still increases the number of lines, the net
increase mostly comes from comments, in fact around 100 lines of comments
are added around the replace related members.
Qu Wenruo (4):
btrfs: simplify the @bioc argument for handle_ops_on_dev_replace()
btrfs: small improvement for btrfs_io_context structure
btrfs: use a more space efficient way to represent the source of
duplicated stripes
btrfs: replace btrfs_io_context::raid_map[] with a fixed u64 value
fs/btrfs/raid56.c | 71 ++++++----
fs/btrfs/scrub.c | 30 ++--
fs/btrfs/volumes.c | 267 ++++++++++++++++-------------------
fs/btrfs/volumes.h | 73 ++++++++--
include/trace/events/btrfs.h | 2 +-
5 files changed, 246 insertions(+), 197 deletions(-)
--
2.39.1
next reply other threads:[~2023-02-07 4:26 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-07 4:26 Qu Wenruo [this message]
2023-02-07 4:26 ` [PATCH v2 1/4] btrfs: simplify the @bioc argument for handle_ops_on_dev_replace() Qu Wenruo
2023-02-07 4:26 ` [PATCH v2 2/4] btrfs: small improvement for btrfs_io_context structure Qu Wenruo
2023-02-15 20:02 ` David Sterba
2023-02-17 5:03 ` Qu Wenruo
2023-02-23 20:55 ` David Sterba
2023-02-07 4:26 ` [PATCH v2 3/4] btrfs: use a more space efficient way to represent the source of duplicated stripes Qu Wenruo
2023-02-07 4:26 ` [PATCH v2 4/4] btrfs: replace btrfs_io_context::raid_map[] with a fixed u64 value Qu Wenruo
2023-02-15 20:07 ` David Sterba
2023-02-16 0:23 ` Qu Wenruo
2023-02-20 8:53 ` Geert Uytterhoeven
2023-02-20 11:50 ` Qu Wenruo
2023-02-20 12:14 ` Geert Uytterhoeven
2023-02-21 0:09 ` Qu Wenruo
2023-02-15 20:07 ` [PATCH v2 0/4] btrfs: reduce the memory usage for btrfs_io_context, and reduce its variable sized members David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1675743217.git.wqu@suse.com \
--to=wqu@suse.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox