All of lore.kernel.org
 help / color / mirror / Atom feed
From: liuwf@tutanota.com
To: Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: [RFC PATCH 1/1] btrfs:Try to reduce the operating frequency of free space cache trees when allocating unclustered
Date: Fri, 3 Feb 2023 01:58:49 +0100 (CET)	[thread overview]
Message-ID: <NNJm6i---3-9@tutanota.com> (raw)

As far as I understand, each time when we allocate a free space in unclustered
way (on a heavy-fragmented disk or maybe other cases), a struct btrfs_free_space
entry will be removed from two free space cache trees, then the entry's offset
and size are altered, after done we insert the entry into the two trees again.
These operatings are densy computing, this patch try to reduce tree operating
frequency based on two logic (fix me):

1 There is only one case that need to remove and reinsert an entry from/into
the tree when allocating is made from the offset-indexed tree:

An entry is overlapping with a bitmap AND that entry's start is ahead of
the bitmap's start. When the entry shrinked it's start position walks
towards higher address and may overcome the bitmap's start, in this case
the entry need removed and reinserted from/into the tree to get a right
order.
(Exhanging the two nodes may be better, but it seems more troublesome?)

A bitmap may be overlapped with many entries, but they are not be able to
change their start position to overcome the bitmap EXCEPT the one above
mentioned

All other conditions are not supposed to change the relative position of two
neighbour entries.


2 As for allocating from the bytes-indexed tree.
Because we always begin to pick the entry that has the biggest size
(rb_first_cached()) to tear free space from it, it has the most potentiality
for the biggest to keep it's top weight, imaging a Mibs-sized entry cutted away
several KBs, that may or may not changd it's size below the second entry, the
later case may be the one with more probability.

This patch try to check an entry's offset and size when changed, and compare
the result with the next entry's, In theory,  1/3 ~ 1/2 of operatings probably
could be saved.

This patch has been passed through xfstests. Any comments are appreciated.

Signed-off-by: Liu Weifeng 4019c26b5c0d5f6b <Liuwf@tutanota.com>
---

fs/btrfs/free-space-cache.c | 51 ++++++++++++++++++++++++++++++++++---
1 file changed, 47 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index f4023651dd68..1713f36a01cd 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -39,6 +39,8 @@ static int link_free_space(struct btrfs_free_space_ctl *ctl,
   struct btrfs_free_space *info);
static void unlink_free_space(struct btrfs_free_space_ctl *ctl,
      struct btrfs_free_space *info, bool update_stat);
+static void relink_free_space_entry(struct btrfs_free_space_ctl *ctl,
+     struct btrfs_free_space *space_info);
static int search_bitmap(struct btrfs_free_space_ctl *ctl,
struct btrfs_free_space *bitmap_info, u64 *offset,
u64 *bytes, bool for_alloc);
@@ -1840,6 +1842,47 @@ static int link_free_space(struct btrfs_free_space_ctl *ctl,
return ret;
}

+static void relink_free_space_entry(struct btrfs_free_space_ctl *ctl,
+     struct btrfs_free_space *info)
+{
+ struct rb_node *next_node = NULL;
+ struct btrfs_free_space *next_info_offset = NULL;
+ struct btrfs_free_space *next_info_bytes = NULL;
+
+ next_node = rb_next(&info->offset_index);
+ if (next_node)
+ next_info_offset = rb_entry(next_node, struct btrfs_free_space,
+     offset_index);
+ next_node = rb_next(&info->bytes_index);
+ if (next_node)
+ next_info_bytes = rb_entry(next_node, struct btrfs_free_space,
+    bytes_index);
+ ASSERT(info->bytes || info->bitmap);
+
+ if (next_info_offset && next_info_bytes &&
+     unlikely(info->offset > next_info_offset->offset) &&
+     (info->bytes < next_info_bytes->bytes)) {
+ rb_erase(&info->offset_index, &ctl->free_space_offset);
+ tree_insert_offset(&ctl->free_space_offset, info->offset,
+    &info->offset_index, (info->bitmap != NULL));
+
+ rb_erase_cached(&info->bytes_index, &ctl->free_space_bytes);
+ rb_add_cached(&info->bytes_index, &ctl->free_space_bytes,
+       entry_less);
+
+ } else if (next_info_offset &&
+    unlikely(info->offset > next_info_offset->offset)) {
+ rb_erase(&info->offset_index, &ctl->free_space_offset);
+ tree_insert_offset(&ctl->free_space_offset, info->offset,
+    &info->offset_index, (info->bitmap != NULL));
+
+ } else if (next_info_bytes && info->bytes < next_info_bytes->bytes) {
+ rb_erase_cached(&info->bytes_index, &ctl->free_space_bytes);
+ rb_add_cached(&info->bytes_index, &ctl->free_space_bytes,
+       entry_less);
+ }
+}
+
static void relink_bitmap_entry(struct btrfs_free_space_ctl *ctl,
struct btrfs_free_space *info)
{
@@ -3093,7 +3136,6 @@ u64 btrfs_find_space_for_alloc(struct btrfs_block_group *block_group,
if (!entry->bytes)
free_bitmap(ctl, entry);
} else {
- unlink_free_space(ctl, entry, true);
align_gap_len = offset - entry->offset;
align_gap = entry->offset;
align_gap_trim_state = entry->trim_state;
@@ -3105,10 +3147,11 @@ u64 btrfs_find_space_for_alloc(struct btrfs_block_group *block_group,
WARN_ON(entry->bytes < bytes + align_gap_len);

entry->bytes -= bytes + align_gap_len;
- if (!entry->bytes)
+ if (!entry->bytes) {
+ unlink_free_space(ctl, entry, true);
kmem_cache_free(btrfs_free_space_cachep, entry);
- else
- link_free_space(ctl, entry);
+ } else
+ relink_free_space_entry(ctl, entry);
}
out:
btrfs_discard_update_discardable(block_group);
--
2.30.2

             reply	other threads:[~2023-02-03  0:58 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-03  0:58 liuwf [this message]
2023-02-03 11:57 ` [RFC PATCH 1/1] btrfs:Try to reduce the operating frequency of free space cache trees when allocating unclustered Johannes Thumshirn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=NNJm6i---3-9@tutanota.com \
    --to=liuwf@tutanota.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.