From: Goffredo Baroncelli <kreijack@libero.it>
To: linux-btrfs@vger.kernel.org
Cc: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
Josef Bacik <josef@toxicpanda.com>,
Goffredo Baroncelli <kreijack@inwind.it>
Subject: [PATCH 5/5] btrfs: add allocator_hint mode
Date: Mon, 1 Feb 2021 22:28:20 +0100
Message-ID: <20210201212820.64381-6-kreijack@libero.it>
In-Reply-To: <20210201212820.64381-1-kreijack@libero.it>
From: Goffredo Baroncelli <kreijack@inwind.it>
When this mode is enabled, the chunk allocation policy is modified as follows.
Each disk may have a different tag:
- BTRFS_DEV_ALLOCATION_PREFERRED_METADATA
- BTRFS_DEV_ALLOCATION_METADATA_ONLY
- BTRFS_DEV_ALLOCATION_DATA_ONLY
- BTRFS_DEV_ALLOCATION_PREFERRED_DATA (default)
Where:
- ALLOCATION_PREFERRED_X means that this disk is preferred for the
X chunk type (the other type may be allowed when space is low)
- ALLOCATION_X_ONLY means that the disk is used *only* for the X chunk type.
This also means that it is a preferred choice.
Each time the allocator allocates a chunk of type X, it first takes the disks
tagged as ALLOCATION_X_ONLY or ALLOCATION_PREFERRED_X; if the space is not
enough, it also uses the disks tagged as ALLOCATION_METADATA_ONLY; if the space
is still not enough, it also uses the other disks, with the exception of the
ones marked as ALLOCATION_PREFERRED_Y, where Y is the other type of chunk
(i.e. not X).
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
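For reference, a minimal userspace sketch of how a device tag could turn into
a sort key for each chunk type, following the gather_device_info() hunk in
this patch. The enum names, their numeric values and device_hint() are
hypothetical stand-ins, not kernel identifiers; only the weights mirror
alloc_hint_map[].

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the BTRFS_DEV_ALLOCATION_* tags.
 * The numeric values are illustrative, not the on-disk ones.
 */
enum dev_tag {
	TAG_PREFERRED_DATA,
	TAG_PREFERRED_METADATA,
	TAG_METADATA_ONLY,
	TAG_DATA_ONLY,
	TAG_COUNT
};

enum chunk_type { CHUNK_DATA, CHUNK_METADATA };

/* Same weights as alloc_hint_map[] in the patch. */
static const int hint_weight[TAG_COUNT] = {
	[TAG_DATA_ONLY]          = -1,
	[TAG_PREFERRED_DATA]     =  0,
	[TAG_METADATA_ONLY]      =  1,
	[TAG_PREFERRED_METADATA] =  2,
};

/*
 * Return false if the device must be skipped for this chunk type.
 * Otherwise store the sort key in *hint (a higher key is tried first).
 */
static bool device_hint(enum chunk_type type, enum dev_tag tag, int *hint)
{
	if (type == CHUNK_DATA) {
		if (tag == TAG_METADATA_ONLY)
			return false;
		/* data chunk: negate so that data disks sort first */
		*hint = -hint_weight[tag];
	} else {
		if (tag == TAG_DATA_ONLY)
			return false;
		/* metadata chunk: metadata disks sort first */
		*hint = hint_weight[tag];
	}
	return true;
}
```

With these weights a data chunk tries DATA_ONLY (1), then PREFERRED_DATA (0),
then PREFERRED_METADATA (-2); a metadata chunk tries PREFERRED_METADATA (2),
then METADATA_ONLY (1), then PREFERRED_DATA (0).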
fs/btrfs/volumes.c | 81 +++++++++++++++++++++++++++++++++++++++++++++-
fs/btrfs/volumes.h | 1 +
2 files changed, 81 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 68b346c5465d..57ee3e2fdac0 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4806,13 +4806,18 @@ static int btrfs_add_system_chunk(struct btrfs_fs_info *fs_info,
}
/*
- * sort the devices in descending order by max_avail, total_avail
+ * sort the devices in descending order by alloc_hint,
+ * max_avail, total_avail
*/
static int btrfs_cmp_device_info(const void *a, const void *b)
{
const struct btrfs_device_info *di_a = a;
const struct btrfs_device_info *di_b = b;
+ if (di_a->alloc_hint > di_b->alloc_hint)
+ return -1;
+ if (di_a->alloc_hint < di_b->alloc_hint)
+ return 1;
if (di_a->max_avail > di_b->max_avail)
return -1;
if (di_a->max_avail < di_b->max_avail)
@@ -4939,6 +4944,15 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
int ndevs = 0;
u64 max_avail;
u64 dev_offset;
+ int hint;
+
+ static const int alloc_hint_map[BTRFS_DEV_ALLOCATION_MASK_COUNT] = {
+ [BTRFS_DEV_ALLOCATION_DATA_ONLY] = -1,
+ [BTRFS_DEV_ALLOCATION_PREFERRED_DATA] = 0,
+ [BTRFS_DEV_ALLOCATION_METADATA_ONLY] = 1,
+ [BTRFS_DEV_ALLOCATION_PREFERRED_METADATA] = 2
+ /* the other values are set to 0 */
+ };
/*
* in the first pass through the devices list, we gather information
@@ -4991,16 +5005,81 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
devices_info[ndevs].max_avail = max_avail;
devices_info[ndevs].total_avail = total_avail;
devices_info[ndevs].dev = device;
+
+ if (((ctl->type & BTRFS_BLOCK_GROUP_DATA) &&
+ (ctl->type & BTRFS_BLOCK_GROUP_METADATA)) ||
+ info->allocation_hint_mode ==
+ BTRFS_ALLOCATION_HINT_DISABLED) {
+ /*
+ * if this is a mixed bg or the allocator hint is
+ * disabled, set all the alloc_hint
+ * fields to the same value, so the sorting
+ * is not affected
+ */
+ devices_info[ndevs].alloc_hint = 0;
+ } else if (ctl->type & BTRFS_BLOCK_GROUP_DATA) {
+ hint = device->type & BTRFS_DEV_ALLOCATION_MASK;
+
+ /*
+ * skip BTRFS_DEV_ALLOCATION_METADATA_ONLY disks
+ */
+ if (hint == BTRFS_DEV_ALLOCATION_METADATA_ONLY)
+ continue;
+ /*
+ * if a data chunk must be allocated,
+ * sort also by hint (data disk
+ * higher priority)
+ */
+ devices_info[ndevs].alloc_hint = -alloc_hint_map[hint];
+ } else { /* BTRFS_BLOCK_GROUP_METADATA */
+ hint = device->type & BTRFS_DEV_ALLOCATION_MASK;
+
+ /*
+ * skip BTRFS_DEV_ALLOCATION_DATA_ONLY disks
+ */
+ if (hint == BTRFS_DEV_ALLOCATION_DATA_ONLY)
+ continue;
+ /*
+ * if a metadata chunk must be allocated,
+ * sort also by hint (metadata hint
+ * higher priority)
+ */
+ devices_info[ndevs].alloc_hint = alloc_hint_map[hint];
+ }
+
++ndevs;
}
ctl->ndevs = ndevs;
+ /*
+ * no devices available
+ */
+ if (!ndevs)
+ return 0;
+
/*
* now sort the devices by hole size / available space
*/
sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
btrfs_cmp_device_info, NULL);
+ /*
+ * select the minimum set of disks grouped by hint that
+ * can host the chunk
+ */
+ ndevs = 0;
+ while (ndevs < ctl->ndevs) {
+ hint = devices_info[ndevs++].alloc_hint;
+ while (ndevs < ctl->ndevs &&
+ devices_info[ndevs].alloc_hint == hint)
+ ndevs++;
+ if (ndevs >= ctl->devs_min)
+ break;
+ }
+
+ BUG_ON(ndevs > ctl->ndevs);
+ ctl->ndevs = ndevs;
+
return 0;
}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index d776b7f55d56..31a3e4cf93b5 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -364,6 +364,7 @@ struct btrfs_device_info {
u64 dev_offset;
u64 max_avail;
u64 total_avail;
+ int alloc_hint;
};
struct btrfs_raid_attr {
--
2.30.0
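
A minimal userspace sketch of the sort-then-group selection done at the end
of gather_device_info(), assuming a reduced struct and qsort() in place of
the kernel's sort(). struct dev_info, cmp_dev_info() and select_devices() are
hypothetical names used only for illustration. The inner loop tests the bound
before reading the array element, so it never looks past the last valid entry.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced version of struct btrfs_device_info. */
struct dev_info {
	int alloc_hint;
	unsigned long long max_avail;
};

/* Descending order by alloc_hint, then by max_avail. */
static int cmp_dev_info(const void *a, const void *b)
{
	const struct dev_info *da = a;
	const struct dev_info *db = b;

	if (da->alloc_hint != db->alloc_hint)
		return da->alloc_hint > db->alloc_hint ? -1 : 1;
	if (da->max_avail != db->max_avail)
		return da->max_avail > db->max_avail ? -1 : 1;
	return 0;
}

/*
 * Sort the devices, then return how many leading entries to keep:
 * whole hint groups are added until at least devs_min devices are
 * selected or the list is exhausted.
 */
static size_t select_devices(struct dev_info *devs, size_t ndevs,
			     size_t devs_min)
{
	size_t n = 0;

	qsort(devs, ndevs, sizeof(*devs), cmp_dev_info);

	while (n < ndevs) {
		int hint = devs[n++].alloc_hint;

		/* check the bound first, then compare the hint */
		while (n < ndevs && devs[n].alloc_hint == hint)
			n++;
		if (n >= devs_min)
			break;
	}
	return n;
}
```

A whole group is always taken at once, so the result can exceed devs_min when
the last group added holds more devices than are strictly needed.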