From: Wang Yugui <wangyugui@e16-tech.com>
To: Anand Jain <anand.jain@oracle.com>
Cc: linux-btrfs@vger.kernel.org, louis@waffle.tech
Subject: Re: [PATCH 6/7] btrfs: introduce new read_policy device
Date: Tue, 27 Oct 2020 15:11:59 +0800
Message-ID: <20201027151158.C1D4.409509F4@e16-tech.com>
In-Reply-To: <eacd759d436260ccd586d52c9d2500e63b4aa614.1603751876.git.anand.jain@oracle.com>
[-- Attachment #1: Type: text/plain, Size: 1341 bytes --]
Hi, Anand Jain
Cc: Louis Jencka
> Read-policy type 'device' and device flag 'read-preferred':
>
> The read-policy type device picks the device(s) flagged as
> read-preferred for reading chunks of type raid1, raid10,
> raid1c3 and raid1c4.
>
> A system might contain SSD, nvme, iscsi or san lun, and which are all
> a non-rotational device, so it is not a good idea to set the read-preferred
> automatically. Instead device read-policy along with the read-preferred
> flag provides an ability to do it manually. This advance tuning is
> useful in more than one situation, for example,
> - In heterogeneous-disk volume, it provides an ability to manually choose
> the low latency disks for reading.
> - Useful for more accurate testing.
> - Avoid known problematic device from reading the chunk until it is
> replaced (by marking the other good devices as read-preferred).
Is it still OK to set this automatically for the most common case, a mix
of SSD and HDD?
I am trying a 'manual if auto fails' approach with a 'u8' variable rather
than a 'bool' variable.
There are 2 patches I am working on, but they are not completed yet.
One of them is based on 'btrfs: balance RAID1/RAID10 mirror
selection' from Louis Jencka <louis@waffle.tech>.
Please feel free to merge them into your patch set as a new patch.
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2020/10/27
[-- Attachment #2: 0001-btrfs-add-tier-score-to-device.patch --]
[-- Type: application/octet-stream, Size: 2530 bytes --]
From 8a8f6405073f835531664aafa333570fba28c31f Mon Sep 17 00:00:00 2001
From: wangyugui <wangyugui@e16-tech.com>
Date: Tue, 27 Oct 2020 08:14:46 +0800
Subject: [PATCH 1/3] btrfs: add tier score to device
We use a single score value to define the tier level of a device.
Different scores mean different tiers, and a bigger score means a faster
device. From fastest to slowest:
DAX device(dax=1)
SSD device(rotational=0)
HDD device(rotational=1)
TODO/FIXME: Different bus(DIMM/NVMe/SAS/SATA/VirtIO/...) support.
TODO/FIXME: Different media detail(SSD MLC/TLC/..; HDD PMR/SMR/...) support.
TODO/FIXME: User-assigned property to mark as the top tier score.
In most cases, only 1 or 2 tiers are in use at the same time, so we group
them into the top tier and the other tier(s).
---
fs/btrfs/volumes.c | 18 ++++++++++++++++++
fs/btrfs/volumes.h | 2 ++
2 files changed, 20 insertions(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1997a7d..d767c99 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -608,6 +608,22 @@ static int btrfs_free_stale_devices(const char *path,
return ret;
}
+/*
+ * Set the tier score of the device; higher is faster.
+ * FIXME: detect bus (DIMM/NVMe(40)/SCSI(30)/SATA(20)/..)
+ * FIXME: media detail (SSD SLC/MLC/..)
+ * FIXME: user-defined property to set the max score 127
+ */
+static void dev_get_tier_score(struct btrfs_device *device, struct request_queue *q)
+{
+	if (blk_queue_dax(q))
+		device->tier_score = 50;
+	else if (blk_queue_nonrot(q))
+		device->tier_score = 10;
+	else
+		device->tier_score = 0;
+}
+
/*
* This is only used on mount, and we are protected from competing things
* messing with our fs_devices by the uuid_mutex, thus we do not need the
@@ -660,6 +676,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
}
q = bdev_get_queue(bdev);
+	dev_get_tier_score(device, q);
if (!blk_queue_nonrot(q))
fs_devices->rotating = true;
@@ -2598,6 +2615,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
atomic64_add(device->total_bytes, &fs_info->free_chunk_space);
+	dev_get_tier_score(device, q);
if (!blk_queue_nonrot(q))
fs_devices->rotating = true;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 302c923..5ffa429 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -138,6 +138,8 @@ struct btrfs_device {
struct completion kobj_unregister;
/* For sysfs/FSID/devinfo/devid/ */
struct kobject devid_kobj;
+
+ u8 tier_score; /* storage tier_score; higher is faster */
};
/*
--
2.29.1
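For discussion, the scoring rule in this patch can be modelled in userspace roughly as below. This is only a sketch, not the kernel code: struct queue_props and the user_top_tier argument are hypothetical stand-ins (the patch reads blk_queue_dax()/blk_queue_nonrot() directly, and the user-defined override is still a FIXME there). The values 0/10/50/127 are taken from the patch and its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the queue properties the patch inspects. */
struct queue_props {
	bool dax;	/* models blk_queue_dax(q) */
	bool nonrot;	/* models blk_queue_nonrot(q) */
};

enum {
	TIER_SCORE_HDD = 0,
	TIER_SCORE_SSD = 10,
	TIER_SCORE_DAX = 50,
	TIER_SCORE_USER_MAX = 127,	/* value named in the FIXME; assumed semantics */
};

/*
 * Return the tier score for a device; higher is faster.
 * user_top_tier models the not-yet-implemented user override.
 */
static uint8_t tier_score(const struct queue_props *q, bool user_top_tier)
{
	assert(q != NULL);

	if (user_top_tier)
		return TIER_SCORE_USER_MAX;
	if (q->dax)
		return TIER_SCORE_DAX;
	if (q->nonrot)
		return TIER_SCORE_SSD;
	return TIER_SCORE_HDD;
}
```

Putting the override first is one possible reading of 'manual if auto fails': the automatic score is used unless the user has explicitly marked the device.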
[-- Attachment #3: 0003-btrfs-tier-aware-mirror-path-select.patch --]
[-- Type: application/octet-stream, Size: 2182 bytes --]
From 4ac2fc0a3be670e0960928012f5f156b50f2c69d Mon Sep 17 00:00:00 2001
From: wangyugui <wangyugui@e16-tech.com>
Date: Tue, 27 Oct 2020 09:33:21 +0800
Subject: [PATCH 3/3] btrfs: tier-aware mirror path select
This extends the patch 'btrfs: balance RAID1/RAID10 mirror selection' from louis@waffle.tech
- add the tier-aware feature: reads are balanced round-robin only among
  the stripes on the devices with the highest tier score
---
fs/btrfs/volumes.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d380b20..cc4a791 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5562,6 +5562,9 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
return ret;
}
+/* Used for round-robin balancing RAID1/RAID10 reads. */
+static atomic_t rr_counter = ATOMIC_INIT(0);
+
static int find_live_mirror(struct btrfs_fs_info *fs_info,
struct map_lookup *map, int first,
int dev_replace_is_ongoing)
@@ -5572,6 +5575,11 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
int tolerance;
struct btrfs_device *srcdev;
+	/* tier-aware */
+	int top_tier_num_stripes = 0;
+	int top_tier_stripe_idxs[4];
+	u8 top_tier_score = 0;
+
ASSERT((map->type &
(BTRFS_BLOCK_GROUP_RAID1_MASK | BTRFS_BLOCK_GROUP_RAID10)));
@@ -5580,7 +5588,29 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
else
num_stripes = map->num_stripes;
- preferred_mirror = first + current->pid % num_stripes;
+	for (i = 0; i < num_stripes; ++i)
+	{
+		if (map->stripes[first + i].dev->tier_score > top_tier_score)
+		{
+			top_tier_score = map->stripes[first + i].dev->tier_score;
+			top_tier_stripe_idxs[0] = i;
+			top_tier_num_stripes = 1;
+		}
+		else if (map->stripes[first + i].dev->tier_score == top_tier_score)
+		{
+			top_tier_stripe_idxs[top_tier_num_stripes] = i;
+			top_tier_num_stripes++;
+		}
+	}
+	preferred_mirror = first;
+	if (top_tier_num_stripes > 1)
+	{
+		preferred_mirror += top_tier_stripe_idxs[((unsigned)atomic_inc_return(&rr_counter)) % top_tier_num_stripes];
+	}
+	else
+	{
+		preferred_mirror += top_tier_stripe_idxs[0];
+	}
if (dev_replace_is_ongoing &&
fs_info->dev_replace.cont_reading_from_srcdev_mode ==
--
2.29.1
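The selection logic in this patch can be exercised outside the kernel with a small model like the one below. It is a sketch under assumptions: pick_mirror() and MAX_MIRRORS are hypothetical names, the scores are passed in as a plain array instead of being read from map->stripes[], and C11 atomic_fetch_add() replaces the kernel's atomic_inc_return(), so the first pick starts one position earlier than in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_MIRRORS 4	/* raid1c4 keeps at most four copies */

/* Shared round-robin counter, analogous to rr_counter in the patch. */
static atomic_uint rr_counter;

/*
 * Return the index (0..n-1) of the mirror to read from.
 * Only mirrors whose score equals the highest score are candidates;
 * ties among them are broken round-robin.
 */
static int pick_mirror(const uint8_t *scores, int n)
{
	int top_idx[MAX_MIRRORS];
	int top_count = 0;
	uint8_t top_score = 0;

	assert(n > 0 && n <= MAX_MIRRORS);

	for (int i = 0; i < n; i++) {
		if (scores[i] > top_score) {
			/* New best tier: restart the candidate list. */
			top_score = scores[i];
			top_idx[0] = i;
			top_count = 1;
		} else if (scores[i] == top_score) {
			/* Same tier as the current best: add a candidate. */
			top_idx[top_count++] = i;
		}
	}

	if (top_count > 1)
		return top_idx[atomic_fetch_add(&rr_counter, 1) % top_count];
	return top_idx[0];
}
```

Note that the candidate count has to start at 0. Otherwise, when every device scores 0 (all HDD), the equal-score branch would index the candidate array with an uninitialized value.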