* [PATCH v10 1/3] btrfs: add btrfs_strmatch helper
2020-10-28 13:14 [PATCH v10 resend 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
@ 2020-10-28 13:14 ` Anand Jain
2020-10-29 18:35 ` David Sterba
2020-10-28 13:14 ` [PATCH v10 2/3] btrfs: create read policy framework Anand Jain
2020-10-28 13:14 ` [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
2 siblings, 1 reply; 9+ messages in thread
From: Anand Jain @ 2020-10-28 13:14 UTC (permalink / raw)
To: linux-btrfs; +Cc: David Sterba, Josef Bacik
Add a generic helper to match the golden-string in the given-string,
and ignore the leading and trailing whitespaces if any.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Suggested-by: David Sterba <dsterba@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v10: return bool instead of int.
drop unnecessary local variable and ( ) with in if.
v9: use Josef suggested C coding style, using single if statement.
v5: born
fs/btrfs/sysfs.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index fcd6c7a9bbd1..5955379d3d9e 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -888,6 +888,24 @@ static ssize_t btrfs_generation_show(struct kobject *kobj,
}
BTRFS_ATTR(, generation, btrfs_generation_show);
+/*
+ * Match the %golden in the %given. Ignore the leading and trailing whitespaces
+ * if any.
+ */
+static bool btrfs_strmatch(const char *given, const char *golden)
+{
+ size_t len = strlen(golden);
+
+ /* skip leading whitespace */
+ given = skip_spaces(given);
+
+ if (strncmp(given, golden, len) == 0 &&
+ !strlen(skip_spaces(given + len)))
+ return true;
+
+ return false;
+}
+
static const struct attribute *btrfs_attrs[] = {
BTRFS_ATTR_PTR(, label),
BTRFS_ATTR_PTR(, nodesize),
--
2.25.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v10 1/3] btrfs: add btrfs_strmatch helper
2020-10-28 13:14 ` [PATCH v10 1/3] btrfs: add btrfs_strmatch helper Anand Jain
@ 2020-10-29 18:35 ` David Sterba
0 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2020-10-29 18:35 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, David Sterba, Josef Bacik
On Wed, Oct 28, 2020 at 09:14:45PM +0800, Anand Jain wrote:
> Add a generic helper to match the golden-string in the given-string,
> and ignore the leading and trailing whitespaces if any.
>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> Suggested-by: David Sterba <dsterba@suse.com>
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> ---
> v10: return bool instead of int.
> drop unnecessary local variable and ( ) with in if.
> v9: use Josef suggested C coding style, using single if statement.
> v5: born
> fs/btrfs/sysfs.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index fcd6c7a9bbd1..5955379d3d9e 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -888,6 +888,24 @@ static ssize_t btrfs_generation_show(struct kobject *kobj,
> }
> BTRFS_ATTR(, generation, btrfs_generation_show);
>
> +/*
> + * Match the %golden in the %given. Ignore the leading and trailing whitespaces
> + * if any.
> + */
> +static bool btrfs_strmatch(const char *given, const char *golden)
> +{
> + size_t len = strlen(golden);
> +
> + /* skip leading whitespace */
Comments should start with uppercase unless it's an identifier.
> + given = skip_spaces(given);
> +
> + if (strncmp(given, golden, len) == 0 &&
> + !strlen(skip_spaces(given + len)))
You had strlen() == 0 instead of !strlen in v9, the == 0 is better here.
I'll fix that locally and push to misc-next so we can proceed with the
actual policies.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v10 2/3] btrfs: create read policy framework
2020-10-28 13:14 [PATCH v10 resend 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
2020-10-28 13:14 ` [PATCH v10 1/3] btrfs: add btrfs_strmatch helper Anand Jain
@ 2020-10-28 13:14 ` Anand Jain
2020-10-29 18:53 ` David Sterba
2020-10-28 13:14 ` [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
2 siblings, 1 reply; 9+ messages in thread
From: Anand Jain @ 2020-10-28 13:14 UTC (permalink / raw)
To: linux-btrfs; +Cc: Josef Bacik
As of now, we use the %pid method to read striped mirrored data, which means
process id determines the stripe id to read. This type of routing
typically helps in a system with many small independent processes tying
to read random data. On the other hand, the %pid based read IO policy is
inefficient because if there is a single process trying to read a large
file, the overall disk bandwidth remains under-utilized.
So this patch introduces a read policy framework so that we could add more
read policies, such as IO routing based on the device's wait-queue or manual
when we have a read-preferred device or a policy based on the target
storage caching.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v10: -
v9: -
v8: use fallthrough;
v7: Fix missing /* fall through */ in the switch
Removed Reviewed-by: Josef Bacik <josef@toxicpanda.com>
v6:-
v5: Title renamed from:- btrfs: add read_policy framework
Change log updated.
Unnecessary comment dropped, added more where necessary.
Optimize code in the switch remove duplicate code.
Define BTRFS_READ_POLICY_DEFAULT dropped.
Rename enum btrfs_read_policy_type to enum btrfs_read_policy.
Rename BTRFS_READ_BY_PID to BTRFS_READ_POLICY_PID.
(As its mainly renames. Reviewed-by retained).
v4: -
v3: Declare fs_devices::readmirror as enum btrfs_readmirror_policy_type
v2: Declare fs_devices::readmirror as u8 instead of atomic_t
A small change in comment and change log wordings.
fs/btrfs/volumes.c | 15 ++++++++++++++-
fs/btrfs/volumes.h | 14 ++++++++++++++
2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b18e026424a6..6bf487626f23 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1223,6 +1223,7 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
fs_devices->latest_bdev = latest_dev->bdev;
fs_devices->total_rw_bytes = 0;
fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
+ fs_devices->read_policy = BTRFS_READ_POLICY_PID;
return 0;
}
@@ -5485,7 +5486,19 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
else
num_stripes = map->num_stripes;
- preferred_mirror = first + current->pid % num_stripes;
+ switch (fs_info->fs_devices->read_policy) {
+ default:
+ /*
+ * Shouldn't happen, just warn and use pid instead of failing.
+ */
+ btrfs_warn_rl(fs_info,
+ "unknown read_policy type %u, fallback to pid",
+ fs_info->fs_devices->read_policy);
+ fallthrough;
+ case BTRFS_READ_POLICY_PID:
+ preferred_mirror = first + current->pid % num_stripes;
+ break;
+ }
if (dev_replace_is_ongoing &&
fs_info->dev_replace.cont_reading_from_srcdev_mode ==
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 232f02bd214f..97f075516696 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -211,6 +211,15 @@ enum btrfs_chunk_allocation_policy {
BTRFS_CHUNK_ALLOC_REGULAR,
};
+/*
+ * Read policies for the mirrored block groups, read picks the stripe based
+ * on these policies.
+ */
+enum btrfs_read_policy {
+ BTRFS_READ_POLICY_PID,
+ BTRFS_NR_READ_POLICY,
+};
+
struct btrfs_fs_devices {
u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
u8 metadata_uuid[BTRFS_FSID_SIZE];
@@ -264,6 +273,11 @@ struct btrfs_fs_devices {
struct completion kobj_unregister;
enum btrfs_chunk_allocation_policy chunk_alloc_policy;
+
+ /*
+ * policy used to read the mirrored stripes
+ */
+ enum btrfs_read_policy read_policy;
};
#define BTRFS_BIO_INLINE_CSUM_SIZE 64
--
2.25.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v10 2/3] btrfs: create read policy framework
2020-10-28 13:14 ` [PATCH v10 2/3] btrfs: create read policy framework Anand Jain
@ 2020-10-29 18:53 ` David Sterba
0 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2020-10-29 18:53 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, Josef Bacik
On Wed, Oct 28, 2020 at 09:14:46PM +0800, Anand Jain wrote:
> @@ -5485,7 +5486,19 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
> else
> num_stripes = map->num_stripes;
>
> - preferred_mirror = first + current->pid % num_stripes;
> + switch (fs_info->fs_devices->read_policy) {
> + default:
> + /*
> + * Shouldn't happen, just warn and use pid instead of failing.
> + */
> + btrfs_warn_rl(fs_info,
> + "unknown read_policy type %u, fallback to pid",
> + fs_info->fs_devices->read_policy);
So this would flood the system log with useless messages once this
happens. We should really reset it to pid or whatever default we choose.
Fixed.
> + fallthrough;
> + case BTRFS_READ_POLICY_PID:
> + preferred_mirror = first + current->pid % num_stripes;
This is in the original code, but this should really use ( ) to make the
precedence clear.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid
2020-10-28 13:14 [PATCH v10 resend 0/3] readmirror feature (read_policy sysfs and in-memory only approach) Anand Jain
2020-10-28 13:14 ` [PATCH v10 1/3] btrfs: add btrfs_strmatch helper Anand Jain
2020-10-28 13:14 ` [PATCH v10 2/3] btrfs: create read policy framework Anand Jain
@ 2020-10-28 13:14 ` Anand Jain
2020-10-29 16:45 ` David Sterba
2 siblings, 1 reply; 9+ messages in thread
From: Anand Jain @ 2020-10-28 13:14 UTC (permalink / raw)
To: linux-btrfs; +Cc: Josef Bacik
Add
/sys/fs/btrfs/UUID/read_policy
attribute so that the read policy for the raid1, raid1c34 and raid10 can
be tuned.
When this attribute is read, it shall show all available policies, with
active policy being with in [ ]. The read_policy attribute can be written
using one of the items listed in there.
For example:
$cat /sys/fs/btrfs/UUID/read_policy
[pid]
$echo pid > /sys/fs/btrfs/UUID/read_policy
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
v10: use scnprintf instead of snprintf
v9: fix C coding style, static const char*
v5:
Title rename: old: btrfs: sysfs, add read_policy attribute
Uses the btrfs_strmatch() helper (BTRFS_READ_POLICY_NAME_MAX dropped).
Use the table for the policy names.
Rename len to ret.
Use a simple logic to prefix space in btrfs_read_policy_show()
Reviewed-by: Josef Bacik <josef@toxicpanda.com> dropped.
v4:-
v3: rename [by_pid] to [pid]
v2: v2: check input len before strip and kstrdup
fs/btrfs/sysfs.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 5955379d3d9e..4dbf90ff088a 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -906,6 +906,54 @@ static bool btrfs_strmatch(const char *given, const char *golden)
return false;
}
+static const char * const btrfs_read_policy_name[] = { "pid" };
+
+static ssize_t btrfs_read_policy_show(struct kobject *kobj,
+ struct kobj_attribute *a, char *buf)
+{
+ int i;
+ ssize_t ret = 0;
+ struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
+
+ for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
+ if (fs_devices->read_policy == i)
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%s[%s]",
+ (ret == 0 ? "" : " "),
+ btrfs_read_policy_name[i]);
+ else
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%s%s",
+ (ret == 0 ? "" : " "),
+ btrfs_read_policy_name[i]);
+ }
+
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, "\n");
+
+ return ret;
+}
+
+static ssize_t btrfs_read_policy_store(struct kobject *kobj,
+ struct kobj_attribute *a,
+ const char *buf, size_t len)
+{
+ int i;
+ struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
+
+ for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
+ if (btrfs_strmatch(buf, btrfs_read_policy_name[i])) {
+ if (i != fs_devices->read_policy) {
+ fs_devices->read_policy = i;
+ btrfs_info(fs_devices->fs_info,
+ "read policy set to '%s'",
+ btrfs_read_policy_name[i]);
+ }
+ return len;
+ }
+ }
+
+ return -EINVAL;
+}
+BTRFS_ATTR_RW(, read_policy, btrfs_read_policy_show, btrfs_read_policy_store);
+
static const struct attribute *btrfs_attrs[] = {
BTRFS_ATTR_PTR(, label),
BTRFS_ATTR_PTR(, nodesize),
@@ -916,6 +964,7 @@ static const struct attribute *btrfs_attrs[] = {
BTRFS_ATTR_PTR(, checksum),
BTRFS_ATTR_PTR(, exclusive_operation),
BTRFS_ATTR_PTR(, generation),
+ BTRFS_ATTR_PTR(, read_policy),
NULL,
};
--
2.25.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid
2020-10-28 13:14 ` [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid Anand Jain
@ 2020-10-29 16:45 ` David Sterba
2020-10-29 19:30 ` Anand Jain
0 siblings, 1 reply; 9+ messages in thread
From: David Sterba @ 2020-10-29 16:45 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs, Josef Bacik
On Wed, Oct 28, 2020 at 09:14:47PM +0800, Anand Jain wrote:
> +static ssize_t btrfs_read_policy_store(struct kobject *kobj,
> + struct kobj_attribute *a,
> + const char *buf, size_t len)
> +{
> + int i;
> + struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
> +
> + for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
> + if (btrfs_strmatch(buf, btrfs_read_policy_name[i])) {
Does sysfs guarantee that the buf is nul terminated string or that it
contains a null at all? Because if not, the skip_space step of strmatch
could run out of the buffer.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid
2020-10-29 16:45 ` David Sterba
@ 2020-10-29 19:30 ` Anand Jain
2020-10-29 19:30 ` David Sterba
0 siblings, 1 reply; 9+ messages in thread
From: Anand Jain @ 2020-10-29 19:30 UTC (permalink / raw)
To: dsterba, linux-btrfs, Josef Bacik
On 30/10/20 12:45 am, David Sterba wrote:
> On Wed, Oct 28, 2020 at 09:14:47PM +0800, Anand Jain wrote:
>> +static ssize_t btrfs_read_policy_store(struct kobject *kobj,
>> + struct kobj_attribute *a,
>> + const char *buf, size_t len)
>> +{
>> + int i;
>> + struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
>> +
>> + for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
>> + if (btrfs_strmatch(buf, btrfs_read_policy_name[i])) {
>
> Does sysfs guarantee that the buf is nul terminated string or that it
> contains a null at all? Because if not, the skip_space step of strmatch
> could run out of the buffer.
>
It does
[ 173.555507] ? btrfs_read_policy_store+0x3e/0x12d [btrfs]
[ 173.555541] ? kobj_attr_store+0x16/0x30
[ 173.555562] ? sysfs_kf_write+0x54/0x80
[ 173.555582] ? kernfs_fop_write+0xfa/0x290
[ 173.555611] ? vfs_write+0xee/0x2f0
[ 173.555641] ? ksys_write+0x80/0x170
[ 173.555671] ? __x64_sys_write+0x1e/0x30
[ 173.555692] ? do_syscall_64+0x4b/0x80
[ 173.555708] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
static ssize_t kernfs_fop_write(struct file *file, const char __user
*user_buf,
size_t count, loff_t *ppos)
{
::
buf[len] = '\0'; /* guarantee string termination */
Thanks, Anand
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH v10 3/3] btrfs: create read policy sysfs attribute, pid
2020-10-29 19:30 ` Anand Jain
@ 2020-10-29 19:30 ` David Sterba
0 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2020-10-29 19:30 UTC (permalink / raw)
To: Anand Jain; +Cc: dsterba, linux-btrfs, Josef Bacik
On Fri, Oct 30, 2020 at 03:30:10AM +0800, Anand Jain wrote:
>
>
> On 30/10/20 12:45 am, David Sterba wrote:
> > On Wed, Oct 28, 2020 at 09:14:47PM +0800, Anand Jain wrote:
> >> +static ssize_t btrfs_read_policy_store(struct kobject *kobj,
> >> + struct kobj_attribute *a,
> >> + const char *buf, size_t len)
> >> +{
> >> + int i;
> >> + struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
> >> +
> >> + for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
> >> + if (btrfs_strmatch(buf, btrfs_read_policy_name[i])) {
> >
> > Does sysfs guarantee that the buf is nul terminated string or that it
> > contains a null at all? Because if not, the skip_space step of strmatch
> > could run out of the buffer.
> >
>
> It does
>
> [ 173.555507] ? btrfs_read_policy_store+0x3e/0x12d [btrfs]
> [ 173.555541] ? kobj_attr_store+0x16/0x30
> [ 173.555562] ? sysfs_kf_write+0x54/0x80
> [ 173.555582] ? kernfs_fop_write+0xfa/0x290
> [ 173.555611] ? vfs_write+0xee/0x2f0
> [ 173.555641] ? ksys_write+0x80/0x170
> [ 173.555671] ? __x64_sys_write+0x1e/0x30
> [ 173.555692] ? do_syscall_64+0x4b/0x80
> [ 173.555708] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>
> static ssize_t kernfs_fop_write(struct file *file, const char __user
> *user_buf,
> size_t count, loff_t *ppos)
> {
> ::
> buf[len] = '\0'; /* guarantee string termination */
Ok thanks.
^ permalink raw reply [flat|nested] 9+ messages in thread