From: Anand Jain <anand.jain@oracle.com>
To: linux-btrfs@vger.kernel.org
Cc: josef@toxicpanda.com, dsterba@suse.cz
Subject: [PATCH v6 5/5] btrfs: introduce new read_policy device
Date: Wed, 19 Feb 2020 19:29:26 +0800
Message-ID: <1582111766-8372-6-git-send-email-anand.jain@oracle.com>
In-Reply-To: <1582111766-8372-1-git-send-email-anand.jain@oracle.com>

This patch introduces a new read policy, 'device', which when set reads
only from the device flagged as read_preferred. The tunable is meant for
advanced users and testers who want to make sure that reads go to the
device they prefer for chunks of type raid1, raid10, raid1c3 and raid1c4.

The default read policy is pid; it can be changed to device as shown below.

$ pwd
/sys/fs/btrfs/12345678-1234-1234-1234-123456789abc

$ cat read_policy; echo device > ./read_policy; cat read_policy
[pid] device
pid [device]
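
The active policy is the name shown in square brackets. For scripts that
need to read it back, a minimal sketch follows. The helper name
active_read_policy is hypothetical and not part of this patch, and it
assumes the output format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this patch): print the active read
# policy, i.e. the name enclosed in [] in a read_policy file.
# $1: path to the read_policy file of the filesystem
active_read_policy() {
	# Keep only the text between '[' and ']'.
	sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}
```

Run from the sysfs directory shown above, "active_read_policy
./read_policy" would print the policy currently in effect.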

Each device that is favored for reading must have the read_preferred
flag set. In the example below, on a typical two-disk raid1, devid 1 is
configured as read preferred.

$ echo 1 > devinfo/1/read_preferred
$ cat devinfo/1/read_preferred; cat devinfo/2/read_preferred
1
0

Now when a file is read, the read IO prefers the device(s) with the
read_preferred flag set.

$ echo 3 > /proc/sys/vm/drop_caches; md5sum /btrfs/YkZI

Since devid 1 (sdb) is our read preferred device, the reads are sent
to sdb only.
$ iostat -zy 1 | egrep 'sdb|sdc' (from another terminal)
sdb              50.00     40048.00         0.00      40048          0

$ echo 0 > ./devinfo/1/read_preferred; echo 1 > ./devinfo/2/read_preferred

[ 3343.918658] BTRFS info (device sdb): reset read preferred on devid 1 (1334)
[ 3343.919876] BTRFS info (device sdb): set read preferred on devid 2 (1334)

$ echo 3 > /proc/sys/vm/drop_caches; md5sum /btrfs/YkZI

Since the read preferred device was changed from devid 1 (sdb) to
devid 2 (sdc), all the read IO now goes to sdc.

$ iostat -zy 1 | egrep 'sdb|sdc' (from another terminal)
sdc              49.00     40048.00         0.00      40048          0

If no device in the chunk is marked as read preferred, this read policy
uses stripe 0 for reading. If more than one stripe is on a read
preferred device, it uses the first such stripe found.
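
The selection done by btrfs_find_read_preferred() in this patch can be
modeled in a few lines of shell. This is only an illustration of the
rule, not kernel code; the function name pick_read_stripe is
hypothetical, and each argument stands for the read_preferred flag of
one stripe's device, in stripe order.

```shell
#!/bin/sh
# Illustrative model of btrfs_find_read_preferred(): print the index of
# the first stripe whose device has read_preferred set, else 0.
# Arguments: one flag (0 or 1) per stripe, in stripe order.
pick_read_stripe() {
	i=0
	for flag in "$@"; do
		if [ "$flag" = 1 ]; then
			echo "$i"
			return 0
		fi
		i=$((i + 1))
	done
	# No read preferred device in this chunk: fall back to stripe 0.
	echo 0
}
```

For a two-disk raid1, "pick_read_stripe 0 1" prints 1, while both
"pick_read_stripe 0 0" and "pick_read_stripe 1 1" print 0. In
find_live_mirror() the returned index is then offset by first.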

The command
 $ echo pid > ./read_policy
switches back to the pid read policy.

As of now this is an in-memory-only feature, which means that after an
unmount/mount cycle the configuration is lost and has to be set up
again.
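
Because the settings do not persist, they have to be written again
after every mount. A minimal sketch of a helper that does so is below.
The name set_read_preferred is hypothetical and not part of this patch;
it assumes the sysfs layout used in the examples above and has to run
as root.

```shell
#!/bin/sh
# Hypothetical helper (not part of this patch): re-apply the in-memory
# read policy settings after a mount.
# $1: sysfs directory of the filesystem (the directory shown by pwd above)
# $2: devid of the device to mark as read preferred
set_read_preferred() {
	sysfs_dir=$1
	devid=$2
	# Flag the device first, then switch the policy to 'device'.
	echo 1 > "$sysfs_dir/devinfo/$devid/read_preferred" || return 1
	echo device > "$sysfs_dir/read_policy"
}
```

It could be called, for example, as "set_read_preferred
/sys/fs/btrfs/12345678-1234-1234-1234-123456789abc 1" from whatever
script mounts the filesystem.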

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
v6:
. If there is no read preferred device in the chunk, don't reset the
read policy to the default; just use stripe 0 instead. As this is in
the read path, it avoids walking the device list to find a read
preferred device. In line with this, drop the check for a read
preferred device before setting the read policy to device.
. Commit log updated. Adds more info about this new feature.

v5: born

 fs/btrfs/sysfs.c   |  3 ++-
 fs/btrfs/volumes.c | 24 ++++++++++++++++++++++++
 fs/btrfs/volumes.h |  1 +
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 72daaedb7b04..af53ed879dd6 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -832,7 +832,8 @@ static int btrfs_strmatch(const char *given, const char *golden)
 	return -EINVAL;
 }
 
-static const char* const btrfs_read_policy_name[] = { "pid" };
+/* Must follow the order as in enum btrfs_read_policy */
+static const char* const btrfs_read_policy_name[] = { "pid", "device" };
 
 static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 				      struct kobj_attribute *a, char *buf)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b6efb87bb0ae..43c09ec0bf86 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5341,6 +5341,26 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
 	return ret;
 }
 
+static int btrfs_find_read_preferred(struct map_lookup *map, int num_stripe)
+{
+	int i;
+
+	/*
+	 * If there are more than one read preferred devices, then just pick the
+	 * first found read preferred device as of now. Once we have the Qdepth
+	 * based device selection, we could pick the least busy device among the
+	 * read preferred devices.
+	 */
+	for (i = 0; i < num_stripe; i++) {
+		if (test_bit(BTRFS_DEV_STATE_READ_PREFERRED,
+			     &map->stripes[i].dev->dev_state))
+			return i;
+	}
+
+	/* If there is no read preferred device then just use stripe 0 */
+	return 0;
+}
+
 static int find_live_mirror(struct btrfs_fs_info *fs_info,
 			    struct map_lookup *map, int first,
 			    int dev_replace_is_ongoing)
@@ -5360,6 +5380,10 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 		num_stripes = map->num_stripes;
 
 	switch (fs_info->fs_devices->read_policy) {
+	case BTRFS_READ_POLICY_DEVICE:
+		preferred_mirror = btrfs_find_read_preferred(map, num_stripes);
+		preferred_mirror = first + preferred_mirror;
+		break;
 	default:
 		/*
 		 * Shouldn't happen, just warn and use pid instead of failing.
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 07962a0ce898..9c3c6ba7aad5 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -216,6 +216,7 @@ struct btrfs_device {
  */
 enum btrfs_read_policy {
 	BTRFS_READ_POLICY_PID,
+	BTRFS_READ_POLICY_DEVICE,
 	BTRFS_NR_READ_POLICY,
 };
 
-- 
1.8.3.1


