Linux Btrfs filesystem development
From: Josef Bacik <josef@toxicpanda.com>
To: Anand Jain <anand.jain@oracle.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC 3/7] btrfs: add read_policy latency
Date: Tue, 27 Oct 2020 14:20:16 -0400
Message-ID: <e3a90960-8edd-6dfd-7b82-2c5c9a68874b@toxicpanda.com>
In-Reply-To: <de0a28ed406c84c84d40d4bdad5f45250aabfdea.1603751876.git.anand.jain@oracle.com>

On 10/26/20 7:55 PM, Anand Jain wrote:
> The read policy type latency routes the read IO based on the historical
> average wait time experienced by the read IOs through the individual
> device factored by 1/10 of inflight commands in the queue. The factor
> 1/10 is because generally the block device queue depth is more than 1,
> so there can be commands in the queue even before the previous commands
> have been completed. This patch obtains the historical read IO stats from
> the kernel block layer.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>   fs/btrfs/sysfs.c   |  3 +-
>   fs/btrfs/volumes.c | 74 +++++++++++++++++++++++++++++++++++++++++++++-
>   fs/btrfs/volumes.h |  1 +
>   3 files changed, 76 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index d159f7c70bcd..6690abeeb889 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -874,7 +874,8 @@ static int btrfs_strmatch(const char *given, const char *golden)
>   	return -EINVAL;
>   }
>   
> -static const char * const btrfs_read_policy_name[] = { "pid" };
> +/* Must follow the order as in enum btrfs_read_policy */
> +static const char * const btrfs_read_policy_name[] = { "pid", "latency"};
>   
>   static ssize_t btrfs_read_policy_show(struct kobject *kobj,
>   				      struct kobj_attribute *a, char *buf)
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index da31b11ceb61..9bab6080cebf 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -14,6 +14,7 @@
>   #include <linux/semaphore.h>
>   #include <linux/uuid.h>
>   #include <linux/list_sort.h>
> +#include <linux/part_stat.h>
>   #include "misc.h"
>   #include "ctree.h"
>   #include "extent_map.h"
> @@ -5465,6 +5466,66 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
>   	return ret;
>   }
>   
> +static u64 btrfs_estimate_read(struct btrfs_device *device,
> +			       unsigned long *inflight)
> +{
> +	u64 read_wait;
> +	u64 avg_wait = 0;
> +	unsigned long read_ios;
> +	struct disk_stats stat;
> +
> +	/* Commands in flight on this partition/device */
> +	*inflight = part_stat_read_inflight(bdev_get_queue(device->bdev),
> +					    device->bdev->bd_part);
> +	part_stat_read_all(device->bdev->bd_part, &stat);
> +
> +	read_wait = stat.nsecs[STAT_READ];
> +	read_ios = stat.ios[STAT_READ];
> +
> +	if (read_wait && read_ios && read_wait >= read_ios)
> +		avg_wait = div_u64(read_wait, read_ios);
> +	else
> +		btrfs_info_rl(device->fs_devices->fs_info,
> +			"devid: %llu avg_wait ZERO read_wait %llu read_ios %lu",
> +			      device->devid, read_wait, read_ios);
> +
> +	return avg_wait;
> +}
> +
> +static int btrfs_find_best_stripe(struct btrfs_fs_info *fs_info,
> +				  struct map_lookup *map, int first,
> +				  int num_stripe)
> +{
> +	int index;
> +	int best_stripe = 0;
> +	int est_wait = -EINVAL;
> +	int last = first + num_stripe;
> +	unsigned long inflight;
> +
> +	for (index = first; index < last; index++) {
> +		struct btrfs_device *device = map->stripes[index].dev;
> +
> +		if (!blk_queue_io_stat(bdev_get_queue(device->bdev)))
> +			return -ENOENT;
> +	}
> +
> +	for (index = first; index < last; index++) {
> +		struct btrfs_device *device = map->stripes[index].dev;
> +		u64 avg_wait;
> +		u64 final_wait;
> +
> +		avg_wait = btrfs_estimate_read(device, &inflight);
> +		final_wait = avg_wait + (avg_wait * (inflight / 10));

Inflight is going to lag because it only accounts for bios that have actually
been attached to requests here.  Since we're already on fuzzy ground, why not
just skip the inflight and go with the average latencies?  If we heavily load
one side its latencies will creep up and then we'll favor the other side.  If
we really want to aim for the lowest latency, we could add our own inflight
counter to the stripe itself, which would account for the actual IOs that we
currently have inflight, and would be much less fuzzy than relying on the
block inflight counters.  Thanks,

Josef
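
To make the two alternatives above concrete, here is a rough user-space
sketch, not kernel code.  struct stripe_stat, its own_inflight field and the
pick_* helpers are hypothetical names invented for illustration, and the
avg * (own_inflight + 1) weighting is only one assumed way to use such a
counter; the review does not specify one.  patch_wait() mirrors the formula
in the quoted hunk, where inflight / 10 is integer division and so adds
nothing until at least 10 commands are in flight.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-stripe statistics; not the kernel's stripe structure. */
struct stripe_stat {
	uint64_t read_nsecs;	/* cumulative read wait time */
	uint64_t read_ios;	/* completed read IOs */
	uint64_t own_inflight;	/* assumed counter: reads we submitted, not yet done */
};

/* Historical average read latency, 0 if no reads completed yet. */
static uint64_t avg_wait(const struct stripe_stat *s)
{
	return s->read_ios ? s->read_nsecs / s->read_ios : 0;
}

/* Weighting from the quoted patch: blk_inflight / 10 truncates toward zero. */
static uint64_t patch_wait(const struct stripe_stat *s,
			   unsigned long blk_inflight)
{
	uint64_t avg = avg_wait(s);

	return avg + avg * (blk_inflight / 10);
}

/* Alternative 1: ignore inflight, pick the lowest average latency. */
static int pick_by_avg(const struct stripe_stat *s, int num)
{
	int best = 0;

	assert(num > 0);
	for (int i = 1; i < num; i++)
		if (avg_wait(&s[i]) < avg_wait(&s[best]))
			best = i;
	return best;
}

/*
 * Alternative 2: scale the average by our own inflight count.
 * The (own_inflight + 1) factor is an assumption made for this sketch.
 */
static int pick_by_own_inflight(const struct stripe_stat *s, int num)
{
	int best = 0;

	assert(num > 0);
	for (int i = 1; i < num; i++) {
		uint64_t cur = avg_wait(&s[i]) * (s[i].own_inflight + 1);
		uint64_t min = avg_wait(&s[best]) * (s[best].own_inflight + 1);

		if (cur < min)
			best = i;
	}
	return best;
}
```

With two mirrors averaging 100ns and 150ns, pick_by_avg() chooses the first
one; if the first already has five of our reads outstanding and the second
has none, pick_by_own_inflight() chooses the second.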
