* [PATCH v5 0/1] multipath-tools: Prioritizer based on a latency algorithm @ 2017-06-09 9:58 Yang Feng 2017-06-09 9:58 ` [PATCH v5 1/1] " Yang Feng 0 siblings, 1 reply; 4+ messages in thread From: Yang Feng @ 2017-06-09 9:58 UTC (permalink / raw) To: mwilck, hare, christophe.varoqui, bmarzins, xose.vazquez Cc: zouming.zouming, guanjunxiong, philip.yang, shenhong09, dm-devel, hege09, qiuxin This patch value is in the following: 1. In the Storage-Backup environment of HyperCluster, includes one storage array near to the host and one remote storage array, and the two storage arrays have the same hardware.The same LUN is writed or readed by the two storage arrays. However, usually, the average latency of the paths of the remote storage array is much higher than the near storage array's. Apparently, the prioritizer can be a good automatic solution. And the current selectors don't solve it, IOs will send to the paths of the remote storage array, IOPS will be influenced unavoidably. 2. In the environment of single storage array, the prioritizer can automatically separate the paths who's latency is much higher, IOs will not send to this paths. But the current selectors don't solve this problem, IOPS will be influenced unavoidably. Changes from v4: * Argument "latency_interval" is set by using logarithmic scale, where base number is 10. Fix according to Martin's reviews. * The value of "MAX_LATENCY_INTERVAL" is set 10 from 60. Changes from v3: * Delete Copyright time "2021", Fix according to Xose's reviews. * Add version for GPL, Fix according to Xose's reviews. Changes from v2: * Reorganize the commit comment and patch format. * Added Benjamin, Martin and Xose's Reviewed-by. Changes from v1: * The value of "MIN_IO_NUM" is set 2 from 10. * Fix according to Benjamin, Martin and Xose's reviews. Yang Feng (1): multipath-tools: Prioritizer based on a latency algorithm libmultipath/prio.h | 1 + libmultipath/prioritizers/Makefile | 4 + libmultipath/prioritizers/path_latency.c | 282 +++++++++++++++++++++++++++++++ multipath/multipath.conf.5 | 20 +++ 4 files changed, 307 insertions(+) create mode 100644 libmultipath/prioritizers/path_latency.c -- 2.6.4.windows.1 ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v5 1/1] multipath-tools: Prioritizer based on a latency algorithm 2017-06-09 9:58 [PATCH v5 0/1] multipath-tools: Prioritizer based on a latency algorithm Yang Feng @ 2017-06-09 9:58 ` Yang Feng 2017-06-14 20:47 ` Martin Wilck 0 siblings, 1 reply; 4+ messages in thread From: Yang Feng @ 2017-06-09 9:58 UTC (permalink / raw) To: mwilck, hare, christophe.varoqui, bmarzins, xose.vazquez Cc: zouming.zouming, guanjunxiong, philip.yang, shenhong09, dm-devel, hege09, qiuxin libmultipath/prioritizers: Prioritizer for device mapper multipath, where the corresponding priority values of specific paths are provided by a latency algorithm. And the latency algorithm is dependent on the following arguments(latency_interval and io_num). The principle of the algorithm is illustrated as follows: 1. By sending a certain number "cons_num" of read IOs to the current path continuously, the IOs' average latency can be calculated. 2. According to the average latency of each path and the weight value "latency_interval", the priority "rc" of each path can be provided. latency_interval latency_interval latency_interval latency_interval |------------------|------------------|------------------|...|------------------| | priority rank 1 | priority rank 2 | priority rank 3 |...| priority rank x | |------------------|------------------|------------------|...|------------------| Priority Rank Partitioning Signed-off-by: Yang Feng <philip.yang@huawei.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Xose Vazquez Perez <xose.vazquez@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.de> --- libmultipath/prio.h | 1 + libmultipath/prioritizers/Makefile | 4 + libmultipath/prioritizers/path_latency.c | 282 +++++++++++++++++++++++++++++++ multipath/multipath.conf.5 | 20 +++ 4 files changed, 307 insertions(+) create mode 100644 libmultipath/prioritizers/path_latency.c diff --git a/libmultipath/prio.h b/libmultipath/prio.h index 0193c52..c97fe39 100644 --- a/libmultipath/prio.h +++ b/libmultipath/prio.h @@ -29,6 +29,7 @@ struct path; #define PRIO_RDAC "rdac" #define PRIO_WEIGHTED_PATH "weightedpath" #define PRIO_SYSFS "sysfs" +#define PRIO_PATH_LATENCY "path_latency" /* * Value used to mark the fact prio was not defined diff --git a/libmultipath/prioritizers/Makefile b/libmultipath/prioritizers/Makefile index 36b42e4..ca47cdf 100644 --- a/libmultipath/prioritizers/Makefile +++ b/libmultipath/prioritizers/Makefile @@ -18,6 +18,7 @@ LIBS = \ libpriorandom.so \ libpriordac.so \ libprioweightedpath.so \ + libpriopath_latency.so \ libpriosysfs.so all: $(LIBS) @@ -25,6 +26,9 @@ all: $(LIBS) libprioalua.so: alua.o alua_rtpg.o $(CC) $(LDFLAGS) $(SHARED_FLAGS) -o $@ $^ +libpriopath_latency.so: path_latency.o ../checkers/libsg.o + $(CC) $(LDFLAGS) $(SHARED_FLAGS) -o $@ $^ -lm + libprio%.so: %.o $(CC) $(LDFLAGS) $(SHARED_FLAGS) -o $@ $^ diff --git a/libmultipath/prioritizers/path_latency.c b/libmultipath/prioritizers/path_latency.c new file mode 100644 index 0000000..e9cf679 --- /dev/null +++ b/libmultipath/prioritizers/path_latency.c @@ -0,0 +1,282 @@ +/* + * (C) Copyright HUAWEI Technology Corp. 2017, All Rights Reserved. + * + * path_latency.c + * + * Prioritizer for device mapper multipath, where the corresponding priority + * values of specific paths are provided by a latency algorithm. And the + * latency algorithm is dependent on arguments. + * + * The principle of the algorithm as follows: + * 1. By sending a certain number "io_num" of read IOs to the current path + * continuously, the IOs' average latency can be calculated. + * 2. According to the average latency of each path and the weight value + * "latency_interval", the priority "rc" of each path can be provided. + * + * Author(s): Yang Feng <philip.yang@huawei.com> + * Guan Junxiong <guanjunxiong@huawei.com> + * Zou Ming <zouming.zouming@huawei.com> + * + * This file is released under the GPL version 2, or any later version. + * + */ +#include <stdio.h> +#include <math.h> +#include <ctype.h> +#include <time.h> + +#include "debug.h" +#include "prio.h" +#include "structs.h" +#include "../checkers/libsg.h" + +#define THRES_USEC_VALUE 120000000LL /*unit: us, =120s*/ +#define BASE_NUMBER 10 + +#define MAX_IO_NUM 200 +#define MIN_IO_NUM 2 + +#define MAX_LATENCY_INTERVAL 10 /*unit: s*/ +#define MIN_LATENCY_INTERVAL 1 /*unit: us, or ms, or s*/ + +#define DEFAULT_PRIORITY 0 + +#define MAX_CHAR_SIZE 30 + +#define CHAR_USEC "us" +#define CHAR_MSEC "ms" +#define CHAR_SEC "s" + +enum interval_type { + INTERVAL_USEC, + INTERVAL_MSEC, + INTERVAL_SEC, + INTERVAL_INVALID +}; + +/* interval_unit_str and interval_unit_type keep the same assignment sequence */ +static const char *interval_unit_str[MAX_CHAR_SIZE] = { + CHAR_USEC, CHAR_MSEC, CHAR_SEC +}; +static const int interval_unit_type[] = { + INTERVAL_USEC, INTERVAL_MSEC, INTERVAL_SEC +}; + +#define USEC_PER_SEC 1000000LL +#define USEC_PER_MSEC 1000LL +#define USEC_PER_USEC 1LL + +static const int conversion_ratio[] = { + [INTERVAL_USEC] = USEC_PER_USEC, + [INTERVAL_MSEC] = USEC_PER_MSEC, + [INTERVAL_SEC] = USEC_PER_SEC, + [INTERVAL_INVALID] = 0 +}; + +static long long path_latency[MAX_IO_NUM]; + +static inline long long timeval_to_us(const struct timespec *tv) +{ + return ((long long) tv->tv_sec * USEC_PER_SEC) + (tv->tv_nsec >> 10); +} + +static int do_readsector0(int fd, unsigned int timeout) +{ + unsigned char buf[4096]; + unsigned char sbuf[SENSE_BUFF_LEN]; + int ret; + + ret = sg_read(fd, &buf[0], 4096, &sbuf[0], + SENSE_BUFF_LEN, timeout); + + return ret; +} + +int check_args_valid(int io_num, long long latency_interval, int type) +{ + if ((io_num < MIN_IO_NUM) || (io_num > MAX_IO_NUM)) + { + condlog(0, "args io_num is more than the valid values range"); + return 0; + } + + /* check value is set by using logarithmic scale, and base number is 10 */ + if ((latency_interval % BASE_NUMBER) != 0) + { + condlog(0, "args latency_interval is not set by using logarithmic scale , where base number is 10"); + return 0; + } + + /* s:[1, 10], ms:[1, 10000], us:[1, 10000000] */ + if ((latency_interval < MIN_LATENCY_INTERVAL) + || (latency_interval > (MAX_LATENCY_INTERVAL * USEC_PER_SEC / conversion_ratio[type]))) + { + condlog(0, "args latency_interval is more than the valid values range"); + return 0; + } + + return 1; +} + +static int get_interval_type(char *type) +{ + int index; + + for (index = 0; index < sizeof(interval_unit_str)/MAX_CHAR_SIZE; index++) + { + if (strcmp(type, interval_unit_str[index]) == 0) + { + return interval_unit_type[index]; + } + } + + return INTERVAL_INVALID; +} + +long long get_conversion_ratio(int type) +{ + return conversion_ratio[type]; +} + +/* In multipath.conf, args form: io_num|latency_interval. For example, +* args is "20|10ms", this function can get 20, 10. +*/ +static int get_interval_and_ionum(char *args, + int *ionum, + long long *interval) +{ + char source[MAX_CHAR_SIZE]; + char vertica = '|'; + char *endstrbefore = NULL; + char *endstrafter = NULL; + int type; + unsigned int size = strlen(args); + long long ratio; + + if ((args == NULL) || (ionum == NULL) || (interval == NULL)) + { + condlog(0, "args string is NULL"); + return 0; + } + + if ((size < 1) || (size > MAX_CHAR_SIZE-1)) + { + condlog(0, "args string's size is too long"); + return 0; + } + + memcpy(source, args, size+1); + + if (!isdigit(source[0])) + { + condlog(0, "args io_num string's first char is not digit"); + return 0; + } + + *ionum = (int)strtoul(source, &endstrbefore, 10); + if (endstrbefore[0] != vertica) + { + condlog(0, "segmentation char is invalid"); + return 0; + } + + if (!isdigit(endstrbefore[1])) + { + condlog(0, "args latency_interval string's first char is not digit"); + return 0; + } + + *interval = (long long)strtol(&endstrbefore[1], &endstrafter, 10); + type = get_interval_type(endstrafter); + if (type == INTERVAL_INVALID) + { + condlog(0, "args latency_interval type is invalid"); + return 0; + } + + if (check_args_valid(*ionum, *interval, type) == 0) + { + return 0; + } + + ratio = get_conversion_ratio(type); + *interval *= (long long)ratio; + + return 1; +} + +long long calc_standard_deviation(long long *path_latency, int size, long long avglatency) +{ + int index; + long long total = 0; + + for (index = 0; index < size; index++) + { + total += (path_latency[index] - avglatency) * (path_latency[index] - avglatency); + } + + total /= (size-1); + + return (long long)sqrt((double)total); +} + +int getprio (struct path *pp, char *args, unsigned int timeout) +{ + int rc, temp; + int index = 0; + int io_num; + long long latency_interval; + long long avglatency; + long long standard_deviation; + long long toldelay = 0; + long long before, after; + struct timespec tv; + + if (pp->fd < 0) + return -1; + + if (get_interval_and_ionum(args, &io_num, &latency_interval) == 0) + { + condlog(0, "%s: get path_latency args fail", pp->dev); + return DEFAULT_PRIORITY; + } + + memset(path_latency, 0, sizeof(path_latency)); + + temp = io_num; + while (temp-- > 0) + { + (void)clock_gettime(CLOCK_MONOTONIC, &tv); + before = timeval_to_us(&tv); + + if (do_readsector0(pp->fd, timeout) == 2) + { + condlog(0, "%s: path down", pp->dev); + return -1; + } + + (void)clock_gettime(CLOCK_MONOTONIC, &tv); + after = timeval_to_us(&tv); + + path_latency[index] = after - before; + toldelay += path_latency[index++]; + } + + avglatency = toldelay/(long long)io_num; + condlog(4, "%s: average latency is (%lld)", pp->dev, avglatency); + + if (avglatency > THRES_USEC_VALUE) + { + condlog(0, "%s: average latency (%lld) is more than thresold", pp->dev, avglatency); + return DEFAULT_PRIORITY; + } + + /* warn the user if the latency_interval set is smaller than (2 * standard deviation), or equal */ + standard_deviation = calc_standard_deviation(path_latency, index, avglatency); + if (latency_interval <= (2 * standard_deviation)) + condlog(3, "%s: args latency_interval set is smaller than 2 * standard deviation (%lld us), or equal", + pp->dev, standard_deviation); + + rc = (int)(THRES_USEC_VALUE - (avglatency/(long long)latency_interval)); + return rc; +} diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5 index 5939688..1507406 100644 --- a/multipath/multipath.conf.5 +++ b/multipath/multipath.conf.5 @@ -293,6 +293,10 @@ Generate a random priority between 1 and 10. Generate the path priority based on the regular expression and the priority provided as argument. Requires prio_args keyword. .TP +.I path_latency +Generate the path priority based on a latency algorithm. +Requires prio_args keyword. +.TP .I datacore .\" XXX ???. Requires prio_args keyword. @@ -333,6 +337,22 @@ these values can be looked up through sysfs or by running \fImultipathd show pat "%N:%R:%n:%r"\fR. For example: 0x200100e08ba0aea0:0x210100e08ba0aea0:.*:.* , .*:.*:iqn.2009-10.com.redhat.msp.lab.ask-06:.* .RE .TP 12 +.I path_latency +Needs a value of the form +\fI"<latency_interval>|<io_num>"\fR +.RS +.TP 8 +.I latency_interval +The interval values of average latency between two different neighbour ranks of path priority, used to partition +different priority ranks. Form: XXs, or XXXus, or XXXms. Unit: Second, or Microsecond, or Millisecond. Valid Values: +Integer, using logarithmic scale and base number is 10, s [1, 10], ms [1, 10000], us [1, 10000000], For example: +If latency_interval=10ms, the paths will be grouped in priority groups with path latency 0-10ms, 10-20ms, 20-30ms, etc. +.TP +.I io_num +The number of read IOs sent to the current path continuously, used to calculate the average path latency. +Valid Values: Integer, [2, 200]. +.RE +.TP 12 .I alua If \fIexclusive_pref_bit\fR is set, paths with the \fIpreferred path\fR bit set will always be in their own path group. -- 2.6.4.windows.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v5 1/1] multipath-tools: Prioritizer based on a latency algorithm 2017-06-09 9:58 ` [PATCH v5 1/1] " Yang Feng @ 2017-06-14 20:47 ` Martin Wilck 2017-06-16 9:12 ` Yang Feng 0 siblings, 1 reply; 4+ messages in thread From: Martin Wilck @ 2017-06-14 20:47 UTC (permalink / raw) To: Yang Feng, hare, christophe.varoqui, bmarzins, xose.vazquez Cc: zouming.zouming, guanjunxiong, shenhong09, dm-devel, hege09, qiuxin Hello Yang, thanks for your work, and apologies for making you wait so long. I still have some suggestions. Please see below. On Fri, 2017-06-09 at 17:58 +0800, Yang Feng wrote > libmultipath/prioritizers: Prioritizer for device mapper multipath, > where the corresponding priority values of specific paths are > provided > by a latency algorithm. And the latency algorithm is dependent on the > following arguments(latency_interval and io_num). > The principle of the algorithm is illustrated as follows: > 1. By sending a certain number "cons_num" of read IOs to the current > path continuously, the IOs' average latency can be calculated. > 2. According to the average latency of each path and the weight value > "latency_interval", the priority "rc" of each path can be provided. > > latency_interval latency_interval latency_interval laten > cy_interval > |------------------|------------------|------------------|...|---- > --------------| > | priority rank 1 | priority rank 2 | priority rank 3 > |...| priority rank x | > |------------------|------------------|------------------|...|---- > --------------| > Priority Rank Partitioning > > Signed-off-by: Yang Feng <philip.yang@huawei.com> > Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com> > Reviewed-by: Martin Wilck <mwilck@suse.com> > Reviewed-by: Xose Vazquez Perez <xose.vazquez@gmail.com> > Reviewed-by: Hannes Reinecke <hare@suse.de> > > +#define MAX_LATENCY_INTERVAL 10 /*unit: s*/ > +#define MIN_LATENCY_INTERVAL 1 /*unit: us, or ms, or s*/ Why not use the same unit or both values? > +static inline long long timeval_to_us(const struct timespec *tv) > +{ > + return ((long long) tv->tv_sec * USEC_PER_SEC) + (tv- > >tv_nsec >> 10); Please calculate correctly, divide by 1000. > +int check_args_valid(int io_num, long long latency_interval, int > type) > +{ > + if ((io_num < MIN_IO_NUM) || (io_num > MAX_IO_NUM)) > + { > + condlog(0, "args io_num is more than the valid values > range"); > + return 0; > + } Please write "outside the valid range" rather than "more than the valid values range". In general, when you use condlog(), please prefix all messages with a string that clearly identifies the path_latency prioritizer, e.g. PRIO_PATH_LATENCY. I know we're not currently doing that consistently in multipath-tools, but that shouldn't be a reason for not doing when adding new code. > + > + /* check value is set by using logarithmic scale, and base > number is 10 */ > + if ((latency_interval % BASE_NUMBER) != 0) > + { > + condlog(0, "args latency_interval is not set by using > logarithmic scale , where base number is 10"); > + return 0; > + } This is not what I meant with "logarithmic scale". See below. > + > + /* s:[1, 10], ms:[1, 10000], us:[1, 10000000] */ > + if ((latency_interval < MIN_LATENCY_INTERVAL) > + || (latency_interval > (MAX_LATENCY_INTERVAL * USEC_PER_SEC > / conversion_ratio[type]))) > + { > + condlog(0, "args latency_interval is more than the valid > values range"); > + return 0; > + } Again, please use "outside". > +/* In multipath.conf, args form: io_num|latency_interval. For > example, > +* args is "20|10ms", this function can get 20, 10. > +*/ > +static int get_interval_and_ionum(char *args, > + int *ionum, > + long long *interval) > +{ > + char source[MAX_CHAR_SIZE]; > + char vertica = '|'; > + char *endstrbefore = NULL; > + char *endstrafter = NULL; > + int type; > + unsigned int size = strlen(args); > + long long ratio; > + > + if ((args == NULL) || (ionum == NULL) || (interval == NULL)) > + { > + condlog(0, "args string is NULL"); > + return 0; > + } > + > + if ((size < 1) || (size > MAX_CHAR_SIZE-1)) > + { > + condlog(0, "args string's size is too long"); > + return 0; > + } "too long" is not correct for the first (more likely) error condition. > + > + memcpy(source, args, size+1); > + > + if (!isdigit(source[0])) > + { > + condlog(0, "args io_num string's first char is not digit"); > + return 0; > + } It would be ok to catch all these errors with a single line of code such as "path_latency: invalid prio_args format: %s". > + > + *ionum = (int)strtoul(source, &endstrbefore, 10); > + if (endstrbefore[0] != vertica) > + { > + condlog(0, "segmentation char is invalid"); > + return 0; > + } > + > + if (!isdigit(endstrbefore[1])) > + { > + condlog(0, "args latency_interval string's first char is not > digit"); > + return 0; > + } > + > + *interval = (long long)strtol(&endstrbefore[1], &endstrafter, > 10); > + type = get_interval_type(endstrafter); > + if (type == INTERVAL_INVALID) > + { > + condlog(0, "args latency_interval type is invalid"); > + return 0; > + } > + > + if (check_args_valid(*ionum, *interval, type) == 0) > + { > + return 0; > + } > + > + ratio = get_conversion_ratio(type); > + *interval *= (long long)ratio; > + > + return 1; > +} > + > +long long calc_standard_deviation(long long *path_latency, int size, > long long avglatency) > +{ > + int index; > + long long total = 0; > + > + for (index = 0; index < size; index++) > + { > + total += (path_latency[index] - avglatency) * > (path_latency[index] - avglatency); > + } > + > + total /= (size-1); > + > + return (long long)sqrt((double)total); > +} > + > +int getprio (struct path *pp, char *args, unsigned int timeout) > +{ > + int rc, temp; > + int index = 0; > + int io_num; > + long long latency_interval; > + long long avglatency; > + long long standard_deviation; > + long long toldelay = 0; > + long long before, after; > + struct timespec tv; > + > + if (pp->fd < 0) > + return -1; > + > + if (get_interval_and_ionum(args, &io_num, &latency_interval) == > 0) > + { > + condlog(0, "%s: get path_latency args fail", pp->dev); > + return DEFAULT_PRIORITY; > + } Could you just assume reasonable defaults for the path latency algorithm rather than returning a constant? > + > + memset(path_latency, 0, sizeof(path_latency)); > + > + temp = io_num; > + while (temp-- > 0) > + { > + (void)clock_gettime(CLOCK_MONOTONIC, &tv); > + before = timeval_to_us(&tv); > + > + if (do_readsector0(pp->fd, timeout) == 2) > + { > + condlog(0, "%s: path down", pp->dev); > + return -1; > + } > + > + (void)clock_gettime(CLOCK_MONOTONIC, &tv); > + after = timeval_to_us(&tv); > + > + path_latency[index] = after - before; > + toldelay += path_latency[index++]; > + } > + > + avglatency = toldelay/(long long)io_num; > + condlog(4, "%s: average latency is (%lld)", pp->dev, > avglatency); > + > + if (avglatency > THRES_USEC_VALUE) > + { > + condlog(0, "%s: average latency (%lld) is more than > thresold", pp->dev, avglatency); > + return DEFAULT_PRIORITY; > + } Why don't you put simply put this in the highest latency bin? See below. > + > + /* warn the user if the latency_interval set is smaller than (2 > * standard deviation), or equal */ > + standard_deviation = calc_standard_deviation(path_latency, > index, avglatency); > + if (latency_interval <= (2 * standard_deviation)) > + condlog(3, "%s: args latency_interval set is smaller than 2 > * standard deviation (%lld us), or equal", > + pp->dev, standard_deviation); > + > + rc = (int)(THRES_USEC_VALUE - (avglatency/(long > long)latency_interval)); > + return rc; > +} Hm, I think you misunderstood my "logarithmic scale" idea. By dividing the average by the interval size, you get ordinary linear scaling again. That doesn't help the problem. Here is a sample program illustrating what I meant. It uses math.h functions, but that shouldn't be a problem; we're running in user space. The chosen interval size, corresponding to a factor of 1.995, is probably a good choice for many scenarios. Your could represent the range between 1us and 10s with 34 prio values, and still be able to differentiate paths with, say, 1.5 and 3 us latency. If you use this, you should also use double type for calculating the average and stddev, and you must be careful to calculate the standard deviation for the logarithms, as interval size is not constant any more (only on logarithmic scale). #include <stdio.h> #include <math.h> /* * For a given positive value x, return a "prio" between 0 and n * using logarithmically scaled intervals between minval and maxval. */ int log_prio(double x, unsigned long minval, unsigned long maxval, unsigned n) { double lx, lmin, lmax; if (x <= minval) return n; if (x >= maxval) return 0; lx = log10(x); lmin = log10(minval); lmax = log10(maxval); return n - (n - 1) * (lx - lmin) / (lmax - lmin); } int main(void) { int minval = 1; int maxval = 1000; int n = 11; double x; for (x = minval/2.; x < maxval*2.; x *= sqrt(2.)) printf("%f: %d\n", x, log_prio(x, minval, maxval, n)); return 0; } Regards, Martin -- Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107 SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v5 1/1] multipath-tools: Prioritizer based on a latency algorithm 2017-06-14 20:47 ` Martin Wilck @ 2017-06-16 9:12 ` Yang Feng 0 siblings, 0 replies; 4+ messages in thread From: Yang Feng @ 2017-06-16 9:12 UTC (permalink / raw) To: Martin Wilck, hare, christophe.varoqui, bmarzins, xose.vazquez Cc: zouming.zouming, guanjunxiong, shenhong09, dm-devel, hege09, qiuxin Hi Martin, Thanks for your reviews. Please find my replys. And the patch v6 will be sent later. Regards, Yang --- > >> +#define MAX_LATENCY_INTERVAL 10 /*unit: s*/ >> +#define MIN_LATENCY_INTERVAL 1 /*unit: us, or ms, or s*/ > > Why not use the same unit or both values? > It's easy to check the valid range of the "latency_interval" value. For unit: us, or ms, or s, the min digital value is the same, is 1. And for all unit, the max conversion value is the same, is 10 seconds. >> +static inline long long timeval_to_us(const struct timespec *tv) >> +{ >> + return ((long long) tv->tv_sec * USEC_PER_SEC) + (tv- >>> tv_nsec >> 10); > > Please calculate correctly, divide by 1000. > Thanks, it will be fixed. >> +int check_args_valid(int io_num, long long latency_interval, int >> type) >> +{ >> + if ((io_num < MIN_IO_NUM) || (io_num > MAX_IO_NUM)) >> + { >> + condlog(0, "args io_num is more than the valid values >> range"); >> + return 0; >> + } > > Please write "outside the valid range" rather than "more than the valid > values range". > Thanks, it will be fixed. > In general, when you use condlog(), please prefix all messages with a > string that clearly identifies the path_latency prioritizer, e.g. > PRIO_PATH_LATENCY. > I know we're not currently doing that consistently in multipath-tools, > but that shouldn't be a reason for not doing when adding new code. > Good, thanks, it will be fixed. >> + >> + /* check value is set by using logarithmic scale, and base >> number is 10 */ >> + if ((latency_interval % BASE_NUMBER) != 0) >> + { >> + condlog(0, "args latency_interval is not set by using >> logarithmic scale , where base number is 10"); >> + return 0; >> + } > > This is not what I meant with "logarithmic scale". See below. > >> + >> + /* s:[1, 10], ms:[1, 10000], us:[1, 10000000] */ >> + if ((latency_interval < MIN_LATENCY_INTERVAL) >> + || (latency_interval > (MAX_LATENCY_INTERVAL * USEC_PER_SEC >> / conversion_ratio[type]))) >> + { >> + condlog(0, "args latency_interval is more than the valid >> values range"); >> + return 0; >> + } > > Again, please use "outside". > Thanks, it will be fixed. >> +/* In multipath.conf, args form: io_num|latency_interval. For >> example, >> +* args is "20|10ms", this function can get 20, 10. >> +*/ >> +static int get_interval_and_ionum(char *args, >> + int *ionum, >> + long long *interval) >> +{ >> + char source[MAX_CHAR_SIZE]; >> + char vertica = '|'; >> + char *endstrbefore = NULL; >> + char *endstrafter = NULL; >> + int type; >> + unsigned int size = strlen(args); >> + long long ratio; >> + >> + if ((args == NULL) || (ionum == NULL) || (interval == NULL)) >> + { >> + condlog(0, "args string is NULL"); >> + return 0; >> + } >> + >> + if ((size < 1) || (size > MAX_CHAR_SIZE-1)) >> + { >> + condlog(0, "args string's size is too long"); >> + return 0; >> + } > > "too long" is not correct for the first (more likely) error condition. > >> + >> + memcpy(source, args, size+1); >> + >> + if (!isdigit(source[0])) >> + { >> + condlog(0, "args io_num string's first char is not digit"); >> + return 0; >> + } > > It would be ok to catch all these errors with a single line of code > such as "path_latency: invalid prio_args format: %s". > OK, it will be fixed. >> + >> + *ionum = (int)strtoul(source, &endstrbefore, 10); >> + if (endstrbefore[0] != vertica) >> + { >> + condlog(0, "segmentation char is invalid"); >> + return 0; >> + } >> + >> + if (!isdigit(endstrbefore[1])) >> + { >> + condlog(0, "args latency_interval string's first char is not >> digit"); >> + return 0; >> + } >> + >> + *interval = (long long)strtol(&endstrbefore[1], &endstrafter, >> 10); >> + type = get_interval_type(endstrafter); >> + if (type == INTERVAL_INVALID) >> + { >> + condlog(0, "args latency_interval type is invalid"); >> + return 0; >> + } >> + >> + if (check_args_valid(*ionum, *interval, type) == 0) >> + { >> + return 0; >> + } >> + >> + ratio = get_conversion_ratio(type); >> + *interval *= (long long)ratio; >> + >> + return 1; >> +} >> + >> +long long calc_standard_deviation(long long *path_latency, int size, >> long long avglatency) >> +{ >> + int index; >> + long long total = 0; >> + >> + for (index = 0; index < size; index++) >> + { >> + total += (path_latency[index] - avglatency) * >> (path_latency[index] - avglatency); >> + } >> + >> + total /= (size-1); >> + >> + return (long long)sqrt((double)total); >> +} >> + >> +int getprio (struct path *pp, char *args, unsigned int timeout) >> +{ >> + int rc, temp; >> + int index = 0; >> + int io_num; >> + long long latency_interval; >> + long long avglatency; >> + long long standard_deviation; >> + long long toldelay = 0; >> + long long before, after; >> + struct timespec tv; >> + >> + if (pp->fd < 0) >> + return -1; >> + >> + if (get_interval_and_ionum(args, &io_num, &latency_interval) == >> 0) >> + { >> + condlog(0, "%s: get path_latency args fail", pp->dev); >> + return DEFAULT_PRIORITY; >> + } > > Could you just assume reasonable defaults for the path latency > algorithm rather than returning a constant? > En, I think returning a DEFAULT_PRIORITY "0" is no problem. If return a assumed value, then the user can be confused, because the defaults is not set. >> + >> + memset(path_latency, 0, sizeof(path_latency)); >> + >> + temp = io_num; >> + while (temp-- > 0) >> + { >> + (void)clock_gettime(CLOCK_MONOTONIC, &tv); >> + before = timeval_to_us(&tv); >> + >> + if (do_readsector0(pp->fd, timeout) == 2) >> + { >> + condlog(0, "%s: path down", pp->dev); >> + return -1; >> + } >> + >> + (void)clock_gettime(CLOCK_MONOTONIC, &tv); >> + after = timeval_to_us(&tv); >> + >> + path_latency[index] = after - before; >> + toldelay += path_latency[index++]; >> + } >> + >> + avglatency = toldelay/(long long)io_num; >> + condlog(4, "%s: average latency is (%lld)", pp->dev, >> avglatency); >> + >> + if (avglatency > THRES_USEC_VALUE) >> + { >> + condlog(0, "%s: average latency (%lld) is more than >> thresold", pp->dev, avglatency); >> + return DEFAULT_PRIORITY; >> + } > > Why don't you put simply put this in the highest latency bin? See > below. > Thanks, it will be fixed. >> + >> + /* warn the user if the latency_interval set is smaller than (2 >> * standard deviation), or equal */ >> + standard_deviation = calc_standard_deviation(path_latency, >> index, avglatency); >> + if (latency_interval <= (2 * standard_deviation)) >> + condlog(3, "%s: args latency_interval set is smaller than 2 >> * standard deviation (%lld us), or equal", >> + pp->dev, standard_deviation); >> + >> + rc = (int)(THRES_USEC_VALUE - (avglatency/(long >> long)latency_interval)); >> + return rc; >> +} > > Hm, I think you misunderstood my "logarithmic scale" idea. > By dividing the average by the interval size, you get ordinary linear > scaling again. That doesn't help the problem. > > Here is a sample program illustrating what I meant. It uses math.h > functions, but that shouldn't be a problem; we're running in user > space. The chosen interval size, corresponding to a factor of 1.995, is > probably a good choice for many scenarios. Your could represent the > range between 1us and 10s with 34 prio values, and still be able to > differentiate paths with, say, 1.5 and 3 us latency. > > If you use this, you should also use double type for calculating the > average and stddev, and you must be careful to calculate the standard > deviation for the logarithms, as interval size is not constant any more > (only on logarithmic scale). > > #include <stdio.h> > #include <math.h> > > /* > * For a given positive value x, return a "prio" between 0 and n > * using logarithmically scaled intervals between minval and maxval. > */ > > int log_prio(double x, > unsigned long minval, unsigned long maxval, unsigned n) > { > double lx, lmin, lmax; > if (x <= minval) > return n; > if (x >= maxval) > return 0; > lx = log10(x); > lmin = log10(minval); > lmax = log10(maxval); > return n - (n - 1) * (lx - lmin) / (lmax - lmin); > } > > int main(void) > { > int minval = 1; > int maxval = 1000; > int n = 11; > double x; > for (x = minval/2.; x < maxval*2.; x *= sqrt(2.)) > printf("%f: %d\n", x, log_prio(x, minval, maxval, n)); > return 0; > } > OK, the argument "latency_interval" will be delete, instead, the argument "base_num" will be added for logarithmic scale. And min value will be set 1 us, max value will be set 10000000 us (or 10 s). > Regards, > Martin > ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-06-16 9:12 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-06-09 9:58 [PATCH v5 0/1] multipath-tools: Prioritizer based on a latency algorithm Yang Feng 2017-06-09 9:58 ` [PATCH v5 1/1] " Yang Feng 2017-06-14 20:47 ` Martin Wilck 2017-06-16 9:12 ` Yang Feng
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.