* [RFC PATCH 0/4] dm-mpath: dynamic load balancers
@ 2008-09-12 14:50 Kiyoshi Ueda
2008-09-12 14:52 ` [RFC PATCH 1/4] dm-mpath: add a path selector interface Kiyoshi Ueda
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Kiyoshi Ueda @ 2008-09-12 14:50 UTC (permalink / raw)
To: dm-devel, linux-scsi; +Cc: j-nomura, k-ueda, stefan.bader
Hi,
The following patches add two dynamic load balancers
to request-based dm-multipath:
o queue-length oriented dynamic load balancer, dm-queue-length.
o service-time oriented dynamic load balancer, dm-service-time.
This patch-set is not ready for inclusion in the upstream kernel,
but I'm posting it now to get comments from multipath-related people.
Any comments are welcome.
This patch-set can be applied on top of 2.6.27-rc6 + Alasdair's
linux-next patches below + request-based dm patches(*).
dm-mpath-use-more-error-codes.patch
dm-mpath-remove-is_active-from-struct-dm_path.patch
(*) http://lkml.org/lkml/2008/9/12/100
Summary of the patch-set:
1/4: dm-mpath: add a path-selector interface
2/4: dm-mpath: add queue-length oriented dynamic load balancer (dlb)
3/4: dm-mpath: interface change for service-time oriented dlb
4/4: dm-mpath: add service-time oriented dynamic load balancer (dlb)
Thanks,
Kiyoshi Ueda
* [RFC PATCH 1/4] dm-mpath: add a path selector interface
2008-09-12 14:50 [RFC PATCH 0/4] dm-mpath: dynamic load balancers Kiyoshi Ueda
@ 2008-09-12 14:52 ` Kiyoshi Ueda
2008-09-12 14:52 ` [RFC PATCH 2/4] dm-mpath: add queue-length oriented dynamic load balancer Kiyoshi Ueda
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Kiyoshi Ueda @ 2008-09-12 14:52 UTC (permalink / raw)
To: dm-devel, linux-scsi; +Cc: j-nomura, k-ueda, stefan.bader
This patch adds a new hook to the dm path selector interface: start_io.
Target drivers should call this hook before submitting I/O to
the selected path.
Path selectors can use it to start accounting for the I/O
(e.g. counting the number of in-flight I/Os).
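As a rough illustration of the hook pairing described above, here is a minimal user-space sketch. The types `struct path` and `struct selector_ops` and the helpers `submit_io()` and `complete_io()` are hypothetical stand-ins, not the kernel structures touched by this patch; the only idea carried over is that the hook is optional and is called just before the I/O is dispatched.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a path and its accounting state. */
struct path {
	int in_flight;		/* I/Os started but not yet completed */
};

/* Hypothetical hook table; a selector may leave either hook NULL. */
struct selector_ops {
	int (*start_io)(struct path *p);
	int (*end_io)(struct path *p);
};

/* Accounting hooks: count I/Os as they start and finish. */
static int count_start_io(struct path *p)
{
	p->in_flight++;
	return 0;
}

static int count_end_io(struct path *p)
{
	p->in_flight--;
	return 0;
}

/* Caller side: invoke start_io only when the selector provides it. */
static void submit_io(const struct selector_ops *ops, struct path *p)
{
	if (ops->start_io)
		ops->start_io(p);
	/* ... the I/O would be dispatched to the path here ... */
}

/* Completion side: the matching end_io undoes the accounting. */
static void complete_io(const struct selector_ops *ops, struct path *p)
{
	if (ops->end_io)
		ops->end_io(p);
}
```

A selector that does not need accounting (round-robin, for example) would simply leave both hooks NULL in this sketch, and the caller skips them.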
The code is based on the patch posted by Stefan Bader:
https://www.redhat.com/archives/dm-devel/2005-October/msg00050.html
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
---
drivers/md/dm-mpath.c | 3 +++
drivers/md/dm-path-selector.h | 1 +
2 files changed, 4 insertions(+)
Index: 2.6.27-rc6/drivers/md/dm-mpath.c
===================================================================
--- 2.6.27-rc6.orig/drivers/md/dm-mpath.c
+++ 2.6.27-rc6/drivers/md/dm-mpath.c
@@ -343,6 +343,9 @@ static int map_io(struct multipath *m, s
mpio->pgpath = pgpath;
+ if (r == DM_MAPIO_REMAPPED && pgpath->pg->ps.type->start_io)
+ pgpath->pg->ps.type->start_io(&pgpath->pg->ps, &pgpath->path);
+
spin_unlock_irqrestore(&m->lock, flags);
return r;
Index: 2.6.27-rc6/drivers/md/dm-path-selector.h
===================================================================
--- 2.6.27-rc6.orig/drivers/md/dm-path-selector.h
+++ 2.6.27-rc6/drivers/md/dm-path-selector.h
@@ -75,6 +75,7 @@ struct path_selector_type {
int (*status) (struct path_selector *ps, struct dm_path *path,
status_type_t type, char *result, unsigned int maxlen);
+ int (*start_io) (struct path_selector *ps, struct dm_path *path);
int (*end_io) (struct path_selector *ps, struct dm_path *path);
};
* [RFC PATCH 2/4] dm-mpath: add queue-length oriented dynamic load balancer
2008-09-12 14:50 [RFC PATCH 0/4] dm-mpath: dynamic load balancers Kiyoshi Ueda
2008-09-12 14:52 ` [RFC PATCH 1/4] dm-mpath: add a path selector interface Kiyoshi Ueda
@ 2008-09-12 14:52 ` Kiyoshi Ueda
2008-09-12 14:53 ` [RFC PATCH 3/4] dm-mpath: interface change for service-time " Kiyoshi Ueda
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Kiyoshi Ueda @ 2008-09-12 14:52 UTC (permalink / raw)
To: dm-devel, linux-scsi; +Cc: j-nomura, k-ueda, stefan.bader
This patch adds a dynamic load balancer, dm-queue-length, which
balances the number of in-flight I/Os across paths by selecting
the path with the fewest in-flight I/Os.
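The selection rule can be sketched in plain C as follows. This is a simplified illustration under assumptions: `struct ql_path` and `pick_shortest_queue()` are hypothetical names, and an array stands in for the linked list of valid paths used by the patch.

```c
#include <stddef.h>

/* Hypothetical, simplified view of one path and its queue length. */
struct ql_path {
	const char *name;
	int qlen;		/* number of in-flight I/Os on this path */
};

/*
 * Return the path with the fewest in-flight I/Os, or NULL when there
 * are no valid paths.  The strict '<' keeps the earliest path on ties.
 */
static struct ql_path *pick_shortest_queue(struct ql_path *paths, size_t n)
{
	struct ql_path *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!best || paths[i].qlen < best->qlen)
			best = &paths[i];
	}

	return best;
}
```

Because ties go to the first candidate, the patch rotates the list head before each scan so that equally loaded paths take turns; the sketch leaves that rotation out.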
The code is based on the patch posted by Stefan Bader:
https://www.redhat.com/archives/dm-devel/2005-October/msg00050.html
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
---
drivers/md/Makefile | 3
drivers/md/dm-queue-length.c | 257 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 259 insertions(+), 1 deletion(-)
Index: 2.6.27-rc6/drivers/md/dm-queue-length.c
===================================================================
--- /dev/null
+++ 2.6.27-rc6/drivers/md/dm-queue-length.c
@@ -0,0 +1,257 @@
+/*
+ * Copyright (C) 2004-2005 IBM Corp. All Rights Reserved.
+ * Copyright (C) 2006-2008 NEC Corporation.
+ *
+ * dm-queue-length.c
+ *
+ * Module Author: Stefan Bader, IBM
+ * Modified by: Kiyoshi Ueda, NEC
+ *
+ * This file is released under the GPL.
+ *
+ * Load balancing path selector.
+ */
+
+#include "dm.h"
+#include "dm-path-selector.h"
+
+#include <linux/slab.h>
+#include <linux/ctype.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <asm/atomic.h>
+
+#define DM_MSG_PREFIX "multipath queue-length"
+#define QL_MIN_IO 128
+#define QL_VERSION "0.1.0"
+
+struct selector {
+ struct list_head valid_paths;
+ struct list_head failed_paths;
+};
+
+struct path_info {
+ struct list_head list;
+ struct dm_path *path;
+ unsigned int repeat_count;
+ atomic_t qlen;
+};
+
+static struct selector *alloc_selector(void)
+{
+ struct selector *s = kzalloc(sizeof(*s), GFP_KERNEL);
+
+ if (s) {
+ INIT_LIST_HEAD(&s->valid_paths);
+ INIT_LIST_HEAD(&s->failed_paths);
+ }
+
+ return s;
+}
+
+static int ql_create(struct path_selector *ps, unsigned argc, char **argv)
+{
+ struct selector *s = alloc_selector();
+
+ if (!s)
+ return -ENOMEM;
+
+ ps->context = s;
+
+ return 0;
+}
+
+static void ql_free_paths(struct list_head *paths)
+{
+ struct path_info *cpi, *npi;
+
+ list_for_each_entry_safe(cpi, npi, paths, list) {
+ list_del(&cpi->list);
+ cpi->path->pscontext = NULL;
+ kfree(cpi);
+ }
+}
+
+static void ql_destroy(struct path_selector *ps)
+{
+ struct selector *s = (struct selector *) ps->context;
+
+ ql_free_paths(&s->valid_paths);
+ ql_free_paths(&s->failed_paths);
+ kfree(s);
+ ps->context = NULL;
+}
+
+static int ql_add_path(struct path_selector *ps, struct dm_path *path,
+ int argc, char **argv, char **error)
+{
+ struct selector *s = (struct selector *) ps->context;
+ struct path_info *pi;
+ unsigned int repeat_count = QL_MIN_IO;
+
+ /* Parse the arguments */
+ if (argc > 1) {
+ *error = "queue-length ps: incorrect number of arguments";
+ return -EINVAL;
+ }
+
+ /* First path argument is number of I/Os before switching path. */
+ if ((argc == 1) && (sscanf(argv[0], "%u", &repeat_count) != 1)) {
+ *error = "queue-length ps: invalid repeat count";
+ return -EINVAL;
+ }
+
+ /* Allocate the path information structure */
+ pi = kmalloc(sizeof(*pi), GFP_KERNEL);
+ if (!pi) {
+ *error = "queue-length ps: Error allocating path information";
+ return -ENOMEM;
+ }
+
+ pi->path = path;
+ pi->repeat_count = repeat_count;
+ atomic_set(&pi->qlen, 0);
+ path->pscontext = pi;
+
+ list_add_tail(&pi->list, &s->valid_paths);
+
+ return 0;
+}
+
+static void ql_fail_path(struct path_selector *ps, struct dm_path *path)
+{
+ struct selector *s = (struct selector *) ps->context;
+ struct path_info *pi = path->pscontext;
+
+ list_move(&pi->list, &s->failed_paths);
+}
+
+static int ql_reinstate_path(struct path_selector *ps, struct dm_path *path)
+{
+ struct selector *s = (struct selector *) ps->context;
+ struct path_info *pi = path->pscontext;
+
+ list_move_tail(&pi->list, &s->valid_paths);
+
+ return 0;
+}
+
+static inline int ql_compare_qlen(struct path_info *pi1, struct path_info *pi2)
+{
+ return atomic_read(&pi1->qlen) - atomic_read(&pi2->qlen);
+}
+
+static struct dm_path *ql_select_path(struct path_selector *ps,
+ unsigned *repeat_count)
+{
+ struct selector *s = (struct selector *) ps->context;
+ struct path_info *cpi = NULL, *spi = NULL;
+
+ if (list_empty(&s->valid_paths))
+ return NULL;
+
+ /* Change preferred (first in list) path to evenly balance. */
+ list_move_tail(s->valid_paths.next, &s->valid_paths);
+
+ list_for_each_entry(cpi, &s->valid_paths, list) {
+ if (!spi)
+ spi = cpi;
+ else if (ql_compare_qlen(cpi, spi) < 0)
+ spi = cpi;
+ }
+
+ if (spi)
+ *repeat_count = spi->repeat_count;
+
+ return spi ? spi->path : NULL;
+}
+
+static int ql_start_io(struct path_selector *ps, struct dm_path *path)
+{
+ struct path_info *pi = path->pscontext;
+
+ atomic_inc(&pi->qlen);
+
+ return 0;
+}
+
+static int ql_end_io(struct path_selector *ps, struct dm_path *path)
+{
+ struct path_info *pi = path->pscontext;
+
+ atomic_dec(&pi->qlen);
+
+ return 0;
+}
+
+static int ql_status(struct path_selector *ps, struct dm_path *path,
+ status_type_t type, char *result, unsigned int maxlen)
+{
+ int sz = 0;
+ struct path_info *pi;
+
+ /* When called with (path == NULL), return selector status/args. */
+ if (!path)
+ DMEMIT("0 ");
+ else {
+ pi = path->pscontext;
+
+ switch (type) {
+ case STATUSTYPE_INFO:
+ DMEMIT("%u ", atomic_read(&pi->qlen));
+ break;
+ case STATUSTYPE_TABLE:
+ DMEMIT("%u ", pi->repeat_count);
+ break;
+ }
+ }
+
+ return sz;
+}
+
+static struct path_selector_type ql_ps = {
+ .name = "queue-length",
+ .module = THIS_MODULE,
+ .table_args = 1,
+ .info_args = 1,
+ .create = ql_create,
+ .destroy = ql_destroy,
+ .status = ql_status,
+ .add_path = ql_add_path,
+ .fail_path = ql_fail_path,
+ .reinstate_path = ql_reinstate_path,
+ .select_path = ql_select_path,
+ .start_io = ql_start_io,
+ .end_io = ql_end_io,
+};
+
+static int __init dm_ql_init(void)
+{
+ int r = dm_register_path_selector(&ql_ps);
+
+ if (r < 0)
+ DMERR("register failed %d", r);
+ else
+ DMINFO("version " QL_VERSION " loaded");
+
+ return r;
+}
+
+static void __exit dm_ql_exit(void)
+{
+ int r = dm_unregister_path_selector(&ql_ps);
+
+ if (r < 0)
+ DMERR("unregister failed %d", r);
+}
+
+module_init(dm_ql_init);
+module_exit(dm_ql_exit);
+
+MODULE_AUTHOR("Stefan Bader <Stefan.Bader at de.ibm.com>");
+MODULE_DESCRIPTION(
+ "(C) Copyright IBM Corp. 2004,2005 All Rights Reserved.\n"
+ DM_NAME " load balancing path selector (dm-queue-length.c version "
+ QL_VERSION ")"
+);
+MODULE_LICENSE("GPL");
Index: 2.6.27-rc6/drivers/md/Makefile
===================================================================
--- 2.6.27-rc6.orig/drivers/md/Makefile
+++ 2.6.27-rc6/drivers/md/Makefile
@@ -32,7 +32,8 @@ obj-$(CONFIG_BLK_DEV_MD) += md-mod.o
obj-$(CONFIG_BLK_DEV_DM) += dm-mod.o
obj-$(CONFIG_DM_CRYPT) += dm-crypt.o
obj-$(CONFIG_DM_DELAY) += dm-delay.o
-obj-$(CONFIG_DM_MULTIPATH) += dm-multipath.o dm-round-robin.o
+obj-$(CONFIG_DM_MULTIPATH) += dm-multipath.o dm-round-robin.o \
+ dm-queue-length.o
obj-$(CONFIG_DM_SNAPSHOT) += dm-snapshot.o
obj-$(CONFIG_DM_MIRROR) += dm-mirror.o dm-log.o
obj-$(CONFIG_DM_ZERO) += dm-zero.o
* [RFC PATCH 3/4] dm-mpath: interface change for service-time oriented dynamic load balancer
2008-09-12 14:50 [RFC PATCH 0/4] dm-mpath: dynamic load balancers Kiyoshi Ueda
2008-09-12 14:52 ` [RFC PATCH 1/4] dm-mpath: add a path selector interface Kiyoshi Ueda
2008-09-12 14:52 ` [RFC PATCH 2/4] dm-mpath: add queue-length oriented dynamic load balancer Kiyoshi Ueda
@ 2008-09-12 14:53 ` Kiyoshi Ueda
2008-09-12 14:53 ` [RFC PATCH 4/4] dm-mpath: add " Kiyoshi Ueda
2009-01-28 15:37 ` [RFC PATCH 0/4] dm-mpath: dynamic load balancers Alasdair G Kergon
4 siblings, 0 replies; 9+ messages in thread
From: Kiyoshi Ueda @ 2008-09-12 14:53 UTC (permalink / raw)
To: dm-devel, linux-scsi; +Cc: j-nomura, k-ueda, stefan.bader
This patch changes the path selector interfaces for the service-time
oriented dynamic load balancer.
To calculate the service time for an incoming I/O correctly,
the load balancer needs the size of the incoming I/O when selecting
the next path.
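To see why the I/O size matters, consider a sketch of the comparison. The estimated service time of a path is (in-flight bytes + incoming bytes) / throughput, so the same two paths can rank differently for a small and a large request. The names `struct st_path` and `st_compare()` are hypothetical, and the sketch assumes a non-zero throughput on both paths and products that fit in 64 bits; it cross-multiplies rather than divides, which is not necessarily how the patch itself computes the result.

```c
#include <stdint.h>

/* Hypothetical, simplified per-path state. */
struct st_path {
	uint64_t in_flight;	/* bytes currently in flight */
	uint64_t perf;		/* recent throughput, bytes per ms, non-zero */
};

/*
 * Compare estimated service times for an incoming I/O of nr_bytes.
 * Returns <0 if path a should finish sooner, >0 if path b should,
 * and 0 on a tie.
 *
 *   (a.in_flight + n) / a.perf  <  (b.in_flight + n) / b.perf
 * is equivalent to
 *   (a.in_flight + n) * b.perf  <  (b.in_flight + n) * a.perf
 * because both perf values are positive.
 */
static int st_compare(const struct st_path *a, const struct st_path *b,
		      uint64_t nr_bytes)
{
	uint64_t lhs = (a->in_flight + nr_bytes) * b->perf;
	uint64_t rhs = (b->in_flight + nr_bytes) * a->perf;

	if (lhs < rhs)
		return -1;
	if (lhs > rhs)
		return 1;
	return 0;
}
```

For example, with a busy fast path (1 MiB in flight, 200 bytes/ms) and an idle slow path (100 bytes/ms), a 4 KiB request is expected to finish sooner on the idle path, while a 4 MiB request is expected to finish sooner on the fast one.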
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
---
drivers/md/dm-mpath.c | 27 ++++++++++++++++-----------
drivers/md/dm-path-selector.h | 9 ++++++---
drivers/md/dm-queue-length.c | 8 +++++---
drivers/md/dm-round-robin.c | 2 +-
4 files changed, 28 insertions(+), 18 deletions(-)
Index: 2.6.27-rc6/drivers/md/dm-mpath.c
===================================================================
--- 2.6.27-rc6.orig/drivers/md/dm-mpath.c
+++ 2.6.27-rc6/drivers/md/dm-mpath.c
@@ -97,6 +97,7 @@ struct multipath {
*/
struct dm_mpath_io {
struct pgpath *pgpath;
+ size_t nr_bytes;
};
typedef int (*action_fn) (struct pgpath *pgpath);
@@ -230,11 +231,12 @@ static void __switch_pg(struct multipath
m->pg_init_count = 0;
}
-static int __choose_path_in_pg(struct multipath *m, struct priority_group *pg)
+static int __choose_path_in_pg(struct multipath *m, struct priority_group *pg,
+ size_t nr_bytes)
{
struct dm_path *path;
- path = pg->ps.type->select_path(&pg->ps, &m->repeat_count);
+ path = pg->ps.type->select_path(&pg->ps, &m->repeat_count, nr_bytes);
if (!path)
return -ENXIO;
@@ -246,7 +248,7 @@ static int __choose_path_in_pg(struct mu
return 0;
}
-static void __choose_pgpath(struct multipath *m)
+static void __choose_pgpath(struct multipath *m, size_t nr_bytes)
{
struct priority_group *pg;
unsigned bypassed = 1;
@@ -258,12 +260,12 @@ static void __choose_pgpath(struct multi
if (m->next_pg) {
pg = m->next_pg;
m->next_pg = NULL;
- if (!__choose_path_in_pg(m, pg))
+ if (!__choose_path_in_pg(m, pg, nr_bytes))
return;
}
/* Don't change PG until it has no remaining paths */
- if (m->current_pg && !__choose_path_in_pg(m, m->current_pg))
+ if (m->current_pg && !__choose_path_in_pg(m, m->current_pg, nr_bytes))
return;
/*
@@ -275,7 +277,7 @@ static void __choose_pgpath(struct multi
list_for_each_entry(pg, &m->priority_groups, list) {
if (pg->bypassed == bypassed)
continue;
- if (!__choose_path_in_pg(m, pg))
+ if (!__choose_path_in_pg(m, pg, nr_bytes))
return;
}
} while (bypassed--);
@@ -306,6 +308,7 @@ static int map_io(struct multipath *m, s
struct dm_mpath_io *mpio, unsigned was_queued)
{
int r = DM_MAPIO_REMAPPED;
+ size_t nr_bytes = blk_rq_bytes(clone);
unsigned long flags;
struct pgpath *pgpath;
struct block_device *bdev;
@@ -315,7 +318,7 @@ static int map_io(struct multipath *m, s
/* Do we need to select a new pgpath? */
if (!m->current_pgpath ||
(!m->queue_io && (m->repeat_count && --m->repeat_count == 0)))
- __choose_pgpath(m);
+ __choose_pgpath(m, nr_bytes);
pgpath = m->current_pgpath;
@@ -342,9 +345,11 @@ static int map_io(struct multipath *m, s
r = -EIO; /* Failed */
mpio->pgpath = pgpath;
+ mpio->nr_bytes = nr_bytes;
if (r == DM_MAPIO_REMAPPED && pgpath->pg->ps.type->start_io)
- pgpath->pg->ps.type->start_io(&pgpath->pg->ps, &pgpath->path);
+ pgpath->pg->ps.type->start_io(&pgpath->pg->ps, &pgpath->path,
+ nr_bytes);
spin_unlock_irqrestore(&m->lock, flags);
@@ -424,7 +429,7 @@ static void process_queued_ios(struct wo
goto out;
if (!m->current_pgpath)
- __choose_pgpath(m);
+ __choose_pgpath(m, 1 << 19); /* Assume 512 KB */
pgpath = m->current_pgpath;
@@ -1160,7 +1165,7 @@ static int multipath_end_io(struct dm_ta
if (pgpath) {
ps = &pgpath->pg->ps;
if (ps->type->end_io)
- ps->type->end_io(ps, &pgpath->path);
+ ps->type->end_io(ps, &pgpath->path, mpio->nr_bytes);
}
mempool_free(mpio, m->mpio_pool);
@@ -1379,7 +1384,7 @@ static int multipath_ioctl(struct dm_tar
spin_lock_irqsave(&m->lock, flags);
if (!m->current_pgpath)
- __choose_pgpath(m);
+ __choose_pgpath(m, 1 << 19); /* Assume 512KB */
if (m->current_pgpath) {
bdev = m->current_pgpath->path.dev->bdev;
Index: 2.6.27-rc6/drivers/md/dm-path-selector.h
===================================================================
--- 2.6.27-rc6.orig/drivers/md/dm-path-selector.h
+++ 2.6.27-rc6/drivers/md/dm-path-selector.h
@@ -56,7 +56,8 @@ struct path_selector_type {
* the path fails.
*/
struct dm_path *(*select_path) (struct path_selector *ps,
- unsigned *repeat_count);
+ unsigned *repeat_count,
+ size_t nr_bytes);
/*
* Notify the selector that a path has failed.
@@ -75,8 +76,10 @@ struct path_selector_type {
int (*status) (struct path_selector *ps, struct dm_path *path,
status_type_t type, char *result, unsigned int maxlen);
- int (*start_io) (struct path_selector *ps, struct dm_path *path);
- int (*end_io) (struct path_selector *ps, struct dm_path *path);
+ int (*start_io) (struct path_selector *ps, struct dm_path *path,
+ size_t nr_bytes);
+ int (*end_io) (struct path_selector *ps, struct dm_path *path,
+ size_t nr_bytes);
};
/* Register a path selector */
Index: 2.6.27-rc6/drivers/md/dm-round-robin.c
===================================================================
--- 2.6.27-rc6.orig/drivers/md/dm-round-robin.c
+++ 2.6.27-rc6/drivers/md/dm-round-robin.c
@@ -160,7 +160,7 @@ static int rr_reinstate_path(struct path
}
static struct dm_path *rr_select_path(struct path_selector *ps,
- unsigned *repeat_count)
+ unsigned *repeat_count, size_t nr_bytes)
{
struct selector *s = (struct selector *) ps->context;
struct path_info *pi = NULL;
Index: 2.6.27-rc6/drivers/md/dm-queue-length.c
===================================================================
--- 2.6.27-rc6.orig/drivers/md/dm-queue-length.c
+++ 2.6.27-rc6/drivers/md/dm-queue-length.c
@@ -142,7 +142,7 @@ static inline int ql_compare_qlen(struct
}
static struct dm_path *ql_select_path(struct path_selector *ps,
- unsigned *repeat_count)
+ unsigned *repeat_count, size_t nr_bytes)
{
struct selector *s = (struct selector *) ps->context;
struct path_info *cpi = NULL, *spi = NULL;
@@ -166,7 +166,8 @@ static struct dm_path *ql_select_path(st
return spi ? spi->path : NULL;
}
-static int ql_start_io(struct path_selector *ps, struct dm_path *path)
+static int ql_start_io(struct path_selector *ps, struct dm_path *path,
+ size_t nr_bytes)
{
struct path_info *pi = path->pscontext;
@@ -175,7 +176,8 @@ static int ql_start_io(struct path_selec
return 0;
}
-static int ql_end_io(struct path_selector *ps, struct dm_path *path)
+static int ql_end_io(struct path_selector *ps, struct dm_path *path,
+ size_t nr_bytes)
{
struct path_info *pi = path->pscontext;
* [RFC PATCH 4/4] dm-mpath: add service-time oriented dynamic load balancer
2008-09-12 14:50 [RFC PATCH 0/4] dm-mpath: dynamic load balancers Kiyoshi Ueda
` (2 preceding siblings ...)
2008-09-12 14:53 ` [RFC PATCH 3/4] dm-mpath: interface change for service-time " Kiyoshi Ueda
@ 2008-09-12 14:53 ` Kiyoshi Ueda
2009-01-28 15:37 ` [RFC PATCH 0/4] dm-mpath: dynamic load balancers Alasdair G Kergon
4 siblings, 0 replies; 9+ messages in thread
From: Kiyoshi Ueda @ 2008-09-12 14:53 UTC (permalink / raw)
To: dm-devel, linux-scsi; +Cc: j-nomura, k-ueda, stefan.bader
This patch adds a service-time oriented dynamic load balancer,
dm-service-time.
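The selector needs a per-path throughput estimate. One way to obtain it, sketched here under assumptions, is to sample two cumulative counters (sectors transferred and busy time) and divide the deltas. `struct perf_state` and `perf_update()` are hypothetical names, and the busy time is taken directly in milliseconds instead of being converted from jiffies.

```c
#include <stdint.h>

/* Hypothetical per-path sampling state. */
struct perf_state {
	uint64_t last_sectors;	/* cumulative sectors at the last sample */
	uint64_t last_busy_ms;	/* cumulative busy time at the last sample */
	uint64_t perf;		/* bytes per ms; 0 means "no data yet" */
};

/*
 * Update the throughput estimate from new cumulative counters.
 * The estimate is left untouched unless both counters advanced,
 * which also rules out a division by zero.
 */
static void perf_update(struct perf_state *st, uint64_t sectors,
			uint64_t busy_ms)
{
	if (sectors == st->last_sectors || busy_ms == st->last_busy_ms)
		return;

	/* 512-byte sectors: convert the sector delta to bytes. */
	st->perf = ((sectors - st->last_sectors) * 512) /
		   (busy_ms - st->last_busy_ms);

	st->last_sectors = sectors;
	st->last_busy_ms = busy_ms;
}
```

Until the first update succeeds, `perf` stays 0, and a selector can fall back to comparing in-flight bytes alone.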
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
---
drivers/md/Makefile | 2
drivers/md/dm-service-time.c | 312 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 313 insertions(+), 1 deletion(-)
Index: 2.6.27-rc6/drivers/md/dm-service-time.c
===================================================================
--- /dev/null
+++ 2.6.27-rc6/drivers/md/dm-service-time.c
@@ -0,0 +1,312 @@
+/*
+ * Copyright (C) 2007-2008 NEC Corporation. All Rights Reserved.
+ *
+ * Module Author: Kiyoshi Ueda
+ *
+ * This file is released under the GPL.
+ *
+ * Throughput oriented path selector.
+ */
+
+#include "dm.h"
+#include "dm-path-selector.h"
+
+#define DM_MSG_PREFIX "multipath service-time"
+#define ST_MIN_IO 2
+#define ST_VERSION "0.1.0"
+
+struct selector {
+ struct list_head valid_paths;
+ struct list_head failed_paths;
+};
+
+struct path_info {
+ struct list_head list;
+ struct dm_path *path;
+ unsigned int repeat_count;
+
+ atomic_t in_flight; /* Total size of in-flight I/Os */
+ size_t perf; /* Recent performance of the path */
+ sector_t last_sectors; /* Total sectors of the last disk_stat_read */
+ size_t last_io_ticks; /* io_ticks of the last disk_stat_read */
+};
+
+static struct selector *alloc_selector(void)
+{
+ struct selector *s = kzalloc(sizeof(*s), GFP_KERNEL);
+
+ if (s) {
+ INIT_LIST_HEAD(&s->valid_paths);
+ INIT_LIST_HEAD(&s->failed_paths);
+ }
+
+ return s;
+}
+
+static int st_create(struct path_selector *ps, unsigned argc, char **argv)
+{
+ struct selector *s = alloc_selector();
+
+ if (!s)
+ return -ENOMEM;
+
+ ps->context = s;
+ return 0;
+}
+
+static void free_paths(struct list_head *paths)
+{
+ struct path_info *pi, *next;
+
+ list_for_each_entry_safe(pi, next, paths, list) {
+ list_del(&pi->list);
+ pi->path->pscontext = NULL;
+ kfree(pi);
+ }
+}
+
+static void st_destroy(struct path_selector *ps)
+{
+ struct selector *s = (struct selector *) ps->context;
+
+ free_paths(&s->valid_paths);
+ free_paths(&s->failed_paths);
+ kfree(s);
+ ps->context = NULL;
+}
+
+static int st_status(struct path_selector *ps, struct dm_path *path,
+ status_type_t type, char *result, unsigned int maxlen)
+{
+ int sz = 0;
+ struct path_info *pi;
+
+ if (!path)
+ DMEMIT("0 ");
+ else {
+ pi = path->pscontext;
+
+ switch (type) {
+ case STATUSTYPE_INFO:
+ DMEMIT("if:%08lu pf:%06lu ",
+ (unsigned long) atomic_read(&pi->in_flight),
+ pi->perf);
+ break;
+ case STATUSTYPE_TABLE:
+ DMEMIT("%u ", pi->repeat_count);
+ break;
+ }
+ }
+
+ return sz;
+}
+
+static int st_add_path(struct path_selector *ps, struct dm_path *path,
+ int argc, char **argv, char **error)
+{
+ struct selector *s = (struct selector *) ps->context;
+ struct path_info *pi;
+ unsigned int repeat_count = ST_MIN_IO;
+ struct gendisk *disk = path->dev->bdev->bd_disk;
+
+ if (argc > 1) {
+ *error = "service-time ps: incorrect number of arguments";
+ return -EINVAL;
+ }
+
+ /* First path argument is number of I/Os before switching path. */
+ if ((argc == 1) && (sscanf(argv[0], "%u", &repeat_count) != 1)) {
+ *error = "service-time ps: invalid repeat count";
+ return -EINVAL;
+ }
+
+ /* allocate the path */
+ pi = kmalloc(sizeof(*pi), GFP_KERNEL);
+ if (!pi) {
+ *error = "service-time ps: Error allocating path context";
+ return -ENOMEM;
+ }
+
+ pi->path = path;
+ pi->repeat_count = repeat_count;
+
+ pi->perf = 0;
+ pi->last_sectors = disk_stat_read(disk, sectors[READ])
+ + disk_stat_read(disk, sectors[WRITE]);
+ pi->last_io_ticks = disk_stat_read(disk, io_ticks);
+ atomic_set(&pi->in_flight, 0);
+
+ path->pscontext = pi;
+
+ list_add_tail(&pi->list, &s->valid_paths);
+
+ return 0;
+}
+
+static void st_fail_path(struct path_selector *ps, struct dm_path *path)
+{
+ struct selector *s = (struct selector *) ps->context;
+ struct path_info *pi = path->pscontext;
+
+ list_move(&pi->list, &s->failed_paths);
+}
+
+static int st_reinstate_path(struct path_selector *ps, struct dm_path *path)
+{
+ struct selector *s = (struct selector *) ps->context;
+ struct path_info *pi = path->pscontext;
+
+ list_move_tail(&pi->list, &s->valid_paths);
+
+ return 0;
+}
+
+static void stats_update(struct path_info *pi)
+{
+ sector_t sectors;
+ size_t io_ticks, tmp;
+ struct gendisk *disk = pi->path->dev->bdev->bd_disk;
+
+ sectors = disk_stat_read(disk, sectors[READ])
+ + disk_stat_read(disk, sectors[WRITE]);
+ io_ticks = disk_stat_read(disk, io_ticks);
+
+ if ((sectors != pi->last_sectors) && (io_ticks != pi->last_io_ticks)) {
+ tmp = (sectors - pi->last_sectors) << 9;
+ do_div(tmp, jiffies_to_msecs((io_ticks - pi->last_io_ticks)));
+ pi->perf = tmp;
+
+ pi->last_sectors = sectors;
+ pi->last_io_ticks = io_ticks;
+ }
+}
+
+static int st_compare_load(struct path_info *pi1, struct path_info *pi2,
+ size_t new_io)
+{
+ size_t if1, if2;
+
+ if1 = atomic_read(&pi1->in_flight);
+ if2 = atomic_read(&pi2->in_flight);
+
+ /*
+ * Case 1: No performance data available. Choose less loaded path.
+ */
+ if (!pi1->perf || !pi2->perf)
+ return if1 - if2;
+
+ /*
+ * Case 2: Calculate service time. Choose faster path.
+ * if ((if1+new_io)/pi1->perf < (if2+new_io)/pi2->perf) pi1.
+ * if ((if1+new_io)/pi1->perf > (if2+new_io)/pi2->perf) pi2.
+ * To avoid do_div(), use
+ * if ((if1+new_io)*pi2->perf < (if2+new_io)*pi1->perf) pi1.
+ * if ((if1+new_io)*pi2->perf > (if2+new_io)*pi1->perf) pi2.
+ */
+ if1 = (if1 + new_io) << 10;
+ if2 = (if2 + new_io) << 10;
+ do_div(if1, pi1->perf);
+ do_div(if2, pi2->perf);
+
+ if (if1 != if2)
+ return if1 - if2;
+
+ /*
+ * Case 3: Service time is equal. Choose faster path.
+ */
+ return pi2->perf - pi1->perf;
+}
+
+static struct dm_path *st_select_path(struct path_selector *ps,
+ unsigned *repeat_count, size_t nr_bytes)
+{
+ struct selector *s = (struct selector *) ps->context;
+ struct path_info *pi = NULL, *best = NULL;
+
+ if (list_empty(&s->valid_paths))
+ return NULL;
+
+ /* Change preferred (first in list) path to evenly balance. */
+ list_move_tail(s->valid_paths.next, &s->valid_paths);
+
+ /* Update performance information before best path selection */
+ list_for_each_entry(pi, &s->valid_paths, list)
+ stats_update(pi);
+
+ list_for_each_entry(pi, &s->valid_paths, list) {
+ if (!best)
+ best = pi;
+ else if (st_compare_load(pi, best, nr_bytes) < 0)
+ best = pi;
+ }
+
+ if (best) {
+ *repeat_count = best->repeat_count;
+ return best->path;
+ }
+
+ return NULL;
+}
+
+static int st_start_io(struct path_selector *ps, struct dm_path *path,
+ size_t nr_bytes)
+{
+ struct path_info *pi = path->pscontext;
+
+ atomic_add(nr_bytes, &pi->in_flight);
+
+ return 0;
+}
+
+static int st_end_io(struct path_selector *ps, struct dm_path *path,
+ size_t nr_bytes)
+{
+ struct path_info *pi = path->pscontext;
+
+ atomic_sub(nr_bytes, &pi->in_flight);
+
+ return 0;
+}
+
+static struct path_selector_type st_ps = {
+ .name = "service-time",
+ .module = THIS_MODULE,
+ .table_args = 1,
+ .info_args = 2,
+ .create = st_create,
+ .destroy = st_destroy,
+ .status = st_status,
+ .add_path = st_add_path,
+ .fail_path = st_fail_path,
+ .reinstate_path = st_reinstate_path,
+ .select_path = st_select_path,
+ .start_io = st_start_io,
+ .end_io = st_end_io,
+};
+
+static int __init dm_st_init(void)
+{
+ int r = dm_register_path_selector(&st_ps);
+
+ if (r < 0)
+ DMERR("register failed %d", r);
+ else
+ DMINFO("version " ST_VERSION " loaded");
+
+ return r;
+}
+
+static void __exit dm_st_exit(void)
+{
+ int r = dm_unregister_path_selector(&st_ps);
+
+ if (r < 0)
+ DMERR("unregister failed %d", r);
+}
+
+module_init(dm_st_init);
+module_exit(dm_st_exit);
+
+MODULE_DESCRIPTION(DM_NAME " throughput oriented path selector");
+MODULE_AUTHOR("Kiyoshi Ueda <k-ueda@ct.jp.nec.com>");
+MODULE_LICENSE("GPL");
Index: 2.6.27-rc6/drivers/md/Makefile
===================================================================
--- 2.6.27-rc6.orig/drivers/md/Makefile
+++ 2.6.27-rc6/drivers/md/Makefile
@@ -33,7 +33,7 @@ obj-$(CONFIG_BLK_DEV_DM) += dm-mod.o
obj-$(CONFIG_DM_CRYPT) += dm-crypt.o
obj-$(CONFIG_DM_DELAY) += dm-delay.o
obj-$(CONFIG_DM_MULTIPATH) += dm-multipath.o dm-round-robin.o \
- dm-queue-length.o
+ dm-queue-length.o dm-service-time.o
obj-$(CONFIG_DM_SNAPSHOT) += dm-snapshot.o
obj-$(CONFIG_DM_MIRROR) += dm-mirror.o dm-log.o
obj-$(CONFIG_DM_ZERO) += dm-zero.o
* Re: [RFC PATCH 0/4] dm-mpath: dynamic load balancers
2008-09-12 14:50 [RFC PATCH 0/4] dm-mpath: dynamic load balancers Kiyoshi Ueda
` (3 preceding siblings ...)
2008-09-12 14:53 ` [RFC PATCH 4/4] dm-mpath: add " Kiyoshi Ueda
@ 2009-01-28 15:37 ` Alasdair G Kergon
2009-01-29 7:16 ` Kiyoshi Ueda
4 siblings, 1 reply; 9+ messages in thread
From: Alasdair G Kergon @ 2009-01-28 15:37 UTC (permalink / raw)
To: Kiyoshi Ueda; +Cc: j-nomura, dm-devel, stefan.bader, linux-scsi
On Fri, Sep 12, 2008 at 10:50:38AM -0400, Kiyoshi Ueda wrote:
> The following patches add the following 2 dynamic load balancers
> to request-based dm-multipath:
> o queue-length oriented dynamic load balancer, dm-queue-length.
> o service-time oriented dynamic load balancer, dm-service-time.
Could we have separate Kconfig options to select them?
Alasdair
--
agk@redhat.com
* Re: [RFC PATCH 0/4] dm-mpath: dynamic load balancers
2009-01-28 15:37 ` [RFC PATCH 0/4] dm-mpath: dynamic load balancers Alasdair G Kergon
@ 2009-01-29 7:16 ` Kiyoshi Ueda
2009-01-29 18:12 ` [dm-devel] " Mike Christie
0 siblings, 1 reply; 9+ messages in thread
From: Kiyoshi Ueda @ 2009-01-29 7:16 UTC (permalink / raw)
To: Alasdair Kergon; +Cc: dm-devel, stefan.bader, linux-scsi
Hi Alasdair,
On 01/29/2009 12:37 AM +0900, Alasdair G Kergon wrote:
> On Fri, Sep 12, 2008 at 10:50:38AM -0400, Kiyoshi Ueda wrote:
>> The following patches add the following 2 dynamic load balancers
>> to request-based dm-multipath:
>> o queue-length oriented dynamic load balancer, dm-queue-length.
>> o service-time oriented dynamic load balancer, dm-service-time.
>
> Could we have separate Kconfig options to select them?
Sure.
I posted a new version rebased on top of 2.6.29-rc2:
https://www.redhat.com/archives/dm-devel/2009-January/msg00183.html
https://www.redhat.com/archives/dm-devel/2009-January/msg00186.html
Thanks,
Kiyoshi Ueda
* Re: [dm-devel] [RFC PATCH 0/4] dm-mpath: dynamic load balancers
2009-01-29 7:16 ` Kiyoshi Ueda
@ 2009-01-29 18:12 ` Mike Christie
2009-01-30 8:03 ` Kiyoshi Ueda
0 siblings, 1 reply; 9+ messages in thread
From: Mike Christie @ 2009-01-29 18:12 UTC (permalink / raw)
To: device-mapper development; +Cc: Alasdair Kergon, stefan.bader, linux-scsi
Kiyoshi Ueda wrote:
> Hi Alasdair,
>
> On 01/29/2009 12:37 AM +0900, Alasdair G Kergon wrote:
>> On Fri, Sep 12, 2008 at 10:50:38AM -0400, Kiyoshi Ueda wrote:
>>> The following patches add the following 2 dynamic load balancers
>>> to request-based dm-multipath:
>>> o queue-length oriented dynamic load balancer, dm-queue-length.
>>> o service-time oriented dynamic load balancer, dm-service-time.
>>
>> Could we have separate Kconfig options to select them?
>
> Sure.
> I posted the new one rebased on top of 2.6.29-rc2:
> https://www.redhat.com/archives/dm-devel/2009-January/msg00183.html
> https://www.redhat.com/archives/dm-devel/2009-January/msg00186.html
>
For the Kconfig help, is there something we could add to help people
know when to choose one or the other? Sort of like how the block layer
I/O scheduler help (block/Kconfig.iosched) says one might be helpful for
desktops and one is useful for databases?
* Re: [RFC PATCH 0/4] dm-mpath: dynamic load balancers
2009-01-29 18:12 ` [dm-devel] " Mike Christie
@ 2009-01-30 8:03 ` Kiyoshi Ueda
0 siblings, 0 replies; 9+ messages in thread
From: Kiyoshi Ueda @ 2009-01-30 8:03 UTC (permalink / raw)
To: device-mapper development; +Cc: Alasdair Kergon, stefan.bader, linux-scsi
Hi Mike,
Thank you for the comment.
On 01/30/2009 03:12 AM +0900, Mike Christie wrote:
> Kiyoshi Ueda wrote:
>> Hi Alasdair,
>>
>> On 01/29/2009 12:37 AM +0900, Alasdair G Kergon wrote:
>>> On Fri, Sep 12, 2008 at 10:50:38AM -0400, Kiyoshi Ueda wrote:
>>>> The following patches add the following 2 dynamic load balancers
>>>> to request-based dm-multipath:
>>>> o queue-length oriented dynamic load balancer, dm-queue-length.
>>>> o service-time oriented dynamic load balancer, dm-service-time.
>>>
>>> Could we have separate Kconfig options to select them?
>>
>> Sure.
>> I posted the new one rebased on top of 2.6.29-rc2:
>> https://www.redhat.com/archives/dm-devel/2009-January/msg00183.html
>> https://www.redhat.com/archives/dm-devel/2009-January/msg00186.html
>>
>
> For the Kconfig help, is there something we could add to help people
> know when to choose one or the other? Sort of like how the block layer
> io scheduler (block/Kconfig.iosched) says one might be helpful for
> desktops and one is useful for databases?
That's a good idea, but I don't have much information about that now.
To write such a guideline, we probably need a lot of feedback from users
(e.g. the right choice may depend on the hardware, not only the workload).
So I think that is a TODO for the future.
Thanks,
Kiyoshi Ueda