* [RFC PATCH 0/0] dm-mpath: service-time oriented dynamic load balancer
@ 2009-01-29  7:07 Kiyoshi Ueda
  2009-01-29  7:10 ` [RFC PATCH 1/2] dm-mpath: interface change for preparation Kiyoshi Ueda
  2009-01-29  7:12 ` [RFC PATCH 2/2] dm-mpath: add service-time oriented dynamic load balancer Kiyoshi Ueda
  0 siblings, 2 replies; 5+ messages in thread
From: Kiyoshi Ueda @ 2009-01-29  7:07 UTC
  To: device-mapper development

Hi,

The following patches add a service-time oriented dynamic load
balancer, dm-service-time, which selects the path expected to
complete the incoming I/O in the shortest time.

While the dm-queue-length path selector posted in the other thread
is ready for inclusion, I would like to hear comments on
dm-service-time from other dm-multipath users and developers before
pushing it to mainline.


dm-service-time stores the recent throughput information for each path.
Service time for an incoming I/O is estimated by:
  ("the total size of in-flight I/Os on the path" +
   "the size of the incoming I/O") / "the recent throughput"
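
As a small illustration of the estimate above (a userspace sketch, not
code from the patch), the calculation could look as follows. The name
st_estimate_ms and the units (bytes, and bytes per millisecond) are
assumptions made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: estimated service
 * time of an incoming I/O on one path.
 * Sizes are in bytes and throughput is in bytes per millisecond, so
 * the result is in milliseconds (rounded down).
 */
static uint64_t st_estimate_ms(uint64_t in_flight_bytes,
			       uint64_t incoming_bytes,
			       uint64_t throughput_bytes_per_ms)
{
	/* Without throughput data no estimate is possible. */
	assert(throughput_bytes_per_ms > 0);

	return (in_flight_bytes + incoming_bytes) / throughput_bytes_per_ms;
}
```

For example, a path with 1 MiB in flight, a 512 KiB incoming I/O and a
recent throughput of 1024 bytes/ms gives an estimate of 1536 ms.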

There could be simpler path selection methods:
  - Size-based path selection.
    i.e. just use the total size of in-flight I/Os on the path.
    It is simpler and works fine if every path has the same bandwidth,
    is evenly loaded and has no disturbance from external systems.
  - Use a user-specified performance value for each path.
    i.e. instead of calculating the throughput in the kernel, the value
    is given from userspace.
    It works fine if the bandwidths of the paths are asymmetric but
    there is no disturbance from external systems.

dm-service-time is more adaptive than the above two methods.
However, if the simpler methods are preferred in some cases, I can
split dm-service-time into pieces to allow the simpler
approaches.
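
To make the two simpler methods concrete, here is a rough userspace
sketch (not part of the patches). The struct sketch_path, its fields
and both function names are hypothetical; the second function assumes
the userspace-given value is a relative throughput, which is only one
possible interpretation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-path state, for illustration only. */
struct sketch_path {
	uint64_t in_flight_bytes;	/* total size of in-flight I/Os */
	uint64_t user_perf;		/* relative performance from userspace */
};

/* Size-based selection: return 0 if path a is preferred, 1 for path b. */
static int pick_by_size(const struct sketch_path *a,
			const struct sketch_path *b)
{
	return (a->in_flight_bytes <= b->in_flight_bytes) ? 0 : 1;
}

/*
 * Selection with user-specified performance values: compare
 * (in_flight + new_io) / user_perf for both paths, cross-multiplied
 * so that no division is needed.
 */
static int pick_by_user_perf(const struct sketch_path *a,
			     const struct sketch_path *b, uint64_t new_io)
{
	uint64_t load_a, load_b;

	assert(a->user_perf > 0 && b->user_perf > 0);

	load_a = (a->in_flight_bytes + new_io) * b->user_perf;
	load_b = (b->in_flight_bytes + new_io) * a->user_perf;

	return (load_a <= load_b) ? 0 : 1;
}
```

With path a at 8 KiB in flight and four times the performance of path b
(4 KiB in flight), the size-based method picks b while the
performance-weighted method picks a for a 4 KiB I/O.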


This patch-set can be applied on top of 2.6.29-rc2 + dm-queue-length
patches(*).

(*) https://www.redhat.com/archives/dm-devel/2009-January/msg00183.html

Summary of the patch-set:
  1/2: dm-mpath: interface change for preparation
  2/2: dm-mpath: add service-time oriented dynamic load balancer

Thanks,
Kiyoshi Ueda

* [RFC PATCH 1/2] dm-mpath: interface change for preparation
  2009-01-29  7:07 [RFC PATCH 0/0] dm-mpath: service-time oriented dynamic load balancer Kiyoshi Ueda
@ 2009-01-29  7:10 ` Kiyoshi Ueda
  2009-01-30  4:42   ` Chauhan, Vijay
  2009-01-29  7:12 ` [RFC PATCH 2/2] dm-mpath: add service-time oriented dynamic load balancer Kiyoshi Ueda
  1 sibling, 1 reply; 5+ messages in thread
From: Kiyoshi Ueda @ 2009-01-29  7:10 UTC
  To: device-mapper development

This patch changes the path selector interfaces in preparation for
the service-time oriented dynamic load balancer.

To calculate the service time for an incoming I/O correctly,
the load balancer needs the size of the incoming I/O when selecting
the next path.
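
The shape of the change can be sketched in userspace as below. This is
only an illustration with simplified, hypothetical stand-ins
(sketch_path, sketch_selector_ops); the real callbacks take
struct path_selector and struct dm_path as shown in the diff. The
point is that the same nr_bytes is passed at selection, start and end,
so a selector can account in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a path and its selector data. */
struct sketch_path {
	size_t in_flight;	/* bytes currently in flight on this path */
};

struct sketch_selector_ops {
	/* The I/O size is passed at selection time ... */
	struct sketch_path *(*select_path)(struct sketch_path *paths,
					   size_t nr_paths, size_t nr_bytes);
	/* ... and at start/end so the selector can account in bytes. */
	void (*start_io)(struct sketch_path *path, size_t nr_bytes);
	void (*end_io)(struct sketch_path *path, size_t nr_bytes);
};

/* Pick the path with the fewest bytes in flight. */
static struct sketch_path *sketch_select(struct sketch_path *paths,
					 size_t nr_paths, size_t nr_bytes)
{
	struct sketch_path *best = NULL;

	(void)nr_bytes;		/* unused by this simple policy */
	for (size_t i = 0; i < nr_paths; i++)
		if (!best || paths[i].in_flight < best->in_flight)
			best = &paths[i];
	return best;
}

static void sketch_start_io(struct sketch_path *path, size_t nr_bytes)
{
	path->in_flight += nr_bytes;
}

static void sketch_end_io(struct sketch_path *path, size_t nr_bytes)
{
	/* The caller must pass the size it passed to start_io. */
	assert(path->in_flight >= nr_bytes);
	path->in_flight -= nr_bytes;
}

static const struct sketch_selector_ops sketch_ops = {
	.select_path	= sketch_select,
	.start_io	= sketch_start_io,
	.end_io		= sketch_end_io,
};
```

This is also why the patch stores nr_bytes in struct dm_mpath_io: the
completion side needs the size that was accounted at submission.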


Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
---
 drivers/md/dm-mpath.c         |   27 ++++++++++++++++-----------
 drivers/md/dm-path-selector.h |    9 ++++++---
 drivers/md/dm-queue-length.c  |    8 +++++---
 drivers/md/dm-round-robin.c   |    2 +-
 4 files changed, 28 insertions(+), 18 deletions(-)

Index: 2.6.29-rc2/drivers/md/dm-mpath.c
===================================================================
--- 2.6.29-rc2.orig/drivers/md/dm-mpath.c
+++ 2.6.29-rc2/drivers/md/dm-mpath.c
@@ -103,6 +103,7 @@ struct multipath {
 struct dm_mpath_io {
 	struct pgpath *pgpath;
 	struct dm_bio_details details;
+	size_t nr_bytes;
 };
 
 typedef int (*action_fn) (struct pgpath *pgpath);
@@ -251,11 +252,12 @@ static void __switch_pg(struct multipath
 	m->pg_init_count = 0;
 }
 
-static int __choose_path_in_pg(struct multipath *m, struct priority_group *pg)
+static int __choose_path_in_pg(struct multipath *m, struct priority_group *pg,
+			       size_t nr_bytes)
 {
 	struct dm_path *path;
 
-	path = pg->ps.type->select_path(&pg->ps, &m->repeat_count);
+	path = pg->ps.type->select_path(&pg->ps, &m->repeat_count, nr_bytes);
 	if (!path)
 		return -ENXIO;
 
@@ -267,7 +269,7 @@ static int __choose_path_in_pg(struct mu
 	return 0;
 }
 
-static void __choose_pgpath(struct multipath *m)
+static void __choose_pgpath(struct multipath *m, size_t nr_bytes)
 {
 	struct priority_group *pg;
 	unsigned bypassed = 1;
@@ -279,12 +281,12 @@ static void __choose_pgpath(struct multi
 	if (m->next_pg) {
 		pg = m->next_pg;
 		m->next_pg = NULL;
-		if (!__choose_path_in_pg(m, pg))
+		if (!__choose_path_in_pg(m, pg, nr_bytes))
 			return;
 	}
 
 	/* Don't change PG until it has no remaining paths */
-	if (m->current_pg && !__choose_path_in_pg(m, m->current_pg))
+	if (m->current_pg && !__choose_path_in_pg(m, m->current_pg, nr_bytes))
 		return;
 
 	/*
@@ -296,7 +298,7 @@ static void __choose_pgpath(struct multi
 		list_for_each_entry(pg, &m->priority_groups, list) {
 			if (pg->bypassed == bypassed)
 				continue;
-			if (!__choose_path_in_pg(m, pg))
+			if (!__choose_path_in_pg(m, pg, nr_bytes))
 				return;
 		}
 	} while (bypassed--);
@@ -327,6 +329,7 @@ static int map_io(struct multipath *m, s
 		  struct dm_mpath_io *mpio, unsigned was_queued)
 {
 	int r = DM_MAPIO_REMAPPED;
+	size_t nr_bytes = bio->bi_size;
 	unsigned long flags;
 	struct pgpath *pgpath;
 
@@ -335,7 +338,7 @@ static int map_io(struct multipath *m, s
 	/* Do we need to select a new pgpath? */
 	if (!m->current_pgpath ||
 	    (!m->queue_io && (m->repeat_count && --m->repeat_count == 0)))
-		__choose_pgpath(m);
+		__choose_pgpath(m, nr_bytes);
 
 	pgpath = m->current_pgpath;
 
@@ -360,9 +363,11 @@ static int map_io(struct multipath *m, s
 		r = -EIO;	/* Failed */
 
 	mpio->pgpath = pgpath;
+	mpio->nr_bytes = nr_bytes;
 
 	if (r == DM_MAPIO_REMAPPED && pgpath->pg->ps.type->start_io)
-		pgpath->pg->ps.type->start_io(&pgpath->pg->ps, &pgpath->path);
+		pgpath->pg->ps.type->start_io(&pgpath->pg->ps, &pgpath->path,
+					      nr_bytes);
 
 	spin_unlock_irqrestore(&m->lock, flags);
 
@@ -441,7 +446,7 @@ static void process_queued_ios(struct wo
 		goto out;
 
 	if (!m->current_pgpath)
-		__choose_pgpath(m);
+		__choose_pgpath(m, 1 << 19); /* Assume 512 KB */
 
 	pgpath = m->current_pgpath;
 
@@ -1199,7 +1204,7 @@ static int multipath_end_io(struct dm_ta
 	if (pgpath) {
 		ps = &pgpath->pg->ps;
 		if (ps->type->end_io)
-			ps->type->end_io(ps, &pgpath->path);
+			ps->type->end_io(ps, &pgpath->path, mpio->nr_bytes);
 	}
 	if (r != DM_ENDIO_INCOMPLETE)
 		mempool_free(mpio, m->mpio_pool);
@@ -1415,7 +1420,7 @@ static int multipath_ioctl(struct dm_tar
 	spin_lock_irqsave(&m->lock, flags);
 
 	if (!m->current_pgpath)
-		__choose_pgpath(m);
+		__choose_pgpath(m, 1 << 19); /* Assume 512KB */
 
 	if (m->current_pgpath) {
 		bdev = m->current_pgpath->path.dev->bdev;
Index: 2.6.29-rc2/drivers/md/dm-path-selector.h
===================================================================
--- 2.6.29-rc2.orig/drivers/md/dm-path-selector.h
+++ 2.6.29-rc2/drivers/md/dm-path-selector.h
@@ -56,7 +56,8 @@ struct path_selector_type {
 	 * the path fails.
 	 */
 	struct dm_path *(*select_path) (struct path_selector *ps,
-				     unsigned *repeat_count);
+					unsigned *repeat_count,
+					size_t nr_bytes);
 
 	/*
 	 * Notify the selector that a path has failed.
@@ -75,8 +76,10 @@ struct path_selector_type {
 	int (*status) (struct path_selector *ps, struct dm_path *path,
 		       status_type_t type, char *result, unsigned int maxlen);
 
-	int (*start_io) (struct path_selector *ps, struct dm_path *path);
-	int (*end_io) (struct path_selector *ps, struct dm_path *path);
+	int (*start_io) (struct path_selector *ps, struct dm_path *path,
+			 size_t nr_bytes);
+	int (*end_io) (struct path_selector *ps, struct dm_path *path,
+		       size_t nr_bytes);
 };
 
 /* Register a path selector */
Index: 2.6.29-rc2/drivers/md/dm-round-robin.c
===================================================================
--- 2.6.29-rc2.orig/drivers/md/dm-round-robin.c
+++ 2.6.29-rc2/drivers/md/dm-round-robin.c
@@ -161,7 +161,7 @@ static int rr_reinstate_path(struct path
 }
 
 static struct dm_path *rr_select_path(struct path_selector *ps,
-				   unsigned *repeat_count)
+				      unsigned *repeat_count, size_t nr_bytes)
 {
 	struct selector *s = (struct selector *) ps->context;
 	struct path_info *pi = NULL;
Index: 2.6.29-rc2/drivers/md/dm-queue-length.c
===================================================================
--- 2.6.29-rc2.orig/drivers/md/dm-queue-length.c
+++ 2.6.29-rc2/drivers/md/dm-queue-length.c
@@ -138,7 +138,7 @@ static int ql_reinstate_path(struct path
 }
 
 static struct dm_path *ql_select_path(struct path_selector *ps,
-				      unsigned *repeat_count)
+				      unsigned *repeat_count, size_t nr_bytes)
 {
 	struct selector *s = (struct selector *) ps->context;
 	struct path_info *pi = NULL, *best = NULL;
@@ -166,7 +166,8 @@ static struct dm_path *ql_select_path(st
 	return best->path;
 }
 
-static int ql_start_io(struct path_selector *ps, struct dm_path *path)
+static int ql_start_io(struct path_selector *ps, struct dm_path *path,
+		       size_t nr_bytes)
 {
 	struct path_info *pi = path->pscontext;
 
@@ -175,7 +176,8 @@ static int ql_start_io(struct path_selec
 	return 0;
 }
 
-static int ql_end_io(struct path_selector *ps, struct dm_path *path)
+static int ql_end_io(struct path_selector *ps, struct dm_path *path,
+		     size_t nr_bytes)
 {
 	struct path_info *pi = path->pscontext;
 

* [RFC PATCH 2/2] dm-mpath: add service-time oriented dynamic load balancer
  2009-01-29  7:07 [RFC PATCH 0/0] dm-mpath: service-time oriented dynamic load balancer Kiyoshi Ueda
  2009-01-29  7:10 ` [RFC PATCH 1/2] dm-mpath: interface change for preparation Kiyoshi Ueda
@ 2009-01-29  7:12 ` Kiyoshi Ueda
  1 sibling, 0 replies; 5+ messages in thread
From: Kiyoshi Ueda @ 2009-01-29  7:12 UTC
  To: device-mapper development

This patch adds a service-time oriented dynamic load balancer,
dm-service-time, which selects the path expected to complete the
incoming I/O in the shortest time.
To calculate the service time, it uses the size of the I/Os and
the recent throughput of each path.
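
As a rough userspace illustration of how the recent throughput can be
derived from two samples of the disk counters (a sketch only, with
hypothetical names; the patch itself reads the counters with
part_stat_read() and converts the io_ticks delta with
jiffies_to_msecs(), while this sketch takes milliseconds directly):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-path statistics, for illustration only. */
struct sketch_stats {
	uint64_t perf;		/* recent throughput, bytes per millisecond */
	uint64_t last_sectors;	/* sector counter at the previous sample */
	uint64_t last_busy_ms;	/* busy-time counter at the previous sample */
};

/* Update the throughput estimate from the current counter values. */
static void sketch_stats_update(struct sketch_stats *st,
				uint64_t sectors, uint64_t busy_ms)
{
	/* Both counters are assumed to be monotonic. */
	assert(sectors >= st->last_sectors && busy_ms >= st->last_busy_ms);

	/* Keep the old estimate if either counter has not advanced. */
	if (sectors == st->last_sectors || busy_ms == st->last_busy_ms)
		return;

	/* Sectors are 512 bytes, hence the shift by 9. */
	st->perf = ((sectors - st->last_sectors) << 9) /
		   (busy_ms - st->last_busy_ms);
	st->last_sectors = sectors;
	st->last_busy_ms = busy_ms;
}
```

For example, 2048 sectors transferred during 8 ms of busy time give
131072 bytes/ms.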


Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
---
 drivers/md/Kconfig           |    9 +
 drivers/md/Makefile          |    1 
 drivers/md/dm-service-time.c |  312 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 322 insertions(+)

Index: 2.6.29-rc2/drivers/md/dm-service-time.c
===================================================================
--- /dev/null
+++ 2.6.29-rc2/drivers/md/dm-service-time.c
@@ -0,0 +1,312 @@
+/*
+ * Copyright (C) 2007-2009 NEC Corporation.  All Rights Reserved.
+ *
+ * Module Author: Kiyoshi Ueda
+ *
+ * This file is released under the GPL.
+ *
+ * Throughput oriented path selector.
+ */
+
+#include "dm.h"
+#include "dm-path-selector.h"
+
+#define DM_MSG_PREFIX	"multipath service-time"
+#define ST_MIN_IO	2
+#define ST_VERSION	"0.1.0"
+
+struct selector {
+	struct list_head valid_paths;
+	struct list_head failed_paths;
+};
+
+struct path_info {
+	struct list_head list;
+	struct dm_path *path;
+	unsigned int repeat_count;
+
+	atomic_t in_flight;	/* Total size of in-flight I/Os */
+	size_t perf;		/* Recent performance of the path */
+	sector_t last_sectors;	/* Total sectors of the last part_stat_read */
+	size_t last_io_ticks;	/* io_ticks of the last part_stat_read */
+};
+
+static struct selector *alloc_selector(void)
+{
+	struct selector *s = kzalloc(sizeof(*s), GFP_KERNEL);
+
+	if (s) {
+		INIT_LIST_HEAD(&s->valid_paths);
+		INIT_LIST_HEAD(&s->failed_paths);
+	}
+
+	return s;
+}
+
+static int st_create(struct path_selector *ps, unsigned argc, char **argv)
+{
+	struct selector *s = alloc_selector();
+
+	if (!s)
+		return -ENOMEM;
+
+	ps->context = s;
+	return 0;
+}
+
+static void free_paths(struct list_head *paths)
+{
+	struct path_info *pi, *next;
+
+	list_for_each_entry_safe(pi, next, paths, list) {
+		list_del(&pi->list);
+		pi->path->pscontext = NULL;
+		kfree(pi);
+	}
+}
+
+static void st_destroy(struct path_selector *ps)
+{
+	struct selector *s = (struct selector *) ps->context;
+
+	free_paths(&s->valid_paths);
+	free_paths(&s->failed_paths);
+	kfree(s);
+	ps->context = NULL;
+}
+
+static int st_status(struct path_selector *ps, struct dm_path *path,
+		     status_type_t type, char *result, unsigned int maxlen)
+{
+	int sz = 0;
+	struct path_info *pi;
+
+	if (!path)
+		DMEMIT("0 ");
+	else {
+		pi = path->pscontext;
+
+		switch (type) {
+		case STATUSTYPE_INFO:
+			DMEMIT("if:%08lu pf:%06lu ",
+			       (unsigned long) atomic_read(&pi->in_flight),
+			       pi->perf);
+			break;
+		case STATUSTYPE_TABLE:
+			DMEMIT("%u ", pi->repeat_count);
+			break;
+		}
+	}
+
+	return sz;
+}
+
+static int st_add_path(struct path_selector *ps, struct dm_path *path,
+		       int argc, char **argv, char **error)
+{
+	struct selector *s = (struct selector *) ps->context;
+	struct path_info *pi;
+	unsigned int repeat_count = ST_MIN_IO;
+	struct gendisk *disk = path->dev->bdev->bd_disk;
+
+	if (argc > 1) {
+		*error = "service-time ps: incorrect number of arguments";
+		return -EINVAL;
+	}
+
+	/* First path argument is number of I/Os before switching path. */
+	if ((argc == 1) && (sscanf(argv[0], "%u", &repeat_count) != 1)) {
+		*error = "service-time ps: invalid repeat count";
+		return -EINVAL;
+	}
+
+	/* allocate the path */
+	pi = kmalloc(sizeof(*pi), GFP_KERNEL);
+	if (!pi) {
+		*error = "service-time ps: Error allocating path context";
+		return -ENOMEM;
+	}
+
+	pi->path = path;
+	pi->repeat_count = repeat_count;
+
+	pi->perf = 0;
+	pi->last_sectors = part_stat_read(&disk->part0, sectors[READ])
+			   + part_stat_read(&disk->part0, sectors[WRITE]);
+	pi->last_io_ticks = part_stat_read(&disk->part0, io_ticks);
+	atomic_set(&pi->in_flight, 0);
+
+	path->pscontext = pi;
+
+	list_add_tail(&pi->list, &s->valid_paths);
+
+	return 0;
+}
+
+static void st_fail_path(struct path_selector *ps, struct dm_path *path)
+{
+	struct selector *s = (struct selector *) ps->context;
+	struct path_info *pi = path->pscontext;
+
+	list_move(&pi->list, &s->failed_paths);
+}
+
+static int st_reinstate_path(struct path_selector *ps, struct dm_path *path)
+{
+	struct selector *s = (struct selector *) ps->context;
+	struct path_info *pi = path->pscontext;
+
+	list_move_tail(&pi->list, &s->valid_paths);
+
+	return 0;
+}
+
+static void stats_update(struct path_info *pi)
+{
+	sector_t sectors;
+	size_t io_ticks, tmp;
+	struct gendisk *disk = pi->path->dev->bdev->bd_disk;
+
+	sectors = part_stat_read(&disk->part0, sectors[READ])
+		  + part_stat_read(&disk->part0, sectors[WRITE]);
+	io_ticks = part_stat_read(&disk->part0, io_ticks);
+
+	if ((sectors != pi->last_sectors) && (io_ticks != pi->last_io_ticks)) {
+		tmp = (sectors - pi->last_sectors) << 9;
+		do_div(tmp, jiffies_to_msecs((io_ticks - pi->last_io_ticks)));
+		pi->perf = tmp;
+
+		pi->last_sectors = sectors;
+		pi->last_io_ticks = io_ticks;
+	}
+}
+
+static int st_compare_load(struct path_info *pi1, struct path_info *pi2,
+			   size_t new_io)
+{
+	size_t if1, if2;
+
+	if1 = atomic_read(&pi1->in_flight);
+	if2 = atomic_read(&pi2->in_flight);
+
+	/*
+	 * Case 1: No performance data available. Choose less loaded path.
+	 */
+	if (!pi1->perf || !pi2->perf)
+		return if1 - if2;
+
+	/*
+	 * Case 2: Calculate service time. Choose faster path.
+	 *           if ((if1+new_io)/pi1->perf < (if2+new_io)/pi2->perf) pi1.
+	 *           if ((if1+new_io)/pi1->perf > (if2+new_io)/pi2->perf) pi2.
+	 *         do_div() could be avoided by comparing the cross products:
+	 *           if ((if1+new_io)*pi2->perf < (if2+new_io)*pi1->perf) pi1.
+	 *           if ((if1+new_io)*pi2->perf > (if2+new_io)*pi1->perf) pi2.
+	 */
+	if1 = (if1 + new_io) << 10;
+	if2 = (if2 + new_io) << 10;
+	do_div(if1, pi1->perf);
+	do_div(if2, pi2->perf);
+
+	if (if1 != if2)
+		return if1 - if2;
+
+	/*
+	 * Case 3: Service time is equal. Choose faster path.
+	 */
+	return pi2->perf - pi1->perf;
+}
+
+static struct dm_path *st_select_path(struct path_selector *ps,
+				      unsigned *repeat_count, size_t nr_bytes)
+{
+	struct selector *s = (struct selector *) ps->context;
+	struct path_info *pi = NULL, *best = NULL;
+
+	if (list_empty(&s->valid_paths))
+		return NULL;
+
+	/* Change preferred (first in list) path to evenly balance. */
+	list_move_tail(s->valid_paths.next, &s->valid_paths);
+
+	/* Update performance information before best path selection */
+	list_for_each_entry(pi, &s->valid_paths, list)
+		stats_update(pi);
+
+	list_for_each_entry(pi, &s->valid_paths, list) {
+		if (!best)
+			best = pi;
+		else if (st_compare_load(pi, best, nr_bytes) < 0)
+			best = pi;
+	}
+
+	if (best) {
+		*repeat_count = best->repeat_count;
+		return best->path;
+	}
+
+	return NULL;
+}
+
+static int st_start_io(struct path_selector *ps, struct dm_path *path,
+		       size_t nr_bytes)
+{
+	struct path_info *pi = path->pscontext;
+
+	atomic_add(nr_bytes, &pi->in_flight);
+
+	return 0;
+}
+
+static int st_end_io(struct path_selector *ps, struct dm_path *path,
+		     size_t nr_bytes)
+{
+	struct path_info *pi = path->pscontext;
+
+	atomic_sub(nr_bytes, &pi->in_flight);
+
+	return 0;
+}
+
+static struct path_selector_type st_ps = {
+	.name		= "service-time",
+	.module		= THIS_MODULE,
+	.table_args	= 1,
+	.info_args	= 2,
+	.create		= st_create,
+	.destroy	= st_destroy,
+	.status		= st_status,
+	.add_path	= st_add_path,
+	.fail_path	= st_fail_path,
+	.reinstate_path	= st_reinstate_path,
+	.select_path	= st_select_path,
+	.start_io	= st_start_io,
+	.end_io		= st_end_io,
+};
+
+static int __init dm_st_init(void)
+{
+	int r = dm_register_path_selector(&st_ps);
+
+	if (r < 0)
+		DMERR("register failed %d", r);
+	else
+		DMINFO("version " ST_VERSION " loaded");
+
+	return r;
+}
+
+static void __exit dm_st_exit(void)
+{
+	int r = dm_unregister_path_selector(&st_ps);
+
+	if (r < 0)
+		DMERR("unregister failed %d", r);
+}
+
+module_init(dm_st_init);
+module_exit(dm_st_exit);
+
+MODULE_DESCRIPTION(DM_NAME " throughput oriented path selector");
+MODULE_AUTHOR("Kiyoshi Ueda <k-ueda@ct.jp.nec.com>");
+MODULE_LICENSE("GPL");
Index: 2.6.29-rc2/drivers/md/Makefile
===================================================================
--- 2.6.29-rc2.orig/drivers/md/Makefile
+++ 2.6.29-rc2/drivers/md/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_DM_CRYPT)		+= dm-crypt.o
 obj-$(CONFIG_DM_DELAY)		+= dm-delay.o
 obj-$(CONFIG_DM_MULTIPATH)	+= dm-multipath.o dm-round-robin.o
 obj-$(CONFIG_DM_MULTIPATH_QL)	+= dm-queue-length.o
+obj-$(CONFIG_DM_MULTIPATH_ST)	+= dm-service-time.o
 obj-$(CONFIG_DM_SNAPSHOT)	+= dm-snapshot.o
 obj-$(CONFIG_DM_MIRROR)		+= dm-mirror.o dm-log.o dm-region-hash.o
 obj-$(CONFIG_DM_ZERO)		+= dm-zero.o
Index: 2.6.29-rc2/drivers/md/Kconfig
===================================================================
--- 2.6.29-rc2.orig/drivers/md/Kconfig
+++ 2.6.29-rc2/drivers/md/Kconfig
@@ -283,6 +283,15 @@ config DM_MULTIPATH_QL
 
 	  If unsure, say N.
 
+config DM_MULTIPATH_ST
+	tristate "I/O Path Selector based on the service time"
+	depends on DM_MULTIPATH
+	---help---
+	  This path selector is a dynamic load balancer which selects
+	  a path to complete the incoming I/O with the shortest time.
+
+	  If unsure, say N.
+
 config DM_DELAY
 	tristate "I/O delaying target (EXPERIMENTAL)"
 	depends on BLK_DEV_DM && EXPERIMENTAL

* RE: [RFC PATCH 1/2] dm-mpath: interface change for preparation
  2009-01-29  7:10 ` [RFC PATCH 1/2] dm-mpath: interface change for preparation Kiyoshi Ueda
@ 2009-01-30  4:42   ` Chauhan, Vijay
  2009-01-30  8:06     ` Kiyoshi Ueda
  0 siblings, 1 reply; 5+ messages in thread
From: Chauhan, Vijay @ 2009-01-30  4:42 UTC
  To: Kiyoshi Ueda, Alasdair G Kergon; +Cc: device-mapper development

Hi Kiyoshi/Alasdair,

Just a thought. As we are changing the __choose_pgpath parameters in this patch, would it be good to use the BIO rather than nr_bytes as the parameter? Having the BIO available in the load-balancing interface would provide more flexibility for coming up with new load balancers in the future.


Thanks,
Vijay

* Re: [RFC PATCH 1/2] dm-mpath: interface change for preparation
  2009-01-30  4:42   ` Chauhan, Vijay
@ 2009-01-30  8:06     ` Kiyoshi Ueda
  0 siblings, 0 replies; 5+ messages in thread
From: Kiyoshi Ueda @ 2009-01-30  8:06 UTC
  To: Chauhan, Vijay; +Cc: device-mapper development, Alasdair G Kergon

Hi Vijay,

Thank you for the comment.

On 01/30/2009 01:42 PM +0900, Chauhan, Vijay wrote:
> Hi Kiyoshi/Alasdair,
> 
> Just a thought. As we are changing the __choose_pgpath parameters
> in this patch, would it be good to use the BIO rather than nr_bytes
> as the parameter? Having the BIO available in the load-balancing
> interface would provide more flexibility for coming up with new
> load balancers in the future.

I understand the point.
But I'd like to keep the path selector independent of the type
of I/O structure as much as possible, since both bio-based targets
and request-based targets may want to use the same path selector
in the future.  (e.g. multipath, which is request-based,
and mirror, which is bio-based.)

So if we need only a few arguments, non-structured parameters
would be good, I think.
Do you have any idea which other parameters in the BIO would be
useful for load-balancing decisions?
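
To illustrate what I mean, here is a userspace sketch (not kernel
code). The types sketch_bio, sketch_request and sketch_path and all
function names are hypothetical stand-ins. The selector core only sees
a byte count, so a bio-based caller and a request-based caller can
share it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a bio and a block-layer request. */
struct sketch_bio { size_t size_bytes; };
struct sketch_request { size_t nr_sectors; };

struct sketch_path {
	size_t in_flight;	/* bytes in flight */
	size_t perf;		/* relative throughput, must be non-zero */
};

/* Selector core: knows nothing about the I/O structure, only its size. */
static size_t sketch_pick(const struct sketch_path *paths, size_t nr_paths,
			  size_t nr_bytes)
{
	size_t best = 0;

	assert(nr_paths > 0);
	for (size_t i = 1; i < nr_paths; i++) {
		/* (if[i] + n) / perf[i] < (if[best] + n) / perf[best] */
		if ((paths[i].in_flight + nr_bytes) * paths[best].perf <
		    (paths[best].in_flight + nr_bytes) * paths[i].perf)
			best = i;
	}
	return best;
}

/* A bio-based caller reduces the bio to its size in bytes. */
static size_t pick_for_bio(const struct sketch_path *paths, size_t nr_paths,
			   const struct sketch_bio *bio)
{
	return sketch_pick(paths, nr_paths, bio->size_bytes);
}

/* A request-based caller converts 512-byte sectors to bytes. */
static size_t pick_for_request(const struct sketch_path *paths,
			       size_t nr_paths,
			       const struct sketch_request *rq)
{
	return sketch_pick(paths, nr_paths, rq->nr_sectors << 9);
}
```

Both wrappers give the same answer for the same amount of data, which
is the property I would like to keep.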

Thanks,
Kiyoshi Ueda
