* [PATCHv4 0/6] dm-multipath: push back requests instead of queueing
@ 2014-02-03  8:18 Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 1/6] dm-multipath: Do not call pg_init twice Hannes Reinecke
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03  8:18 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: Jun'ichi Nomura, dm-devel, Mike Snitzer

Hi all,

dm-multipath still carries around its own queueing framework for
implementing 'queue_if_no_path'.
However, there is no real reason for this; we could as well
push back the requests onto the request_queue.
In doing so we can also reduce the memory pressure during
fail_if_no_path scenarios, as we don't have to allocate a
context for each request when it needs to be requeued.

Changes since v3:
- Include dm_md_get_queue(), as suggested by Mike Snitzer
- Call __pg_init_all_paths() in pg_init_done() to handle
  pg_init_required correctly, as suggested by Jun'ichi Nomura

Hannes Reinecke (5):
  dm-multipath: Do not call pg_init twice
  dm-multipath: push back requests instead of queueing
  dm-multipath: remove process_queued_ios()
  dm-multipath: reduce memory pressure during requeuing
  dm-multipath: remove map_io()

Mike Snitzer (1):
  dm: implement dm_md_get_queue()

 drivers/md/dm-mpath.c         | 204 +++++++++++++-----------------------------
 drivers/md/dm-table.c         |  14 +++
 drivers/md/dm.c               |   5 ++
 drivers/md/dm.h               |   1 +
 include/linux/device-mapper.h |   5 ++
 5 files changed, 89 insertions(+), 140 deletions(-)

-- 
1.7.12.4


* [PATCH 1/6] dm-multipath: Do not call pg_init twice
  2014-02-03  8:18 [PATCHv4 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
@ 2014-02-03  8:18 ` Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 2/6] dm: implement dm_md_get_queue() Hannes Reinecke
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03  8:18 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: Jun'ichi Nomura, dm-devel, Mike Snitzer

When pg_init is running we shouldn't be calling the same
routine twice; we need to wait for the first pg_init to
complete.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 6eb9dc9..d45290a 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -261,6 +261,9 @@ static void __pg_init_all_paths(struct multipath *m)
 	struct pgpath *pgpath;
 	unsigned long pg_init_delay = 0;
 
+	if (m->pg_init_in_progress || m->pg_init_disabled)
+		return;
+
 	m->pg_init_count++;
 	m->pg_init_required = 0;
 	if (m->pg_init_delay_retry)
-- 
1.7.12.4


* [PATCH 2/6] dm: implement dm_md_get_queue()
  2014-02-03  8:18 [PATCHv4 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 1/6] dm-multipath: Do not call pg_init twice Hannes Reinecke
@ 2014-02-03  8:18 ` Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 3/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03  8:18 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: Jun'ichi Nomura, dm-devel, Mike Snitzer

From: Mike Snitzer <snitzer@redhat.com>

Add wrapper to extract the request_queue from a
mapped device.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm.c | 5 +++++
 drivers/md/dm.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 0704c52..933b8cb 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -475,6 +475,11 @@ sector_t dm_get_size(struct mapped_device *md)
 	return get_capacity(md->disk);
 }
 
+struct request_queue *dm_get_md_queue(struct mapped_device *md)
+{
+	return md->queue;
+}
+
 struct dm_stats *dm_get_stats(struct mapped_device *md)
 {
 	return &md->stats;
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index c57ba55..0eef13a 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -172,6 +172,7 @@ int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred, bool only
 int dm_cancel_deferred_remove(struct mapped_device *md);
 int dm_request_based(struct mapped_device *md);
 sector_t dm_get_size(struct mapped_device *md);
+struct request_queue *dm_get_md_queue(struct mapped_device *md);
 struct dm_stats *dm_get_stats(struct mapped_device *md);
 
 int dm_kobject_uevent(struct mapped_device *md, enum kobject_action action,
-- 
1.7.12.4


* [PATCH 3/6] dm-multipath: push back requests instead of queueing
  2014-02-03  8:18 [PATCHv4 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 1/6] dm-multipath: Do not call pg_init twice Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 2/6] dm: implement dm_md_get_queue() Hannes Reinecke
@ 2014-02-03  8:18 ` Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 4/6] dm-multipath: remove process_queued_ios() Hannes Reinecke
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03  8:18 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: Jun'ichi Nomura, dm-devel, Mike Snitzer

There is no reason why multipath needs to queue requests
internally for queue_if_no_path or pg_init; we should
rather push them back onto the request queue.

And while we're at it we can simplify the conditional
statement in map_io() to make it easier to read.

Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c         | 115 ++++++++++++++----------------------------
 drivers/md/dm-table.c         |  14 +++++
 include/linux/device-mapper.h |   5 ++
 3 files changed, 57 insertions(+), 77 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index d45290a..5373ca9 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -93,9 +93,7 @@ struct multipath {
 	unsigned pg_init_count;		/* Number of times pg_init called */
 	unsigned pg_init_delay_msecs;	/* Number of msecs before pg_init retry */
 
-	unsigned queue_size;
 	struct work_struct process_queued_ios;
-	struct list_head queued_ios;
 
 	struct work_struct trigger_event;
 
@@ -124,6 +122,7 @@ static struct workqueue_struct *kmultipathd, *kmpath_handlerd;
 static void process_queued_ios(struct work_struct *work);
 static void trigger_event(struct work_struct *work);
 static void activate_path(struct work_struct *work);
+static int __pgpath_busy(struct pgpath *pgpath);
 
 
 /*-----------------------------------------------
@@ -195,7 +194,6 @@ static struct multipath *alloc_multipath(struct dm_target *ti)
 	m = kzalloc(sizeof(*m), GFP_KERNEL);
 	if (m) {
 		INIT_LIST_HEAD(&m->priority_groups);
-		INIT_LIST_HEAD(&m->queued_ios);
 		spin_lock_init(&m->lock);
 		m->queue_io = 1;
 		m->pg_init_delay_msecs = DM_PG_INIT_DELAY_DEFAULT;
@@ -368,12 +366,15 @@ failed:
  */
 static int __must_push_back(struct multipath *m)
 {
-	return (m->queue_if_no_path != m->saved_queue_if_no_path &&
-		dm_noflush_suspending(m->ti));
+	return (m->queue_if_no_path ||
+		(m->queue_if_no_path != m->saved_queue_if_no_path &&
+		 dm_noflush_suspending(m->ti)));
 }
 
+#define pg_ready(m) (!(m)->queue_io && !(m)->pg_init_required)
+
 static int map_io(struct multipath *m, struct request *clone,
-		  union map_info *map_context, unsigned was_queued)
+		  union map_info *map_context)
 {
 	int r = DM_MAPIO_REMAPPED;
 	size_t nr_bytes = blk_rq_bytes(clone);
@@ -391,37 +392,30 @@ static int map_io(struct multipath *m, struct request *clone,
 
 	pgpath = m->current_pgpath;
 
-	if (was_queued)
-		m->queue_size--;
-
-	if (m->pg_init_required) {
-		if (!m->pg_init_in_progress)
-			queue_work(kmultipathd, &m->process_queued_ios);
-		r = DM_MAPIO_REQUEUE;
-	} else if ((pgpath && m->queue_io) ||
-		   (!pgpath && m->queue_if_no_path)) {
-		/* Queue for the daemon to resubmit */
-		list_add_tail(&clone->queuelist, &m->queued_ios);
-		m->queue_size++;
-		if (!m->queue_io)
-			queue_work(kmultipathd, &m->process_queued_ios);
-		pgpath = NULL;
-		r = DM_MAPIO_SUBMITTED;
-	} else if (pgpath) {
-		bdev = pgpath->path.dev->bdev;
-		clone->q = bdev_get_queue(bdev);
-		clone->rq_disk = bdev->bd_disk;
-	} else if (__must_push_back(m))
-		r = DM_MAPIO_REQUEUE;
-	else
-		r = -EIO;	/* Failed */
-
-	mpio->pgpath = pgpath;
-	mpio->nr_bytes = nr_bytes;
-
-	if (r == DM_MAPIO_REMAPPED && pgpath->pg->ps.type->start_io)
-		pgpath->pg->ps.type->start_io(&pgpath->pg->ps, &pgpath->path,
-					      nr_bytes);
+	if (pgpath) {
+		if (__pgpath_busy(pgpath))
+			r = DM_MAPIO_REQUEUE;
+		else if (pg_ready(m)) {
+			bdev = pgpath->path.dev->bdev;
+			clone->q = bdev_get_queue(bdev);
+			clone->rq_disk = bdev->bd_disk;
+			mpio->pgpath = pgpath;
+			mpio->nr_bytes = nr_bytes;
+			if (pgpath->pg->ps.type->start_io)
+				pgpath->pg->ps.type->start_io(&pgpath->pg->ps,
+							      &pgpath->path,
+							      nr_bytes);
+		} else {
+			__pg_init_all_paths(m);
+			r = DM_MAPIO_REQUEUE;
+		}
+	} else {
+		/* No path */
+		if (__must_push_back(m))
+			r = DM_MAPIO_REQUEUE;
+		else
+			r = -EIO;	/* Failed */
+	}
 
 	spin_unlock_irqrestore(&m->lock, flags);
 
@@ -443,7 +437,7 @@ static int queue_if_no_path(struct multipath *m, unsigned queue_if_no_path,
 	else
 		m->saved_queue_if_no_path = queue_if_no_path;
 	m->queue_if_no_path = queue_if_no_path;
-	if (!m->queue_if_no_path && m->queue_size)
+	if (!m->queue_if_no_path)
 		queue_work(kmultipathd, &m->process_queued_ios);
 
 	spin_unlock_irqrestore(&m->lock, flags);
@@ -451,40 +445,6 @@ static int queue_if_no_path(struct multipath *m, unsigned queue_if_no_path,
 	return 0;
 }
 
-/*-----------------------------------------------------------------
- * The multipath daemon is responsible for resubmitting queued ios.
- *---------------------------------------------------------------*/
-
-static void dispatch_queued_ios(struct multipath *m)
-{
-	int r;
-	unsigned long flags;
-	union map_info *info;
-	struct request *clone, *n;
-	LIST_HEAD(cl);
-
-	spin_lock_irqsave(&m->lock, flags);
-	list_splice_init(&m->queued_ios, &cl);
-	spin_unlock_irqrestore(&m->lock, flags);
-
-	list_for_each_entry_safe(clone, n, &cl, queuelist) {
-		list_del_init(&clone->queuelist);
-
-		info = dm_get_rq_mapinfo(clone);
-
-		r = map_io(m, clone, info, 1);
-		if (r < 0) {
-			clear_mapinfo(m, info);
-			dm_kill_unmapped_request(clone, r);
-		} else if (r == DM_MAPIO_REMAPPED)
-			dm_dispatch_request(clone);
-		else if (r == DM_MAPIO_REQUEUE) {
-			clear_mapinfo(m, info);
-			dm_requeue_unmapped_request(clone);
-		}
-	}
-}
-
 static void process_queued_ios(struct work_struct *work)
 {
 	struct multipath *m =
@@ -510,7 +470,7 @@ static void process_queued_ios(struct work_struct *work)
 
 	spin_unlock_irqrestore(&m->lock, flags);
 	if (!must_queue)
-		dispatch_queued_ios(m);
+		dm_table_run_md_queue_async(m->ti->table);
 }
 
 /*
@@ -988,7 +948,7 @@ static int multipath_map(struct dm_target *ti, struct request *clone,
 		return DM_MAPIO_REQUEUE;
 
 	clone->cmd_flags |= REQ_FAILFAST_TRANSPORT;
-	r = map_io(m, clone, map_context, 0);
+	r = map_io(m, clone, map_context);
 	if (r < 0 || r == DM_MAPIO_REQUEUE)
 		clear_mapinfo(m, map_context);
 
@@ -1057,7 +1017,7 @@ static int reinstate_path(struct pgpath *pgpath)
 
 	pgpath->is_active = 1;
 
-	if (!m->nr_valid_paths++ && m->queue_size) {
+	if (!m->nr_valid_paths++) {
 		m->current_pgpath = NULL;
 		queue_work(kmultipathd, &m->process_queued_ios);
 	} else if (m->hw_handler_name && (m->current_pg == pgpath->pg)) {
@@ -1436,7 +1396,8 @@ static void multipath_status(struct dm_target *ti, status_type_t type,
 
 	/* Features */
 	if (type == STATUSTYPE_INFO)
-		DMEMIT("2 %u %u ", m->queue_size, m->pg_init_count);
+		DMEMIT("2 %u %u ", (m->queue_io << 1) + m->queue_if_no_path,
+		       m->pg_init_count);
 	else {
 		DMEMIT("%u ", m->queue_if_no_path +
 			      (m->pg_init_retries > 0) * 2 +
@@ -1684,7 +1645,7 @@ static int multipath_busy(struct dm_target *ti)
 	spin_lock_irqsave(&m->lock, flags);
 
 	/* pg_init in progress, requeue until done */
-	if (m->pg_init_in_progress) {
+	if (!pg_ready(m)) {
 		busy = 1;
 		goto out;
 	}
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 3ba6a38..c060f7f 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1636,6 +1636,20 @@ struct mapped_device *dm_table_get_md(struct dm_table *t)
 }
 EXPORT_SYMBOL(dm_table_get_md);
 
+void dm_table_run_md_queue_async(struct dm_table *t)
+{
+	struct mapped_device *md = dm_table_get_md(t);
+	struct request_queue *queue = dm_get_md_queue(md);
+	unsigned long flags;
+
+	if (queue) {
+		spin_lock_irqsave(queue->queue_lock, flags);
+		blk_run_queue_async(queue);
+		spin_unlock_irqrestore(queue->queue_lock, flags);
+	}
+}
+EXPORT_SYMBOL(dm_table_run_md_queue_async);
+
 static int device_discard_capable(struct dm_target *ti, struct dm_dev *dev,
 				  sector_t start, sector_t len, void *data)
 {
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index ed419c6..139e647 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -466,6 +466,11 @@ struct mapped_device *dm_table_get_md(struct dm_table *t);
 void dm_table_event(struct dm_table *t);
 
 /*
+ * Run the queue for request-based targets
+ */
+void dm_table_run_md_queue_async(struct dm_table *t);
+
+/*
  * The device must be suspended before calling this method.
  * Returns the previous table, which the caller must destroy.
  */
-- 
1.7.12.4


* [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03  8:18 [PATCHv4 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
                   ` (2 preceding siblings ...)
  2014-02-03  8:18 ` [PATCH 3/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
@ 2014-02-03  8:18 ` Hannes Reinecke
  2014-02-03 11:30   ` Junichi Nomura
  2014-02-03 12:08   ` Junichi Nomura
  2014-02-03  8:18 ` [PATCH 5/6] dm-multipath: reduce memory pressure during requeuing Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 6/6] dm-multipath: remove map_io() Hannes Reinecke
  5 siblings, 2 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03  8:18 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: Jun'ichi Nomura, dm-devel, Mike Snitzer

Doesn't serve any real purpose anymore; dm_table_run_queue()
will already move things to a workqueue, so we don't need
to do it ourselves.
We only need to take care to add a small delay when calling
__pg_init_all_paths() to move processing off to a workqueue;
pg_init_done() is run from an interrupt context and needs to
complete as fast as possible.

Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c | 59 +++++++++++++++------------------------------------
 1 file changed, 17 insertions(+), 42 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 5373ca9..b11e3b3 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -93,8 +93,6 @@ struct multipath {
 	unsigned pg_init_count;		/* Number of times pg_init called */
 	unsigned pg_init_delay_msecs;	/* Number of msecs before pg_init retry */
 
-	struct work_struct process_queued_ios;
-
 	struct work_struct trigger_event;
 
 	/*
@@ -119,7 +117,6 @@ typedef int (*action_fn) (struct pgpath *pgpath);
 static struct kmem_cache *_mpio_cache;
 
 static struct workqueue_struct *kmultipathd, *kmpath_handlerd;
-static void process_queued_ios(struct work_struct *work);
 static void trigger_event(struct work_struct *work);
 static void activate_path(struct work_struct *work);
 static int __pgpath_busy(struct pgpath *pgpath);
@@ -197,7 +194,6 @@ static struct multipath *alloc_multipath(struct dm_target *ti)
 		spin_lock_init(&m->lock);
 		m->queue_io = 1;
 		m->pg_init_delay_msecs = DM_PG_INIT_DELAY_DEFAULT;
-		INIT_WORK(&m->process_queued_ios, process_queued_ios);
 		INIT_WORK(&m->trigger_event, trigger_event);
 		init_waitqueue_head(&m->pg_init_wait);
 		mutex_init(&m->work_mutex);
@@ -254,10 +250,10 @@ static void clear_mapinfo(struct multipath *m, union map_info *info)
  * Path selection
  *-----------------------------------------------*/
 
-static void __pg_init_all_paths(struct multipath *m)
+static void __pg_init_all_paths(struct multipath *m, unsigned long min_delay)
 {
 	struct pgpath *pgpath;
-	unsigned long pg_init_delay = 0;
+	unsigned long pg_init_delay = min_delay;
 
 	if (m->pg_init_in_progress || m->pg_init_disabled)
 		return;
@@ -406,7 +402,7 @@ static int map_io(struct multipath *m, struct request *clone,
 							      &pgpath->path,
 							      nr_bytes);
 		} else {
-			__pg_init_all_paths(m);
+			__pg_init_all_paths(m, 0);
 			r = DM_MAPIO_REQUEUE;
 		}
 	} else {
@@ -438,41 +434,13 @@ static int queue_if_no_path(struct multipath *m, unsigned queue_if_no_path,
 		m->saved_queue_if_no_path = queue_if_no_path;
 	m->queue_if_no_path = queue_if_no_path;
 	if (!m->queue_if_no_path)
-		queue_work(kmultipathd, &m->process_queued_ios);
+		dm_table_run_md_queue_async(m->ti->table);
 
 	spin_unlock_irqrestore(&m->lock, flags);
 
 	return 0;
 }
 
-static void process_queued_ios(struct work_struct *work)
-{
-	struct multipath *m =
-		container_of(work, struct multipath, process_queued_ios);
-	struct pgpath *pgpath = NULL;
-	unsigned must_queue = 1;
-	unsigned long flags;
-
-	spin_lock_irqsave(&m->lock, flags);
-
-	if (!m->current_pgpath)
-		__choose_pgpath(m, 0);
-
-	pgpath = m->current_pgpath;
-
-	if ((pgpath && !m->queue_io) ||
-	    (!pgpath && !m->queue_if_no_path))
-		must_queue = 0;
-
-	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
-	    !m->pg_init_disabled)
-		__pg_init_all_paths(m);
-
-	spin_unlock_irqrestore(&m->lock, flags);
-	if (!must_queue)
-		dm_table_run_md_queue_async(m->ti->table);
-}
-
 /*
  * An event is triggered whenever a path is taken out of use.
  * Includes path failure and PG bypass.
@@ -1019,7 +987,7 @@ static int reinstate_path(struct pgpath *pgpath)
 
 	if (!m->nr_valid_paths++) {
 		m->current_pgpath = NULL;
-		queue_work(kmultipathd, &m->process_queued_ios);
+		dm_table_run_md_queue_async(m->ti->table);
 	} else if (m->hw_handler_name && (m->current_pg == pgpath->pg)) {
 		if (queue_work(kmpath_handlerd, &pgpath->activate_path.work))
 			m->pg_init_in_progress++;
@@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
 
 	if (!m->pg_init_required)
 		m->queue_io = 0;
-
-	m->pg_init_delay_retry = delay_retry;
-	queue_work(kmultipathd, &m->process_queued_ios);
+	else {
+		m->pg_init_delay_retry = delay_retry;
+		__pg_init_all_paths(m, 50/HZ);
+		goto out;
+	}
 
 	/*
 	 * Wake up any thread waiting to suspend.
@@ -1593,8 +1563,13 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
 	if (!r && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT)
 		r = scsi_verify_blk_ioctl(NULL, cmd);
 
-	if (r == -ENOTCONN && !fatal_signal_pending(current))
-		queue_work(kmultipathd, &m->process_queued_ios);
+	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
+		spin_lock_irqsave(&m->lock, flags);
+		if (m->current_pgpath && m->pg_init_required)
+			__pg_init_all_paths(m, 0);
+		spin_unlock_irqrestore(&m->lock, flags);
+		dm_table_run_md_queue_async(m->ti->table);
+	}
 
 	return r ? : __blkdev_driver_ioctl(bdev, mode, cmd, arg);
 }
-- 
1.7.12.4


* [PATCH 5/6] dm-multipath: reduce memory pressure during requeuing
  2014-02-03  8:18 [PATCHv4 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
                   ` (3 preceding siblings ...)
  2014-02-03  8:18 ` [PATCH 4/6] dm-multipath: remove process_queued_ios() Hannes Reinecke
@ 2014-02-03  8:18 ` Hannes Reinecke
  2014-02-03  8:18 ` [PATCH 6/6] dm-multipath: remove map_io() Hannes Reinecke
  5 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03  8:18 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: Jun'ichi Nomura, dm-devel, Mike Snitzer

When multipath needs to requeue I/O in the block layer
the per-request context shouldn't be allocated, as it will
be freed immediately afterwards anyway.
Avoiding this memory allocation will reduce memory
pressure during requeuing.

Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c | 40 ++++++++++++++++------------------------
 1 file changed, 16 insertions(+), 24 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index b11e3b3..daca739 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -372,12 +372,12 @@ static int __must_push_back(struct multipath *m)
 static int map_io(struct multipath *m, struct request *clone,
 		  union map_info *map_context)
 {
-	int r = DM_MAPIO_REMAPPED;
+	int r = DM_MAPIO_REQUEUE;
 	size_t nr_bytes = blk_rq_bytes(clone);
 	unsigned long flags;
 	struct pgpath *pgpath;
 	struct block_device *bdev;
-	struct dm_mpath_io *mpio = map_context->ptr;
+	struct dm_mpath_io *mpio;
 
 	spin_lock_irqsave(&m->lock, flags);
 
@@ -390,29 +390,31 @@ static int map_io(struct multipath *m, struct request *clone,
 
 	if (pgpath) {
 		if (__pgpath_busy(pgpath))
-			r = DM_MAPIO_REQUEUE;
-		else if (pg_ready(m)) {
+			goto out_unlock;
+
+		if (pg_ready(m)) {
+			if (set_mapinfo(m, map_context) < 0)
+				goto out_unlock;
+
 			bdev = pgpath->path.dev->bdev;
 			clone->q = bdev_get_queue(bdev);
 			clone->rq_disk = bdev->bd_disk;
+			clone->cmd_flags |= REQ_FAILFAST_TRANSPORT;
+			mpio = map_context->ptr;
 			mpio->pgpath = pgpath;
 			mpio->nr_bytes = nr_bytes;
 			if (pgpath->pg->ps.type->start_io)
 				pgpath->pg->ps.type->start_io(&pgpath->pg->ps,
 							      &pgpath->path,
 							      nr_bytes);
-		} else {
-			__pg_init_all_paths(m, 0);
-			r = DM_MAPIO_REQUEUE;
+			r = DM_MAPIO_REMAPPED;
+			goto out_unlock;
 		}
-	} else {
-		/* No path */
-		if (__must_push_back(m))
-			r = DM_MAPIO_REQUEUE;
-		else
+		__pg_init_all_paths(m, 0);
+	} else if (!__must_push_back(m))
 			r = -EIO;	/* Failed */
-	}
 
+out_unlock:
 	spin_unlock_irqrestore(&m->lock, flags);
 
 	return r;
@@ -908,19 +910,9 @@ static void multipath_dtr(struct dm_target *ti)
 static int multipath_map(struct dm_target *ti, struct request *clone,
 			 union map_info *map_context)
 {
-	int r;
 	struct multipath *m = (struct multipath *) ti->private;
 
-	if (set_mapinfo(m, map_context) < 0)
-		/* ENOMEM, requeue */
-		return DM_MAPIO_REQUEUE;
-
-	clone->cmd_flags |= REQ_FAILFAST_TRANSPORT;
-	r = map_io(m, clone, map_context);
-	if (r < 0 || r == DM_MAPIO_REQUEUE)
-		clear_mapinfo(m, map_context);
-
-	return r;
+	return map_io(m, clone, map_context);
 }
 
 /*
-- 
1.7.12.4


* [PATCH 6/6] dm-multipath: remove map_io()
  2014-02-03  8:18 [PATCHv4 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
                   ` (4 preceding siblings ...)
  2014-02-03  8:18 ` [PATCH 5/6] dm-multipath: reduce memory pressure during requeuing Hannes Reinecke
@ 2014-02-03  8:18 ` Hannes Reinecke
  5 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03  8:18 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: Jun'ichi Nomura, dm-devel, Mike Snitzer

multipath_map() is now just a wrapper around map_io(), so we
can fold both into multipath_map().

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index daca739..d5b512e 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -369,9 +369,13 @@ static int __must_push_back(struct multipath *m)
 
 #define pg_ready(m) (!(m)->queue_io && !(m)->pg_init_required)
 
-static int map_io(struct multipath *m, struct request *clone,
-		  union map_info *map_context)
+/*
+ * Map cloned requests
+ */
+static int multipath_map(struct dm_target *ti, struct request *clone,
+			 union map_info *map_context)
 {
+	struct multipath *m = (struct multipath *) ti->private;
 	int r = DM_MAPIO_REQUEUE;
 	size_t nr_bytes = blk_rq_bytes(clone);
 	unsigned long flags;
@@ -905,17 +909,6 @@ static void multipath_dtr(struct dm_target *ti)
 }
 
 /*
- * Map cloned requests
- */
-static int multipath_map(struct dm_target *ti, struct request *clone,
-			 union map_info *map_context)
-{
-	struct multipath *m = (struct multipath *) ti->private;
-
-	return map_io(m, clone, map_context);
-}
-
-/*
  * Take a path out of use.
  */
 static int fail_path(struct pgpath *pgpath)
-- 
1.7.12.4


* Re: [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03  8:18 ` [PATCH 4/6] dm-multipath: remove process_queued_ios() Hannes Reinecke
@ 2014-02-03 11:30   ` Junichi Nomura
  2014-02-03 11:39     ` Hannes Reinecke
  2014-02-03 12:08   ` Junichi Nomura
  1 sibling, 1 reply; 15+ messages in thread
From: Junichi Nomura @ 2014-02-03 11:30 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: dm-devel@redhat.com, Mike Snitzer, Alasdair Kergon

On 02/03/14 17:18, Hannes Reinecke wrote:
> Doesn't serve any real purpose anymore; dm_table_run_queue()
> will already move things to a workqueue, so we don't need
> to do it ourselves.
> We only need to take care to add a small delay when calling
> __pg_init_all_paths() to move processing off to a workqueue;
> pg_init_done() is run from an interrupt context and needs to
> complete as fast as possible.

I think more explanation is needed for the patch description.
As far as I understand, the change is based on the following reasoning:

  process_queued_ios() has served 3 functions:
    1) select pg and pgpath if none is selected
    2) start pg_init if requested
    3) dispatch queued IOs when pg is ready

  Basically, a call to queue_work(process_queued_ios) can be replaced by
  dm_table_run_queue(), which runs request queue and ends up calling
  map_io(), which does 1), 2) and 3).

  Exception is when !pg_ready() (= either pg_init is running or requested),
  multipath_busy() prevents map_io() being called from request_fn.

  If pg_init is running, it should be ok as far as pg_init_done() does
  the right thing when pg_init is completed. I.e. restart pg_init if
  !pg_ready() or call dm_table_run_queue() to kick map_io().

  If pg_init is requested, we have to make sure the request is detected
  and pg_init will be started.
  pg_init is requested in 3 places:
    a) __choose_pgpath() in map_io()
    b) __choose_pgpath() in multipath_ioctl()
    c) pg_init retry in pg_init_done()
  a) is ok because map_io() calls __pg_init_all_paths(), which does 2).
  b) needs a call to __pg_init_all_paths(), which does 2).
  c) needs a call to __pg_init_all_paths(), which does 2).
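
The reasoning above can be sketched as a small user-space C model of the
decision that map_io() makes after patch 3/6. This is only an illustration,
not kernel code: the struct, the enum and the model_map_io() name are
hypothetical stand-ins, and the boolean fields merely mirror the flags in
struct multipath as they appear in the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for DM_MAPIO_REMAPPED, DM_MAPIO_REQUEUE and -EIO. */
enum map_result { MAP_REMAPPED, MAP_REQUEUE, MAP_EIO };

/* Assumed, simplified view of the multipath state consulted by map_io(). */
struct mpath_state {
	bool has_path;         /* models m->current_pgpath != NULL */
	bool path_busy;        /* models __pgpath_busy(pgpath) */
	bool queue_io;         /* models m->queue_io */
	bool pg_init_required; /* models m->pg_init_required */
	bool must_push_back;   /* models __must_push_back(m) */
	int pg_init_started;   /* counts calls to the pg_init stand-in */
};

/* Same condition as the pg_ready() macro added in patch 3/6. */
static bool pg_ready(const struct mpath_state *s)
{
	return !s->queue_io && !s->pg_init_required;
}

/* Decide what happens to one request; nothing is queued internally. */
static enum map_result model_map_io(struct mpath_state *s)
{
	if (!s->has_path)
		return s->must_push_back ? MAP_REQUEUE : MAP_EIO;
	if (s->path_busy)
		return MAP_REQUEUE;
	if (pg_ready(s))
		return MAP_REMAPPED;
	/* Path selected but pg not ready: start pg_init and push back. */
	s->pg_init_started++;
	return MAP_REQUEUE;
}
```

The point of the model is that every outcome other than "remapped" or
"failed" hands the request back to the block layer, which is why a separate
queued_ios list is no longer needed.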


By writing the above, I found possible bugs related to 1):

> @@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
>  
>  	if (!m->pg_init_required)
>  		m->queue_io = 0;
> -
> -	m->pg_init_delay_retry = delay_retry;
> -	queue_work(kmultipathd, &m->process_queued_ios);
> +	else {
> +		m->pg_init_delay_retry = delay_retry;
> +		__pg_init_all_paths(m, 50/HZ);
> +		goto out;
> +	}
>  
>  	/*
>  	 * Wake up any thread waiting to suspend.

It is possible that m->current_pg is NULL.
(E.g. pg_init failed for current_pgpath, bypass_pg() was called, etc.)
__pg_init_all_paths() will cause oops in such a case.

So how about doing this in pg_init_done():

	if (m->pg_init_required) {
		m->pg_init_delay_retry = delay_retry;
		if (__pg_init_all_paths(m))
			goto out;
	}

	/* pg_init successfully completed */
	m->queue_io = 0;

and in __pg_init_all_paths(), do something like:

	m->pg_init_required = 0;
	...
	if (!m->current_pg)
		return 0;
	...
	return m->pg_init_in_progress;
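
A minimal user-space sketch of the suggested contract, assuming that
__pg_init_all_paths() is changed to return the number of pg_init operations
now in progress. The struct and the model_* names are hypothetical, the
active_paths field stands in for the loop that queues activate_path work,
and the parts elided with "..." above are not modelled.

```c
#include <stdbool.h>

/* Assumed, simplified view of the state touched by the suggestion. */
struct pg_model {
	bool have_current_pg;   /* models m->current_pg != NULL */
	bool pg_init_required;  /* models m->pg_init_required */
	bool queue_io;          /* models m->queue_io */
	int pg_init_in_progress;
	int active_paths;       /* paths that would get activate_path queued */
};

/* Returns how many pg_init operations are in progress afterwards. */
static int model_pg_init_all_paths(struct pg_model *m)
{
	m->pg_init_required = false;
	if (!m->have_current_pg)
		return 0;       /* nothing to initialise, avoid the NULL deref */
	m->pg_init_in_progress += m->active_paths;
	return m->pg_init_in_progress;
}

/* Returns true when a retry was started (the "goto out" case). */
static bool model_pg_init_done_tail(struct pg_model *m)
{
	if (m->pg_init_required && model_pg_init_all_paths(m))
		return true;
	/* pg_init completed, or no retry could be started */
	m->queue_io = false;
	return false;
}
```

In this model a NULL current_pg no longer dereferences anything; the request
flag is cleared and queue_io is dropped so that the queue can run again.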


> @@ -1593,8 +1563,13 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
>  	if (!r && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT)
>  		r = scsi_verify_blk_ioctl(NULL, cmd);
>  
> -	if (r == -ENOTCONN && !fatal_signal_pending(current))
> -		queue_work(kmultipathd, &m->process_queued_ios);
> +	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
> +		spin_lock_irqsave(&m->lock, flags);
> +		if (m->current_pgpath && m->pg_init_required)
> +			__pg_init_all_paths(m, 0);
> +		spin_unlock_irqrestore(&m->lock, flags);
> +		dm_table_run_md_queue_async(m->ti->table);
> +	}
>  
>  	return r ? : __blkdev_driver_ioctl(bdev, mode, cmd, arg);
>  }

Similarly, m->current_pgpath can be NULL here while pg_init_required.
Then pg_init_required is left uncleared and all IOs in the queue will
stall until somebody calls multipath_ioctl() to redo pg selection.
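
The stall can be illustrated with a small user-space model. It assumes that
multipath_busy() reports busy whenever !pg_ready(), as in patch 3/6, and
that starting pg_init clears pg_init_required; the struct and the model_*
names are hypothetical.

```c
#include <stdbool.h>

/* Assumed, simplified view of the flags involved in the ioctl path. */
struct ioctl_model {
	bool have_current_pgpath; /* models m->current_pgpath != NULL */
	bool pg_init_required;    /* models m->pg_init_required */
	bool queue_io;            /* models m->queue_io */
};

/* Busy whenever the pg is not ready, so request_fn never calls map_io(). */
static bool model_multipath_busy(const struct ioctl_model *m)
{
	return m->queue_io || m->pg_init_required;
}

/* The -ENOTCONN branch of multipath_ioctl() as written in this patch. */
static void model_ioctl_enotconn_tail(struct ioctl_model *m)
{
	if (m->have_current_pgpath && m->pg_init_required)
		m->pg_init_required = false; /* pg_init started, flag cleared */
	/* With no current_pgpath the flag is left set: nothing clears it. */
}
```

With have_current_pgpath false and pg_init_required set, the model stays
busy after the ioctl branch has run, which corresponds to the stalled queue
described above.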

-- 
Jun'ichi Nomura, NEC Corporation


* Re: [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03 11:30   ` Junichi Nomura
@ 2014-02-03 11:39     ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03 11:39 UTC (permalink / raw)
  To: Junichi Nomura; +Cc: dm-devel@redhat.com, Mike Snitzer, Alasdair Kergon

On 02/03/2014 12:30 PM, Junichi Nomura wrote:
> On 02/03/14 17:18, Hannes Reinecke wrote:
>> Doesn't serve any real purpose anymore; dm_table_run_queue()
>> will already move things to a workqueue, so we don't need
>> to do it ourselves.
>> We only need to take care to add a small delay when calling
>> __pg_init_all_paths() to move processing off to a workqueue;
>> pg_init_done() is run from an interrupt context and needs to
>> complete as fast as possible.
> 
> I think more explanation is needed for the patch description.
> As far as I understand, the change is based on the following reasoning:
> 
>   process_queued_ios() has served 3 functions:
>     1) select pg and pgpath if none is selected
>     2) start pg_init if requested
>     3) dispatch queued IOs when pg is ready
> 
>   Basically, a call to queue_work(process_queued_ios) can be replaced by
>   dm_table_run_queue(), which runs request queue and ends up calling
>   map_io(), which does 1), 2) and 3).
> 
Yes.

>   Exception is when !pg_ready() (= either pg_init is running or requested),
>   multipath_busy() prevents map_io() being called from request_fn.
> 
>   If pg_init is running, it should be ok as far as pg_init_done() does
>   the right thing when pg_init is completed. I.e. restart pg_init if
>   !pg_ready() or call dm_table_run_queue() to kick map_io().
> 
>   If pg_init is requested, we have to make sure the request is detected
>   and pg_init will be started.
>   pg_init is requested in 3 places:
>     a) __choose_pgpath() in map_io()
>     b) __choose_pgpath() in multipath_ioctl()
>     c) pg_init retry in pg_init_done()
>   a) is ok because map_io() calls __pg_init_all_paths(), which does 2).
>   b) needs a call to __pg_init_all_paths(), which does 2).
>   c) needs a call to __pg_init_all_paths(), which does 2).
> 
> 
> By writing the above, I found possible bugs related to 1):
> 
>> @@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
>>  
>>  	if (!m->pg_init_required)
>>  		m->queue_io = 0;
>> -
>> -	m->pg_init_delay_retry = delay_retry;
>> -	queue_work(kmultipathd, &m->process_queued_ios);
>> +	else {
>> +		m->pg_init_delay_retry = delay_retry;
>> +		__pg_init_all_paths(m, 50/HZ);
>> +		goto out;
>> +	}
>>  
>>  	/*
>>  	 * Wake up any thread waiting to suspend.
> 
> It is possible that m->current_pg is NULL.
> (E.g. pg_init failed for current_pgpath, bypass_pg() was called, etc.)
> __pg_init_all_paths() will cause oops in such a case.
> 
Ah. Right.

> So how about doing this in pg_init_done():
> 
> 	if (m->pg_init_required) {
> 		m->pg_init_delay_retry = delay_retry;
> 		if (__pg_init_all_paths(m))
> 			goto out;
> 	}
> 
> 	/* pg_init successfully completed */
> 	m->queue_io = 0;
> 
> and in __pg_init_all_paths(), do something like:
> 
> 	m->pg_init_required = 0;
> 	...
> 	if (!m->current_pg)
> 		return 0;
> 	...
> 	return m->pg_init_in_progress;
> 
> 
Hmm. That still wouldn't be doing the right thing.
'fail_path' in pg_init_done() might be setting current_pg to NULL,
but this doesn't mean that the entire path group is invalid.
It just means that this particular path is invalid, and we still
might need to retry pg_init for the other paths.
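
To make the NULL case concrete, here is a minimal userspace sketch of the
ordering under discussion: reselect a path group first, and only then start
pg_init. It is a toy model, not dm-mpath code. The field names follow the
thread, but struct toy_pg, struct toy_multipath, toy_pg_init_all_paths() and
toy_retry_pg_init() are hypothetical, and the 'candidate' argument stands in
for whatever path selection would return.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a priority group: only a path count. */
struct toy_pg {
	int nr_paths;
};

/* Hypothetical stand-in for struct multipath; field names follow the thread. */
struct toy_multipath {
	struct toy_pg *current_pg;	/* may be NULL after a path failure */
	bool pg_init_required;
	int pg_init_in_progress;
};

/*
 * Start pg_init on every path of the current group.
 * Returns the number of activations started. With no group selected it
 * returns 0 and leaves pg_init_required set instead of dereferencing NULL.
 */
int toy_pg_init_all_paths(struct toy_multipath *m)
{
	if (!m->current_pg)
		return 0;

	m->pg_init_required = false;
	m->pg_init_in_progress += m->current_pg->nr_paths;
	return m->current_pg->nr_paths;
}

/*
 * Retry helper: if no group is current, adopt 'candidate' (which may itself
 * be NULL when no usable group is left), then start pg_init if it is still
 * required.
 */
int toy_retry_pg_init(struct toy_multipath *m, struct toy_pg *candidate)
{
	if (!m->current_pg)
		m->current_pg = candidate;

	if (m->current_pg && m->pg_init_required)
		return toy_pg_init_all_paths(m);

	return 0;
}
```

With current_pg == NULL and no candidate, nothing is started and
pg_init_required stays set, so a later retry can still pick it up once a
group becomes selectable.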

>> @@ -1593,8 +1563,13 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
>>  	if (!r && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT)
>>  		r = scsi_verify_blk_ioctl(NULL, cmd);
>>  
>> -	if (r == -ENOTCONN && !fatal_signal_pending(current))
>> -		queue_work(kmultipathd, &m->process_queued_ios);
>> +	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
>> +		spin_lock_irqsave(&m->lock, flags);
>> +		if (m->current_pgpath && m->pg_init_required)
>> +			__pg_init_all_paths(m, 0);
>> +		spin_unlock_irqrestore(&m->lock, flags);
>> +		dm_table_run_md_queue_async(m->ti->table);
>> +	}
>>  
>>  	return r ? : __blkdev_driver_ioctl(bdev, mode, cmd, arg);
>>  }
> 
> Similarly, m->current_pgpath can be NULL here while pg_init_required is set.
> pg_init_required is then never cleared, and all IOs in the queue will
> stall until somebody calls multipath_ioctl() to redo pg selection.
> 
Ok, correct. Will be fixing it up.

Thanks for the review.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03  8:18 ` [PATCH 4/6] dm-multipath: remove process_queued_ios() Hannes Reinecke
  2014-02-03 11:30   ` Junichi Nomura
@ 2014-02-03 12:08   ` Junichi Nomura
  2014-02-03 12:18     ` Hannes Reinecke
  1 sibling, 1 reply; 15+ messages in thread
From: Junichi Nomura @ 2014-02-03 12:08 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: device-mapper development, Mike Snitzer, Alasdair Kergon

On 02/03/14 17:18, Hannes Reinecke wrote:
> We only need to take care to add a small delay when calling
> __pg_init_all_paths() to move processing off to a workqueue;
> pg_init_done() is run from an interrupt context and needs to
> complete as fast as possible.
...
> @@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
>  
>  	if (!m->pg_init_required)
>  		m->queue_io = 0;
> -
> -	m->pg_init_delay_retry = delay_retry;
> -	queue_work(kmultipathd, &m->process_queued_ios);
> +	else {
> +		m->pg_init_delay_retry = delay_retry;
> +		__pg_init_all_paths(m, 50/HZ);
> +		goto out;
> +	}
>  

I forgot to comment on this.
Adding a delay to queue_work() doesn't make it faster.
So I couldn't see why this "50/HZ" delay has to be added
or where this value comes from.

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03 12:08   ` Junichi Nomura
@ 2014-02-03 12:18     ` Hannes Reinecke
  2014-02-03 12:39       ` Junichi Nomura
  0 siblings, 1 reply; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03 12:18 UTC (permalink / raw)
  To: Junichi Nomura; +Cc: device-mapper development, Mike Snitzer, Alasdair Kergon

On 02/03/2014 01:08 PM, Junichi Nomura wrote:
> On 02/03/14 17:18, Hannes Reinecke wrote:
>> We only need to take care to add a small delay when calling
>> __pg_init_all_paths() to move processing off to a workqueue;
>> pg_init_done() is run from an interrupt context and needs to
>> complete as fast as possible.
> ...
>> @@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
>>  
>>  	if (!m->pg_init_required)
>>  		m->queue_io = 0;
>> -
>> -	m->pg_init_delay_retry = delay_retry;
>> -	queue_work(kmultipathd, &m->process_queued_ios);
>> +	else {
>> +		m->pg_init_delay_retry = delay_retry;
>> +		__pg_init_all_paths(m, 50/HZ);
>> +		goto out;
>> +	}
>>  
> 
> I forgot to comment on this.
> Adding a delay to queue_work() doesn't make it faster.
> So I couldn't see why this "50/HZ" delay has to be added
> or where this value comes from.
> 
Well, it probably wasn't the best choice of words.

Thing is, without a delay the workqueue item will be executed
directly (cf __queue_delayed_work()).
But pg_init_done() is run from an interrupt context, and as such any
memory allocations have to be atomic.
So if we were to call queue_delayed_work() without any delay
we will end up calling scsi_dh_activate from an interrupt context,
too, but there we most definitely do _not_ have only atomic allocations.
Hence the delay.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03 12:34 [PATCHv5 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
@ 2014-02-03 12:34 ` Hannes Reinecke
  0 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03 12:34 UTC (permalink / raw)
  To: Alasdair Kergon; +Cc: Jun'ichi Nomura, dm-devel, Mike Snitzer

process_queued_ios() has served 3 functions:
  1) select pg and pgpath if none is selected
  2) start pg_init if requested
  3) dispatch queued IOs when pg is ready

Basically, a call to queue_work(process_queued_ios) can be
replaced by dm_table_run_queue(), which runs the request queue
and ends up calling map_io(), which does 1), 2) and 3).

The exception is when !pg_ready() (= pg_init is either running
or requested); then multipath_busy() prevents map_io() from being
called from request_fn.

If pg_init is running, this is fine as long as pg_init_done() does
the right thing when pg_init completes, i.e. restarts pg_init if
!pg_ready() or calls dm_table_run_queue() to kick map_io().

If pg_init is requested, we have to make sure the request is detected
and pg_init will be started.
pg_init is requested in 3 places:
  a) __choose_pgpath() in map_io()
  b) __choose_pgpath() in multipath_ioctl()
  c) pg_init retry in pg_init_done()
a) is ok because map_io() calls __pg_init_all_paths(), which does 2).
b) needs a call to __pg_init_all_paths(), which does 2).
c) needs a call to __pg_init_all_paths(), which does 2).

So this patch removes process_queued_ios() and ensures that
__pg_init_all_paths() is called at the appropriate locations.

We only need to take care to add a small delay when calling
__pg_init_all_paths() to move processing off to a workqueue;
otherwise pg_init_done() might end up calling scsi_dh_activate()
directly, which might use non-atomic memory allocations,
not to mention issue I/O.

Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c | 63 +++++++++++++++++----------------------------------
 1 file changed, 21 insertions(+), 42 deletions(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 5373ca9..ff3bf3d 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -93,8 +93,6 @@ struct multipath {
 	unsigned pg_init_count;		/* Number of times pg_init called */
 	unsigned pg_init_delay_msecs;	/* Number of msecs before pg_init retry */
 
-	struct work_struct process_queued_ios;
-
 	struct work_struct trigger_event;
 
 	/*
@@ -119,7 +117,6 @@ typedef int (*action_fn) (struct pgpath *pgpath);
 static struct kmem_cache *_mpio_cache;
 
 static struct workqueue_struct *kmultipathd, *kmpath_handlerd;
-static void process_queued_ios(struct work_struct *work);
 static void trigger_event(struct work_struct *work);
 static void activate_path(struct work_struct *work);
 static int __pgpath_busy(struct pgpath *pgpath);
@@ -197,7 +194,6 @@ static struct multipath *alloc_multipath(struct dm_target *ti)
 		spin_lock_init(&m->lock);
 		m->queue_io = 1;
 		m->pg_init_delay_msecs = DM_PG_INIT_DELAY_DEFAULT;
-		INIT_WORK(&m->process_queued_ios, process_queued_ios);
 		INIT_WORK(&m->trigger_event, trigger_event);
 		init_waitqueue_head(&m->pg_init_wait);
 		mutex_init(&m->work_mutex);
@@ -254,10 +250,10 @@ static void clear_mapinfo(struct multipath *m, union map_info *info)
  * Path selection
  *-----------------------------------------------*/
 
-static void __pg_init_all_paths(struct multipath *m)
+static void __pg_init_all_paths(struct multipath *m, unsigned long min_delay)
 {
 	struct pgpath *pgpath;
-	unsigned long pg_init_delay = 0;
+	unsigned long pg_init_delay = min_delay;
 
 	if (m->pg_init_in_progress || m->pg_init_disabled)
 		return;
@@ -406,7 +402,7 @@ static int map_io(struct multipath *m, struct request *clone,
 							      &pgpath->path,
 							      nr_bytes);
 		} else {
-			__pg_init_all_paths(m);
+			__pg_init_all_paths(m, 0);
 			r = DM_MAPIO_REQUEUE;
 		}
 	} else {
@@ -438,41 +434,13 @@ static int queue_if_no_path(struct multipath *m, unsigned queue_if_no_path,
 		m->saved_queue_if_no_path = queue_if_no_path;
 	m->queue_if_no_path = queue_if_no_path;
 	if (!m->queue_if_no_path)
-		queue_work(kmultipathd, &m->process_queued_ios);
+		dm_table_run_md_queue_async(m->ti->table);
 
 	spin_unlock_irqrestore(&m->lock, flags);
 
 	return 0;
 }
 
-static void process_queued_ios(struct work_struct *work)
-{
-	struct multipath *m =
-		container_of(work, struct multipath, process_queued_ios);
-	struct pgpath *pgpath = NULL;
-	unsigned must_queue = 1;
-	unsigned long flags;
-
-	spin_lock_irqsave(&m->lock, flags);
-
-	if (!m->current_pgpath)
-		__choose_pgpath(m, 0);
-
-	pgpath = m->current_pgpath;
-
-	if ((pgpath && !m->queue_io) ||
-	    (!pgpath && !m->queue_if_no_path))
-		must_queue = 0;
-
-	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
-	    !m->pg_init_disabled)
-		__pg_init_all_paths(m);
-
-	spin_unlock_irqrestore(&m->lock, flags);
-	if (!must_queue)
-		dm_table_run_md_queue_async(m->ti->table);
-}
-
 /*
  * An event is triggered whenever a path is taken out of use.
  * Includes path failure and PG bypass.
@@ -1019,7 +987,7 @@ static int reinstate_path(struct pgpath *pgpath)
 
 	if (!m->nr_valid_paths++) {
 		m->current_pgpath = NULL;
-		queue_work(kmultipathd, &m->process_queued_ios);
+		dm_table_run_md_queue_async(m->ti->table);
 	} else if (m->hw_handler_name && (m->current_pg == pgpath->pg)) {
 		if (queue_work(kmpath_handlerd, &pgpath->activate_path.work))
 			m->pg_init_in_progress++;
@@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
 
 	if (!m->pg_init_required)
 		m->queue_io = 0;
-
-	m->pg_init_delay_retry = delay_retry;
-	queue_work(kmultipathd, &m->process_queued_ios);
+	else if (m->current_pg) {
+		m->pg_init_delay_retry = delay_retry;
+		__pg_init_all_paths(m, 50/HZ);
+		goto out;
+	}
 
 	/*
 	 * Wake up any thread waiting to suspend.
@@ -1593,8 +1563,17 @@ static int multipath_ioctl(struct dm_target *ti, unsigned int cmd,
 	if (!r && ti->len != i_size_read(bdev->bd_inode) >> SECTOR_SHIFT)
 		r = scsi_verify_blk_ioctl(NULL, cmd);
 
-	if (r == -ENOTCONN && !fatal_signal_pending(current))
-		queue_work(kmultipathd, &m->process_queued_ios);
+	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
+		spin_lock_irqsave(&m->lock, flags);
+		if (!m->current_pg) {
+			/* Path status changed, redo selection */
+			__choose_pgpath(m, 0);
+		}
+		if (m->current_pg && m->pg_init_required)
+			__pg_init_all_paths(m, 0);
+		spin_unlock_irqrestore(&m->lock, flags);
+		dm_table_run_md_queue_async(m->ti->table);
+	}
 
 	return r ? : __blkdev_driver_ioctl(bdev, mode, cmd, arg);
 }
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03 12:18     ` Hannes Reinecke
@ 2014-02-03 12:39       ` Junichi Nomura
  2014-02-03 12:57         ` Hannes Reinecke
  0 siblings, 1 reply; 15+ messages in thread
From: Junichi Nomura @ 2014-02-03 12:39 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: device-mapper development, Mike Snitzer, Alasdair Kergon

On 02/03/14 21:18, Hannes Reinecke wrote:
> On 02/03/2014 01:08 PM, Junichi Nomura wrote:
>> On 02/03/14 17:18, Hannes Reinecke wrote:
>>> We only need to take care to add a small delay when calling
>>> __pg_init_all_paths() to move processing off to a workqueue;
>>> pg_init_done() is run from an interrupt context and needs to
>>> complete as fast as possible.
>> ...
>>> @@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
>>>  
>>>  	if (!m->pg_init_required)
>>>  		m->queue_io = 0;
>>> -
>>> -	m->pg_init_delay_retry = delay_retry;
>>> -	queue_work(kmultipathd, &m->process_queued_ios);
>>> +	else {
>>> +		m->pg_init_delay_retry = delay_retry;
>>> +		__pg_init_all_paths(m, 50/HZ);
>>> +		goto out;
>>> +	}
>>>  
>>
>> I forgot to comment on this.
>> Adding a delay to queue_work() doesn't make it faster.
>> So I couldn't see why this "50/HZ" delay has to be added
>> or where this value comes from.
>>
> Well, it probably wasn't the best choice of words.
> 
> Thing is, without a delay the workqueue item will be executed
> directly (cf __queue_delayed_work()).
> But pg_init_done() is run from an interrupt context, and as such any
> memory allocations have to be atomic.
> So if we were to call queue_delayed_work() without any delay
> we will end up calling scsi_dh_activate from an interrupt context,
> too, but there we most definitely do _not_ have only atomic allocations.
> Hence the delay.

Work is executed in the worker context (in this case by kmpath_handlerd).
Isn't it?
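
A toy, single-threaded model of the point above: queueing a work item only
records it, and the function body runs later, when the worker gets to it,
whatever context the queueing call was made from. This is a sketch, not the
kernel workqueue implementation. All toy_* names are hypothetical, and the
only assumption carried over from the real API is that queueing hands the
work to a worker rather than calling the function itself.

```c
#include <stddef.h>

#define TOY_MAX_WORK 8

typedef void (*toy_work_fn)(void *arg);

/* One queued item: a function and its argument. */
struct toy_work {
	toy_work_fn fn;
	void *arg;
};

/* Fixed-size FIFO standing in for a workqueue. */
struct toy_workqueue {
	struct toy_work items[TOY_MAX_WORK];
	size_t head;	/* next item the worker will run */
	size_t tail;	/* next free slot */
};

/*
 * Record the work item. The function is NOT called here, so the caller's
 * context (interrupt or otherwise) never executes it.
 * Returns 0 on success, -1 if the queue is full.
 */
int toy_queue_work(struct toy_workqueue *wq, toy_work_fn fn, void *arg)
{
	if (wq->tail == TOY_MAX_WORK)
		return -1;

	wq->items[wq->tail].fn = fn;
	wq->items[wq->tail].arg = arg;
	wq->tail++;
	return 0;
}

/*
 * Stand-in for the worker thread: run everything queued so far.
 * Returns the number of items executed.
 */
size_t toy_worker_run(struct toy_workqueue *wq)
{
	size_t ran = 0;

	while (wq->head < wq->tail) {
		struct toy_work *w = &wq->items[wq->head++];

		w->fn(w->arg);
		ran++;
	}
	return ran;
}

/* Example work function: counts how often it has been invoked. */
void toy_count(void *arg)
{
	(*(int *)arg)++;
}
```

After toy_queue_work(&wq, toy_count, &calls) the counter is still 0; it only
becomes 1 once toy_worker_run(&wq) has been called.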

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03 12:39       ` Junichi Nomura
@ 2014-02-03 12:57         ` Hannes Reinecke
  2014-02-04  3:21           ` Junichi Nomura
  0 siblings, 1 reply; 15+ messages in thread
From: Hannes Reinecke @ 2014-02-03 12:57 UTC (permalink / raw)
  To: Junichi Nomura; +Cc: device-mapper development, Mike Snitzer, Alasdair Kergon

On 02/03/2014 01:39 PM, Junichi Nomura wrote:
> On 02/03/14 21:18, Hannes Reinecke wrote:
>> On 02/03/2014 01:08 PM, Junichi Nomura wrote:
>>> On 02/03/14 17:18, Hannes Reinecke wrote:
>>>> We only need to take care to add a small delay when calling
>>>> __pg_init_all_paths() to move processing off to a workqueue;
>>>> pg_init_done() is run from an interrupt context and needs to
>>>> complete as fast as possible.
>>> ...
>>>> @@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
>>>>  
>>>>  	if (!m->pg_init_required)
>>>>  		m->queue_io = 0;
>>>> -
>>>> -	m->pg_init_delay_retry = delay_retry;
>>>> -	queue_work(kmultipathd, &m->process_queued_ios);
>>>> +	else {
>>>> +		m->pg_init_delay_retry = delay_retry;
>>>> +		__pg_init_all_paths(m, 50/HZ);
>>>> +		goto out;
>>>> +	}
>>>>  
>>>
>>> I forgot to comment on this.
>>> Adding a delay to queue_work() doesn't make it faster.
>>> So I couldn't see why this "50/HZ" delay has to be added
>>> or where this value comes from.
>>>
>> Well, it probably wasn't the best choice of words.
>>
>> Thing is, without a delay the workqueue item will be executed
>> directly (cf __queue_delayed_work()).
>> But pg_init_done() is run from an interrupt context, and as such any
>> memory allocations have to be atomic.
>> So if we were to call queue_delayed_work() without any delay
>> we will end up calling scsi_dh_activate from an interrupt context,
>> too, but there we most definitely do _not_ have only atomic allocations.
>> Hence the delay.
> 
> Work is executed in the worker context (in this case by kmpath_handlerd).
> Isn't it?
> 
Yes, but without the delay we'd be scheduling during pg_init_done(),
i.e. within an interrupt context.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/6] dm-multipath: remove process_queued_ios()
  2014-02-03 12:57         ` Hannes Reinecke
@ 2014-02-04  3:21           ` Junichi Nomura
  0 siblings, 0 replies; 15+ messages in thread
From: Junichi Nomura @ 2014-02-04  3:21 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: device-mapper development, Mike Snitzer, Alasdair Kergon

On 02/03/14 21:57, Hannes Reinecke wrote:
> On 02/03/2014 01:39 PM, Junichi Nomura wrote:
>> On 02/03/14 21:18, Hannes Reinecke wrote:
>>> On 02/03/2014 01:08 PM, Junichi Nomura wrote:
>>>> On 02/03/14 17:18, Hannes Reinecke wrote:
>>>>> We only need to take care to add a small delay when calling
>>>>> __pg_init_all_paths() to move processing off to a workqueue;
>>>>> pg_init_done() is run from an interrupt context and needs to
>>>>> complete as fast as possible.
>>>> ...
>>>>> @@ -1217,9 +1185,11 @@ static void pg_init_done(void *data, int errors)
>>>>>  
>>>>>  	if (!m->pg_init_required)
>>>>>  		m->queue_io = 0;
>>>>> -
>>>>> -	m->pg_init_delay_retry = delay_retry;
>>>>> -	queue_work(kmultipathd, &m->process_queued_ios);
>>>>> +	else {
>>>>> +		m->pg_init_delay_retry = delay_retry;
>>>>> +		__pg_init_all_paths(m, 50/HZ);
>>>>> +		goto out;
>>>>> +	}
>>>>>  
>>>>
>>>> I forgot to comment on this.
>>>> Adding a delay to queue_work() doesn't make it faster.
>>>> So I couldn't see why this "50/HZ" delay has to be added
>>>> or where this value comes from.
>>>>
>>> Well, it probably wasn't the best choice of words.
>>>
>>> Thing is, without a delay the workqueue item will be executed
>>> directly (cf __queue_delayed_work()).
>>> But pg_init_done() is run from an interrupt context, and as such any
>>> memory allocations have to be atomic.
>>> So if we were to call queue_delayed_work() without any delay
>>> we will end up calling scsi_dh_activate from an interrupt context,
>>> too, but there we most definitely do _not_ have only atomic allocations.
>>> Hence the delay.
>>
>> Work is executed in the worker context (in this case by kmpath_handlerd).
>> Isn't it?
>>
> Yes, but without the delay we'd be scheduling during pg_init_done(),
> i.e. within an interrupt context.

Could you elaborate on the problem you are going to solve?
If scheduling happens in interrupt context, it's a bug.
And if such a bug exists, it should be there even without your
patch series.

Besides, 50/HZ is 0 unless your HZ is extremely low.
So the code won't work as you intended anyway...
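
To spell out the arithmetic, a small userspace sketch: HZ is a compile-time
constant in the kernel, but it is passed in as a parameter here so that
several values can be compared. Both helpers are hypothetical. The second one
assumes the intent was a delay of about 50 ms, which is only a guess; in the
kernel itself msecs_to_jiffies() is the usual way to do that conversion.

```c
/* The expression from the patch: integer division truncates toward zero. */
unsigned long naive_delay_jiffies(unsigned long hz)
{
	return 50 / hz;
}

/*
 * Convert milliseconds to ticks at a given tick rate, rounding up so that
 * a nonzero delay never collapses to zero ticks.
 */
unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
	return (ms * hz + 999) / 1000;
}
```

For HZ values of 100, 250 and 1000, naive_delay_jiffies() returns 0, while
ms_to_ticks(50, hz) returns 5, 13 and 50 ticks respectively. The naive form
only becomes nonzero once hz is 50 or lower.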

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-02-04  3:21 UTC | newest]

Thread overview: 15+ messages
2014-02-03  8:18 [PATCHv4 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
2014-02-03  8:18 ` [PATCH 1/6] dm-multipath: Do not call pg_init twice Hannes Reinecke
2014-02-03  8:18 ` [PATCH 2/6] dm: implement dm_md_get_queue() Hannes Reinecke
2014-02-03  8:18 ` [PATCH 3/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
2014-02-03  8:18 ` [PATCH 4/6] dm-multipath: remove process_queued_ios() Hannes Reinecke
2014-02-03 11:30   ` Junichi Nomura
2014-02-03 11:39     ` Hannes Reinecke
2014-02-03 12:08   ` Junichi Nomura
2014-02-03 12:18     ` Hannes Reinecke
2014-02-03 12:39       ` Junichi Nomura
2014-02-03 12:57         ` Hannes Reinecke
2014-02-04  3:21           ` Junichi Nomura
2014-02-03  8:18 ` [PATCH 5/6] dm-multipath: reduce memory pressure during requeuing Hannes Reinecke
2014-02-03  8:18 ` [PATCH 6/6] dm-multipath: remove map_io() Hannes Reinecke
  -- strict thread matches above, loose matches on Subject: below --
2014-02-03 12:34 [PATCHv5 0/6] dm-multipath: push back requests instead of queueing Hannes Reinecke
2014-02-03 12:34 ` [PATCH 4/6] dm-multipath: remove process_queued_ios() Hannes Reinecke
