From: Hannes Reinecke <hare@suse.de>
To: dm-devel@redhat.com
Cc: Frank Mayhar <fmayhar@google.com>, Mike Snitzer <snitzer@redhat.com>
Subject: Re: RFC for multipath queue_if_no_path timeout.
Date: Fri, 27 Sep 2013 10:06:14 +0200 [thread overview]
Message-ID: <52453C76.6060403@suse.de> (raw)
In-Reply-To: <524520B0.6010003@suse.de>
[-- Attachment #1: Type: text/plain, Size: 2403 bytes --]
On 09/27/2013 08:07 AM, Hannes Reinecke wrote:
> On 09/27/2013 01:49 AM, Mike Snitzer wrote:
>> On Thu, Sep 26 2013 at 7:22pm -0400,
>> Alasdair G Kergon <agk@redhat.com> wrote:
>>
>>> On Thu, Sep 26, 2013 at 10:47:13AM -0700, Frank Mayhar wrote:
>>>> Launching it from ramdisk won't help, particularly, since it still goes
>>>> through the block layer. The other stuff won't help if a (potentially
>>>> unrelated) bug in the daemon happens to be being tickled at the same
>>>> time, or if some dependency happens to be broken and _that's_ what's
>>>> preventing the daemon from making progress.
>>>
>>> Then put more effort into debugging your daemon so it doesn't have
>>> bugs that make it die? Implement the timeout in a robust independent
>>> daemon if it's other code there that's unreliable?
>>>
>>>> And as far as lvm2 and multipath-tools, yeah, they cope okay in the kind
>>>> of environments most people have, but that's not the kind of environment
>>>> (or scale) we have to deal with.
>>>
>>> In what way are your requirements so different that a locked-into-memory
>>> monitoring daemon cannot implement this timeout?
>>
>> Frank, I had a look at your patch. It leaves a lot to be desired, I was
>> starting to clean it up but ultimately found myself agreeing with
>> Alasdair's original point: that this policy should be implemented in the
>> userspace daemon.
>>
> _Actually_ there is a way how this could be implemented properly:
> implement a blk_timeout function.
>
> Thing is, every request_queue might have a timeout function
> implemented, whose goal is to abort requests which are beyond that
> timeout. EG SCSI uses that for the dev_loss_tmo mechanism.
>
> Multipath what with it being request-based could easily implement
> the same mechanism, namely have to blk_timeout function which would
> just re-arm the timeout in the default case, but abort any queued
> I/O (after a timeout) if all paths are down.
>
> Hmm. I see to draft up a PoC.
>
And indeed, here it is.
Completely untested, just to give you an idea what I was going on
about. Let's see if I can put this to test somewhere...
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
[-- Attachment #2: dm-mpath-no-path-timeout.patch --]
[-- Type: text/x-patch, Size: 4589 bytes --]
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 5adede1..6801ac3 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -444,6 +444,61 @@ static int queue_if_no_path(struct multipath *m, unsigned queue_if_no_path,
return 0;
}
+/*
+ * Block timeout callback, called from the block layer
+ *
+ * request_queue lock is held on entry.
+ *
+ * Return values:
+ * BLK_EH_RESET_TIMER if the request should be left running
+ * BLK_EH_NOT_HANDLED if the request is handled or terminated
+ * by the driver.
+ */
+enum blk_eh_timer_return abort_if_no_path(struct request *req)
+{
+ union map_info *info;
+ struct dm_mpath_io *mpio;
+ struct multipath *m;
+ unsigned long flags;
+ int rc = BLK_EH_RESET_TIMER;
+ int flush_ios = 0;
+
+ info = dm_get_rq_mapinfo(req);
+ if (!info || !info->ptr)
+ return BLK_EH_NOT_HANDLED;
+
+ mpio = info->ptr;
+ m = mpio->pgpath->pg->m;
+ /*
+ * Only abort request if:
+ * - queued_ios is not empty
+ * (protect against races with process_queued_ios)
+ * - queue_io is not set
+ * - no valid paths are found
+ */
+ spin_lock_irqsave(&m->lock, flags);
+ if (!list_empty(&m->queued_ios) &&
+ !m->queue_io &&
+ !m->nr_valid_paths) {
+ list_del_init(&req->queuelist);
+ m->queue_size--;
+ m->queue_if_no_path = 0;
+ if (m->queue_size)
+ flush_ios = 1;
+ rc = BLK_EH_NOT_HANDLED;
+ }
+ spin_unlock_irqrestore(&m->lock, flags);
+
+ if (rc == BLK_EH_NOT_HANDLED) {
+ mempool_free(mpio, m->mpio_pool);
+ dm_kill_unmapped_request(clone, -ETIMEOUT);
+ }
+ if (flush_ios)
+ queue_work(kmultipathd, &m->process_queue_ios);
+
+ return rc;
+}
+
/*-----------------------------------------------------------------
* The multipath daemon is responsible for resubmitting queued ios.
*---------------------------------------------------------------*/
@@ -790,6 +845,7 @@ static int parse_features(struct dm_arg_set *as, struct multipath *m)
{0, 6, "invalid number of feature args"},
{1, 50, "pg_init_retries must be between 1 and 50"},
{0, 60000, "pg_init_delay_msecs must be between 0 and 60000"},
+ {0, 65535, "no_path_timeout must be between 0 and 65535"},
};
r = dm_read_arg_group(_args, as, &argc, &ti->error);
@@ -827,6 +883,13 @@ static int parse_features(struct dm_arg_set *as, struct multipath *m)
continue;
}
+ if (!strcasecmp(arg_name, "no_path_timeout") &&
+ (argc >= 1)) {
+ r = dm_read_arg(_args + 3, as, &m->no_path_timeout, &ti->error);
+ argc--;
+ continue;
+ }
+
ti->error = "Unrecognised multipath feature request";
r = -EINVAL;
} while (argc && !r);
@@ -905,6 +968,12 @@ static int multipath_ctr(struct dm_target *ti, unsigned int argc,
goto bad;
}
+ if (m->no_path_timeout)
+ dm_set_queue_timeout(ti, abort_if_no_path,
+ m->no_path_timeout * HZ);
+ else
+ dm_set_queue_timeout(ti, NULL, 0)
+
ti->num_flush_bios = 1;
ti->num_discard_bios = 1;
ti->num_write_same_bios = 1;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 9e39d2b..26bfad6 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1881,6 +1881,16 @@ static void dm_init_md_queue(struct mapped_device *md)
blk_queue_merge_bvec(md->queue, dm_merge_bvec);
}
+static void dm_set_md_timeout(struct mapped_device *md,
+ rq_timed_out_fn *fn, unsigned int timeout)
+{
+ if (dm_get_md_type(md) != DM_TYPE_REQUEST_BASED)
+ return;
+
+ blk_queue_rq_timed_out(md->queue, fn);
+ blk_queue_rq_timeout(md->queue, timeout);
+}
+
/*
* Allocate and initialise a blank device with a given minor.
*/
@@ -2790,6 +2800,13 @@ int dm_noflush_suspending(struct dm_target *ti)
}
EXPORT_SYMBOL_GPL(dm_noflush_suspending);
+int dm_set_queue_timeout(struct dm_target *ti, rq_timed_out_fn *fn,
+ unsigned int timeout)
+{
+ return dm_set_md_timeout(dm_table_get_md(ti->table), fn, timeout);
+}
+EXPORT_SYMBOL_GPL(dm_set_queue_timeout);
+
struct dm_md_mempools *dm_alloc_md_mempools(unsigned type, unsigned integrity, unsigned per_bio_data_size)
{
struct dm_md_mempools *pools = kzalloc(sizeof(*pools), GFP_KERNEL);
diff --git a/drivers/md/dm.h b/drivers/md/dm.h
index 45b97da..c8df1ef 100644
--- a/drivers/md/dm.h
+++ b/drivers/md/dm.h
@@ -80,6 +80,8 @@ void dm_unlock_md_type(struct mapped_device *md);
void dm_set_md_type(struct mapped_device *md, unsigned type);
unsigned dm_get_md_type(struct mapped_device *md);
struct target_type *dm_get_immutable_target_type(struct mapped_device *md);
+void dm_set_queue_timeout(struct dm_table *t, rq_timed_out_fn *fn,
+ unsigned int timeout);
int dm_setup_md_queue(struct mapped_device *md);
[-- Attachment #3: Type: text/plain, Size: 0 bytes --]
next prev parent reply other threads:[~2013-09-27 8:06 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-26 17:14 RFC for multipath queue_if_no_path timeout Frank Mayhar
2013-09-26 17:24 ` Alasdair G Kergon
2013-09-26 17:31 ` Frank Mayhar
2013-09-26 17:38 ` Alasdair G Kergon
2013-09-26 17:47 ` Frank Mayhar
2013-09-26 17:52 ` Mike Snitzer
2013-09-26 20:36 ` [PATCH 1/1] dm mpath: Add timeout mechanism for queue_if_no_path Frank Mayhar
2013-09-26 23:22 ` RFC for multipath queue_if_no_path timeout Alasdair G Kergon
2013-09-26 23:49 ` Mike Snitzer
2013-09-27 6:07 ` Hannes Reinecke
2013-09-27 8:06 ` Hannes Reinecke [this message]
2013-09-27 8:37 ` Alasdair G Kergon
2013-09-27 13:52 ` Hannes Reinecke
2013-09-27 16:37 ` Frank Mayhar
2013-09-27 16:32 ` Frank Mayhar
2013-09-27 16:29 ` Frank Mayhar
2013-10-17 19:03 ` Frank Mayhar
2013-10-17 19:15 ` Mike Snitzer
2013-10-17 20:45 ` Frank Mayhar
2013-10-17 21:13 ` Mike Snitzer
2013-10-18 20:51 ` Frank Mayhar
2013-10-18 21:47 ` Alasdair G Kergon
2013-10-18 22:53 ` [RFC PATCH v2] dm mpath: add a " Mike Snitzer
2013-10-30 1:02 ` Mike Snitzer
2013-10-30 15:08 ` Frank Mayhar
2013-10-30 15:43 ` Mike Snitzer
2013-10-30 18:09 ` Frank Mayhar
2013-10-31 9:36 ` Junichi Nomura
2013-10-31 14:16 ` Frank Mayhar
2013-10-31 14:31 ` Alasdair G Kergon
2013-10-31 17:17 ` Frank Mayhar
2013-11-01 1:23 ` Junichi Nomura
2013-11-01 1:58 ` Junichi Nomura
2013-11-01 4:17 ` Junichi Nomura
2013-11-05 15:18 ` Frank Mayhar
2013-11-05 16:02 ` [RFC PATCH v3] " Frank Mayhar
2013-11-05 16:53 ` Mike Snitzer
2013-11-06 6:54 ` Hannes Reinecke
2013-11-06 15:43 ` Frank Mayhar
2013-11-06 19:21 ` Mike Snitzer
2013-11-07 1:03 ` Junichi Nomura
2013-10-31 14:59 ` [RFC PATCH v2] " Hannes Reinecke
2013-10-21 16:05 ` RFC for multipath " Benjamin Marzinski
2013-10-21 16:17 ` Frank Mayhar
2013-10-23 13:39 ` Benjamin Marzinski
2013-09-27 16:27 ` Frank Mayhar
2013-09-26 17:41 ` Mike Snitzer
2013-09-26 17:55 ` Frank Mayhar
2013-09-26 18:41 ` Mike Snitzer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=52453C76.6060403@suse.de \
--to=hare@suse.de \
--cc=dm-devel@redhat.com \
--cc=fmayhar@google.com \
--cc=snitzer@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.