From: Hannes Reinecke <hare@suse.de>
To: device-mapper development <dm-devel@redhat.com>
Cc: Junichi Nomura <j-nomura@ce.jp.nec.com>,
Steffen Maier <maier@linux.vnet.ibm.com>,
Frank Mayhar <fmayhar@google.com>,
Alasdair Kergon <agk@redhat.com>,
Mike Snitzer <snitzer@redhat.com>
Subject: Re: [PATCH] dm-mpath: push back requests instead of queueing
Date: Tue, 12 Nov 2013 09:49:16 +0100 [thread overview]
Message-ID: <5281EB8C.9010800@suse.de> (raw)
In-Reply-To: <5281E403.5020005@suse.de>
On 11/12/2013 09:17 AM, Hannes Reinecke wrote:
> On 11/12/2013 08:48 AM, Junichi Nomura wrote:
>> On 11/08/13 18:02, Hannes Reinecke wrote:
>>> There is no reason why multipath needs to queue requests
>>> internally for queue_if_no_path or pg_init; we should
>>> rather push them back onto the request queue.
>>>
>>>
>>> This patch removes the internal queueing mechanism from dm-multipath.
>>
>> Hi Hannes,
>> thanks for working on this.
>>
>>> In doing so we can also remove the ->busy check as a requeue is identical
>>> to ->busy returning 'true' from the callers perspective. This simplifies
>>> the code by quite a lot.
>>
>> They are not identical.
>> ->busy returns true when underlying path cannot dispatch a request
>> immediately. In that case it's better to keep the request in queue
>> to allow merges. (It used to have real impact on buffered sequential
>> write + fsync workload, though the performance impact might become
>> smaller in recent kernels due to plugging change.)
>> Also ->busy can be used by upper layer (dm_table_any_busy_target).
>> So you can't just remove it.
>>
> But they are identical from the callers perspective:
> drivers/md/dm.c:dm_request_fn()
>
> if (ti->type->busy && ti->type->busy(ti))
> goto delay_and_out;
>
> clone = dm_start_request(md, rq);
>
> spin_unlock(q->queue_lock);
> if (map_request(ti, clone, md))
> goto requeued;
>
> BUG_ON(!irqs_disabled());
> spin_lock(q->queue_lock);
> }
>
> goto out;
>
> requeued:
> BUG_ON(!irqs_disabled());
> spin_lock(q->queue_lock);
>
> delay_and_out:
> blk_delay_queue(q, HZ / 10);
>
> So the only difference between ->busy and return 'DM_MAPIO_REQUEUE'
> is that 'busy' runs under the queue_lock.
>
> And yes, in theory ->busy might be used from any upper layer; but
> so far the only caller I've found _is_ in dm_request_fn().
> So removing doesn't do any harm.
>
> Unless I've misread something ...
>
>>> @@ -385,21 +378,15 @@ static int map_io(struct multipath *m, struct request *clone,
>>> (!m->queue_io && (m->repeat_count && --m->repeat_count == 0)))
>>> __choose_pgpath(m, nr_bytes);
>>>
>>> - pgpath = m->current_pgpath;
>>> -
>>> - if (was_queued)
>>> - m->queue_size--;
>>> + if (m->current_pgpath &&
>>> + m->pg_init_required && !m->pg_init_in_progress)
>>> + __pg_init_all_paths(m);
>>>
>>> + pgpath = m->current_pgpath;
>>> if ((pgpath && m->queue_io) ||
>>> (!pgpath && m->queue_if_no_path)) {
>>> - /* Queue for the daemon to resubmit */
>>> - list_add_tail(&clone->queuelist, &m->queued_ios);
>>> - m->queue_size++;
>>> - if ((m->pg_init_required && !m->pg_init_in_progress) ||
>>> - !m->queue_io)
>>> - queue_work(kmultipathd, &m->process_queued_ios);
>>> pgpath = NULL;
>>> - r = DM_MAPIO_SUBMITTED;
>>> + r = DM_MAPIO_REQUEUE;
>>
>> if/else sequence in map_io() might be easier to read if we do:
>>
>> if (pgpath) {
>> if (pg_ready(m)) {
>> ... // remap
>> r = DM_MAPIO_REMAPPED;
>> } else {
>> __pg_init_all_paths(m);
>> r = DM_MAPIO_REQUEUE;
>> }
>> } else { /* no path */
>> if (need_requeue(m))
>> r = DM_MAPIO_REQUEUE;
>> else
>> r = -EIO;
>> }
>>
>> where:
>> #define pg_ready(m) (!m->queue_io) /* or rename 'queue_io' */
>> #define need_requeue(m) (m->queue_if_no_path || __must_push_back(m))
>>
>> and move pg_init_required, etc. checks to __pg_init_all_paths().
>>
> True. Will be doing that.
>
>>> @@ -1241,7 +1168,8 @@ static void pg_init_done(void *data, int errors)
>>> m->queue_io = 0;
>>>
>>> m->pg_init_delay_retry = delay_retry;
>>> - queue_work(kmultipathd, &m->process_queued_ios);
>>> + if (!m->queue_io)
>>> + dm_table_run_queue(m->ti->table);
>>>
>>> /*
>>> * Wake up any thread waiting to suspend.
>>
>> process_queued_ios was responsible for retrying pg_init.
>> And when retrying, m->queue_io is still 0.
>> So don't we have to run queue unconditionally here
>> or call __pg_init_all_paths() directly?
>>
> In my rework I've _tried_ to separate both functions from
> process_queued_ios().
> But yes, you are right; I haven't considered pg_init_retry.
> Will be updating the patch.
>
Actually, I have. I just had a closer look at the patch,
and pg_init retry is handled, albeit differently than in
the original.
It now works like this:
->map_io() is called
-> calls '__switch_pg', which sets 'queue_io'
-> calls __pg_init_all_paths, which pushes activate_path
onto a workqueue
-> returns 'MAPIO_REQUEUE'
-> pg_init_done()
-> Checks pg_init_required
-> if non-zero some other I/O already
kicked off an 'activate_path',
so nothing to be done from our side
-> if zero we're calling kicking the queue
via blk_run_queue
And blk_run_queue() will be calling into ->request_fn,
which will pull requests off the queue.
So on the next request we're calling 'map_io', so the
entire game starts anew, retrying pg_init.
The only thing we're not handling properly is the
'pg_init_delay_retry', as for that we should've
started the queue with a certain delay, which
we currently don't. But that's easily fixable.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
next prev parent reply other threads:[~2013-11-12 8:49 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-08 9:02 [PATCH] dm-mpath: push back requests instead of queueing Hannes Reinecke
2013-11-12 7:48 ` Junichi Nomura
2013-11-12 8:17 ` Hannes Reinecke
2013-11-12 8:43 ` Junichi Nomura
2013-11-12 9:05 ` Hannes Reinecke
2013-11-12 10:00 ` Junichi Nomura
2013-11-12 10:17 ` Hannes Reinecke
2013-11-12 10:25 ` Junichi Nomura
2013-11-12 8:49 ` Hannes Reinecke [this message]
2013-11-12 10:09 ` Junichi Nomura
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5281EB8C.9010800@suse.de \
--to=hare@suse.de \
--cc=agk@redhat.com \
--cc=dm-devel@redhat.com \
--cc=fmayhar@google.com \
--cc=j-nomura@ce.jp.nec.com \
--cc=maier@linux.vnet.ibm.com \
--cc=snitzer@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.