From: Neil Brown <neilb@suse.de>
To: device-mapper development <dm-devel@redhat.com>
Cc: lmb@novell.com
Subject: Re: RFC: multipath IO multiplex
Date: Sat, 6 Nov 2010 05:32:03 -0400
Message-ID: <20101106053203.7e4ef435@notabene>
In-Reply-To: <20101105183946.GG25992@suse.de>
On Fri, 5 Nov 2010 19:39:46 +0100
Lars Marowsky-Bree <lmb@novell.com> wrote:
> Hi all,
>
> this is a topic that came up during our HA miniconference at LPC. I
> inherited the action item to code this, but before coding it, I thought
> I'd get some validation on the design.
>
> In a cluster environment, we occasionally have time-critical IO - both
> reads and writes, for a mix of via-disk heartbeating, or the exchange of
> poison pills.
>
> MPIO plays hell with this, since an IO could potentially experience very
> high latency during a path switch. Extending the timeouts to allow for
> this is reasonably impractical.
>
> However, our IO has certain properties that make it special - we have
> rather careful patterns, they don't overlap, they are effectively single
> page/single atomic write unit, and each node effectively writes to its
> own area.
>
> So the idea would be, instead of relying on the active/passive access
> pattern, to send the IO down all paths in parallel - and report
> either the first success or the last failure.
Hi Lars,
the only issue that occurs to me is that if you want to report the first
success, then you need to copy the data to a private buffer before
submitting the write, and then wait for all writes to complete before
freeing the buffer. If you just return as soon as the first write
completes, the page would be unlocked and so could be changed while
another path was still writing it out.
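
As a rough user-space illustration of that ordering (not kernel code, and not a proposed dm-multipath interface), the sketch below copies the data into a private buffer, fans the write out to every path, reports the first success or the last failure, and releases the buffer only once every path has finished. The class name `MultiPathWrite` and the idea of modelling each path as a callable that raises `OSError` on failure are assumptions made purely for the example.

```python
import threading


class MultiPathWrite:
    """Hypothetical sketch: one write sent down several paths in parallel."""

    def __init__(self, data, paths):
        if not paths:
            raise ValueError("need at least one path")
        # Private copy, so the caller may modify `data` once success is reported.
        self._buf = bytes(data)
        self._paths = list(paths)      # callables: write_fn(buf), raise OSError on failure
        self._lock = threading.Lock()
        self._pending = len(self._paths)
        self._succeeded = False
        self._last_error = None
        self._reported = threading.Event()  # a result can be returned to the caller
        self.all_done = threading.Event()   # every path has completed

    def _run(self, write_fn):
        err = None
        try:
            write_fn(self._buf)
        except OSError as exc:
            err = exc
        with self._lock:
            self._pending -= 1
            if err is None:
                self._succeeded = True
                self._reported.set()        # first success is reported at once
            else:
                self._last_error = err
            if self._pending == 0:
                self._buf = None            # free only after all writes completed
                self._reported.set()        # covers the "every path failed" case
                self.all_done.set()

    def submit(self):
        for write_fn in self._paths:
            threading.Thread(target=self._run, args=(write_fn,)).start()

    def wait(self):
        """Return None on first success, or the last error if all paths failed."""
        self._reported.wait()
        with self._lock:
            return None if self._succeeded else self._last_error
```

The point of the private copy is that `wait()` can return while slower paths are still writing; without it, the caller could change the data underneath those in-flight writes.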

Finding a way to signal 'write all paths' sounds tricky. This flag needs to
be state of the file descriptor, not the whole device, so it would need to be
an fcntl rather than an ioctl. And defining new fcntls is a lot harder
because they need to be more generic - you cannot really make them device
specific...
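
To show what "state of the file descriptor" means in practice, here is a small sketch using an existing flag, `O_APPEND`, as a stand-in. It does not define any new fcntl; it only demonstrates that flags set with `F_SETFL` belong to one open of a file (strictly, the open file description) and are not seen by a second, independent open of the same file. The helper names and the use of a temporary file instead of a block device are assumptions for the example.

```python
import fcntl
import os
import tempfile


def set_flag(fd, flag):
    """Set a file status flag on this open only, via F_GETFL/F_SETFL."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | flag)


def has_flag(fd, flag):
    """Report whether the flag is set on this particular open."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & flag)


def demo():
    """Open the same file twice and set O_APPEND on the first open only."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd_a = os.open(tmp.name, os.O_WRONLY)
        fd_b = os.open(tmp.name, os.O_WRONLY)
        try:
            set_flag(fd_a, os.O_APPEND)
            return has_flag(fd_a, os.O_APPEND), has_flag(fd_b, os.O_APPEND)
        finally:
            os.close(fd_a)
            os.close(fd_b)
```

A 'write all paths' flag would need the same per-open scoping, so that one process asking for it does not change how other openers of the device are handled.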

Might it make sense to configure a range of the device where writes always
go down all paths? That would seem to fit with your problem description
and might be easiest.
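
A minimal sketch of how such a range check might look, assuming the ranges are configured as (start sector, length) pairs. The names `SectorRange` and `use_all_paths` are hypothetical, and treating a write that straddles a range boundary as an ordinary single-path write is just one possible policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectorRange:
    """Hypothetical configured region where writes go down all paths."""
    start: int    # first sector of the region
    length: int   # number of sectors in the region

    def contains(self, sector, nr_sectors):
        # True only if the whole write lies inside this region.
        return self.start <= sector and sector + nr_sectors <= self.start + self.length


def use_all_paths(ranges, sector, nr_sectors):
    """Decide whether a write should be multiplexed over every path."""
    return any(r.contains(sector, nr_sectors) for r in ranges)


# Example: a 64-sector heartbeat area starting at sector 2048.
heartbeat_area = [SectorRange(start=2048, length=64)]
```

With this configuration, `use_all_paths(heartbeat_area, 2048, 8)` would select all paths, while a write at sector 0 would follow the normal path selection.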
NeilBrown
>
> (Clearly, this only works for active/active arrays; active/passive
> setups still may have problems.)
>
> Doing this in user-space is somewhat icky; short of scanning the devices
> ourselves, or asking multipathd for each IO for the current list, we
> have no good way to do that. But the kernel obviously has the correct
> list at all times.
>
> So, I think a special IO flag for block IO (ioctl, open() flag on the
> device, whatever) that would cause dm-multipath to send the IO down all
> paths (and, as mentioned, report either the last failure or first
> success), seems to be the easiest way.
>
> How would you prefer such a flag to be implemented and passed in, and
> what do you think of the general use case?
>
>
> Regards,
> Lars
>