dm-devel.redhat.com archive mirror
* RFC: multipath IO multiplex
From: Lars Marowsky-Bree @ 2010-11-05 18:39 UTC
  To: dm-devel

Hi all,

this is a topic that came up during our HA miniconference at LPC. I
inherited the action item to code this, but before coding it, I thought
I'd get some validation on the design.

In a cluster environment, we occasionally have time-critical IO - both
reads and writes - for a mix of disk-based heartbeating and the exchange
of poison pills.

MPIO plays hell with this, since an IO can experience very high latency
during a path switch. Extending the timeouts to allow for this is not
really practical.

However, our IO has certain properties that make it special - the access
patterns are rather careful: the IOs don't overlap, each one is
effectively a single page/single atomic write unit, and each node
effectively writes to its own area.
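
To make that access pattern concrete, here is a minimal sketch of one
possible layout. The 512-byte slot size, the record fields, and the names
slot_offset() and build_record() are assumptions for illustration only;
they are not taken from any existing on-disk format.

```python
import struct

# Assumption: one 512-byte sector is the atomic write unit.
SECTOR_SIZE = 512

# Hypothetical record header: 32-bit node id, 64-bit sequence counter.
HEADER = struct.Struct("<IQ")


def slot_offset(node_id):
    """Byte offset of the slot owned by node_id; slots never overlap."""
    return node_id * SECTOR_SIZE


def build_record(node_id, sequence):
    """Build one heartbeat record, zero-padded to exactly one slot."""
    return HEADER.pack(node_id, sequence).ljust(SECTOR_SIZE, b"\0")
```

Each node only ever writes build_record(...) at slot_offset(its own id),
so two nodes never touch the same sector.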

So the idea would be, instead of relying on the active/passive access
pattern, to send the IO down all paths in parallel - and report either
the first success or the last failure.

(Clearly, this only works for active/active arrays; active/passive
setups still may have problems.)
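
The "first success or last failure" semantics can be sketched in user
space roughly as follows. This is only an illustration under stated
assumptions: the function name write_all_paths() is hypothetical, the
list of path devices is taken as given, and a real implementation on
block devices would likely want O_DIRECT with aligned buffers, which is
omitted here. It is not how dm-multipath does or would implement this.

```python
import concurrent.futures as cf
import os


def write_all_paths(paths, offset, data):
    """Write data at offset on every path in parallel.

    Returns the path whose write completed successfully first.
    If every path fails, raises the error of the last one to fail.
    """
    if not paths:
        raise ValueError("no paths given")

    def write_one(path):
        # O_DSYNC (where available) so success means the data reached the device.
        fd = os.open(path, os.O_WRONLY | getattr(os, "O_DSYNC", 0))
        try:
            if os.pwrite(fd, data, offset) != len(data):
                raise OSError("short write on " + path)
            return path
        finally:
            os.close(fd)

    pool = cf.ThreadPoolExecutor(max_workers=len(paths))
    futures = [pool.submit(write_one, p) for p in paths]
    last_error = None
    try:
        for fut in cf.as_completed(futures):
            try:
                return fut.result()  # first success wins
            except OSError as exc:
                last_error = exc  # remember the most recent failure
    finally:
        # Do not wait for slow paths; their writes finish in the background.
        pool.shutdown(wait=False)
    raise last_error
```

Note that the caller gets its answer as soon as one path succeeds, but
the writes on the slower paths are not cancelled; that is only safe
because the IO is a single, non-overlapping, idempotent write.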

Doing this in user-space is somewhat icky; short of scanning the devices
ourselves, or asking multipathd for the current path list on every IO,
we have no good way to find the paths. But the kernel obviously has the
correct list at all times.

So I think a special flag for block IO (an ioctl, an open() flag on the
device, whatever) that causes dm-multipath to send the IO down all paths
(and, as mentioned, report either the first success or the last
failure) would be the easiest way.

How would you prefer such a flag to be implemented and passed in, and
what do you think of the general use case?


Regards,
    Lars

-- 
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

Thread overview: 12+ messages
2010-11-05 18:39 RFC: multipath IO multiplex Lars Marowsky-Bree
2010-11-06  9:32 ` Neil Brown
2010-11-06 11:51   ` Alasdair G Kergon
2010-11-06 16:57     ` Lars Marowsky-Bree
2010-11-07 10:30       ` Christophe Varoqui
2010-11-08 11:50         ` Lars Marowsky-Bree
2010-11-08 12:12           ` Alasdair G Kergon
2010-11-08 12:19             ` Lars Marowsky-Bree
2010-11-08 12:42               ` Hannes Reinecke
2010-11-08 12:56               ` Alasdair G Kergon
2010-11-08 14:18                 ` Lars Marowsky-Bree
2010-11-06 17:03   ` Lars Marowsky-Bree
