From: Lars Marowsky-Bree <lmb@novell.com>
To: Neil Brown <neilb@suse.de>,
device-mapper development <dm-devel@redhat.com>
Subject: Re: RFC: multipath IO multiplex
Date: Sat, 6 Nov 2010 17:57:43 +0100
Message-ID: <20101106165743.GE30809@suse.de>
In-Reply-To: <20101106115102.GF10171@agk-dp.fab.redhat.com>
On 2010-11-06T11:51:02, Alasdair G Kergon <agk@redhat.com> wrote:
Hi Neil, Alasdair,
thanks for the feedback. Answering your points in reverse order -
> > Might it make sense to configure a range of the device where writes always
> > went down all paths? That would seem to fit with your problem description
> > and might be easiest??
> Indeed - a persistent property of the device (even another interface with a
> different minor number) not the I/O.
I'm not so sure that would be required though. The equivalent of our
"mkfs" tool wouldn't need this. Also, typically, this would be a
partition (kpartx) on top of a regular MPIO mapping (that we want to be
managed by multipathd).
Handling this completely differently would complicate setup, no?
> And what is the nature of the data being written, given that I/O to one path
> might get delayed and arrive long after it was sent, overwriting data
> sent later. Successful stale writes will always be recognised as such
> by readers - how?
The very particular use case I am thinking of is the "poison pill" for
node-level fencing. Nodes constantly monitor their slot (using direct
IO, bypassing all caching, etc), and either can successfully read it or
commit suicide (assisted by a hardware watchdog to protect against
stalls).
The writer knows that, once the message has been successfully written,
the target node will either have read it (and committed suicide), or
been self-fenced because of a timeout/read error.
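A minimal sketch of the slot check described above, for illustration
only. The slot layout (one 512-byte slot per node), the POISON marker and
all function names are assumptions, not part of any existing tool. A real
implementation would open the device with O_DIRECT and arm a hardware
watchdog; both are omitted here so the sketch runs against a plain file.

```python
import os

SLOT_SIZE = 512      # assumption: one sector-sized slot per node
POISON = b"POISON"   # assumption: hypothetical message marker


def read_slot(fd, node_id):
    """Read one node's slot; raise OSError on any I/O problem."""
    data = os.pread(fd, SLOT_SIZE, node_id * SLOT_SIZE)
    if len(data) != SLOT_SIZE:
        raise OSError("short read from slot")
    return data


def check_slot(fd, node_id):
    """Return 'ok' if the slot is readable and clean, else 'self-fence'."""
    try:
        data = read_slot(fd, node_id)
    except OSError:
        # A node that cannot read its slot must assume the worst.
        return "self-fence"
    if data.startswith(POISON):
        return "self-fence"
    return "ok"


def write_poison(fd, node_id):
    """Writer side: place the poison pill in the target node's slot."""
    payload = POISON.ljust(SLOT_SIZE, b"\0")
    os.pwrite(fd, payload, node_id * SLOT_SIZE)
    os.fsync(fd)  # stand-in for direct IO in this sketch
```

The monitored node would call check_slot() in a loop and only pet the
watchdog while the result is 'ok', so a stalled read ends in a reset as
well.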
Allowing for the additional timeouts incurred by MPIO here really slows
this mechanism down to the point of being unusable.
Now, even if a write were delayed (which is not very likely; it's more
likely that some of the IO will simply fail if one of the paths happens
to go down, and it would not be resubmitted to other paths), the worst
that could happen would be a double fence. (That requires the write to
land after the node has cycled once and cleared its message slot, which
would already imply a significant delay, since servers take a while to
boot.)
For the 'heartbeat' mechanism and others (if/when we get around to
adding them), we could ignore the exact contents that have been written
and just watch for changes; at worst, node death detection will take a
bit longer.
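The "just watch for changes" idea could look roughly like the sketch
below. The class name and the max_unchanged threshold are hypothetical;
the point is only that the contents are never interpreted, so a delayed
stale write merely counts as one more change and postpones detection.

```python
class ChangeWatcher:
    """Treat a peer as alive as long as its slot keeps changing."""

    def __init__(self, max_unchanged):
        # assumption: number of identical polls tolerated before
        # the peer is declared dead
        self.max_unchanged = max_unchanged
        self.last = None
        self.unchanged = 0

    def observe(self, data):
        """Feed one poll result; return True while the peer counts as alive."""
        if data != self.last:
            # Any change resets the counter, whatever the contents are.
            self.last = data
            self.unchanged = 0
        else:
            self.unchanged += 1
        return self.unchanged < self.max_unchanged
```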
Basically, the thing we need to get around is the possible IO latency in
MPIO, for things like poison pill fencing ("storage-based death") or
qdisk-style plugins. I'm open to other suggestions as well.
Regards,
Lars
--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde