From: "Stuart D. Gathman" <stuart@bmsi.com>
To: LVM general discussion and development <linux-lvm@redhat.com>
Cc: Phillip Susi <psusi@cfl.rr.com>
Subject: Re: [linux-lvm] LVM + raid + san
Date: Tue, 9 Nov 2010 17:15:32 -0500 (EST)
Message-ID: <Pine.LNX.4.64.1011091705030.27920@bmsred.bmsi.com>
In-Reply-To: <Pine.LNX.4.64.1011071710110.20225@bmsred.bmsi.com>
On Sun, 7 Nov 2010, Stuart D. Gathman wrote:
> You'd have some serious locking issues. With RAID5, each server would have to
> lock each chunk before writing to it (which involves a read/modify/write
> cycle). This would create serious overhead. And you were complaining about
> SAN server overhead! :-) RAID5 *must* be centralized. Your scheme might work
> with RAID10, but then you'd still have to ensure that writes to mirror legs
> don't get out of order, with updates from multiple servers flying over the
> wire.
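(To make the read/modify/write point above concrete, this is the parity
arithmetic a partial-stripe RAID5 write has to perform; a sketch only, the
names and chunk size are invented and not from any real driver:)

/* Illustrative only: the arithmetic behind the RAID5 read/modify/write
 * cycle.  A partial-stripe write must read the old data and old parity
 * before it can compute the new parity, which is why the whole cycle
 * needs the chunk lock.  CHUNK_SIZE and the name are invented here. */
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 4096

/* new_parity = old_parity XOR old_data XOR new_data */
void raid5_rmw_parity(const uint8_t *old_data, const uint8_t *old_parity,
		      const uint8_t *new_data, uint8_t *new_parity)
{
	size_t i;

	for (i = 0; i < CHUNK_SIZE; i++)
		new_parity[i] = old_parity[i] ^ old_data[i] ^ new_data[i];
}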
Actually, this would be an interesting driver to develop. If each
server primarily works on its own LV, then there shouldn't be much
lock contention. You would need a lock manager, and a special network
RAID driver that uses the lock manager to coordinate updates.
Each server would hold the lock for a chunk until another server needs it.
With the assumed mostly independent access, that should be rare, and the
locking should be optimized with that in mind: if you already hold the
lock, just go ahead and update; if not, notify the holder via the lock
manager and wait until you do hold it. A rough sketch of that fast path
follows.
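Here is a hypothetical user-space sketch of that policy. The lock table,
function names, and simulated manager round trip are all invented for
illustration; a real driver would talk to a cluster lock manager such as
the DLM:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 1024
typedef uint64_t chunk_t;

/* Chunks this node currently holds.  Toy direct-mapped table; a real
 * driver would use a proper hash table and handle collisions. */
static bool held[TABLE_SIZE];

static bool lock_held_locally(chunk_t chunk)
{
	return held[chunk % TABLE_SIZE];
}

/* Stand-in for the lock-manager round trip: the manager notifies the
 * current holder, which flushes and releases; then we get the grant. */
static void lock_mgr_acquire(chunk_t chunk)
{
	printf("asking lock manager for chunk %llu\n",
	       (unsigned long long)chunk);
	held[chunk % TABLE_SIZE] = true;
}

/* Fast path first: with mostly independent access each server usually
 * already holds the lock and pays no network round trip at all. */
static void net_raid_write(chunk_t chunk, const void *buf, size_t len)
{
	if (!lock_held_locally(chunk))
		lock_mgr_acquire(chunk);	/* slow path: contended chunk */

	/* ... submit the actual striped/mirrored write here ... */
	(void)buf; (void)len;

	/* Keep the lock after the write until another node asks for it. */
}

int main(void)
{
	char data[512] = { 0 };

	net_raid_write(12345678, data, sizeof data);	/* slow path */
	net_raid_write(12345678, data, sizeof data);	/* fast path */
	return 0;
}

The point is that the lock-manager round trip is only paid when a chunk
actually migrates between servers; steady-state writes to a server's own
LV stay on the fast path.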
You could probably avoid the lock manager entirely by using a
broadcast-based protocol ("Who has chunk 12345678?"), sketched below.
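Equally hypothetical, the query side of such a broadcast might look like
this on the wire; the port, magic value, and message layout are invented,
and htobe64() is glibc/Linux-specific:

#include <arpa/inet.h>
#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

#define WHO_HAS_PORT 9876		/* invented for this sketch */

struct who_has_msg {
	uint32_t magic;			/* protocol identifier */
	uint64_t chunk;			/* chunk being asked about */
} __attribute__((packed));

/* Broadcast "who has this chunk?"; the current holder, if any, would
 * reply directly and hand the lock over.  Error handling is trimmed. */
static int broadcast_who_has(uint64_t chunk)
{
	struct who_has_msg msg = { htonl(0x57484153) /* "WHAS" */,
				   htobe64(chunk) };
	struct sockaddr_in dst = {
		.sin_family = AF_INET,
		.sin_port = htons(WHO_HAS_PORT),
		.sin_addr.s_addr = htonl(INADDR_BROADCAST),
	};
	int yes = 1;
	int s = socket(AF_INET, SOCK_DGRAM, 0);

	if (s < 0)
		return -1;
	setsockopt(s, SOL_SOCKET, SO_BROADCAST, &yes, sizeof yes);
	if (sendto(s, &msg, sizeof msg, 0,
		   (struct sockaddr *)&dst, sizeof dst) < 0)
		perror("sendto");
	close(s);
	return 0;
}

int main(void)
{
	return broadcast_who_has(12345678);
}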
Oh wait, this is an LVM list...
Is anything like this contemplated for device-mapper? There is already
locking involved with shared VGs on a traditional SAN.
It is an interesting idea to avoid a traditional SAN as a single point of
failure (the switch connecting the hosts and disks would still be a single
point of failure, but a switch is simpler than a SAN server). All hosts would
have to be trusted.
--
Stuart D. Gathman <stuart@bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.
Thread overview: 9+ messages
2010-11-05 1:26 [linux-lvm] LVM + raid + san Phillip Susi
2010-11-05 4:39 ` Stuart D. Gathman
2010-11-07 0:51 ` Phillip Susi
2010-11-07 3:38 ` Eugene Vilensky
2010-11-07 4:03 ` allan
2010-11-07 19:55 ` Phillip Susi
2010-11-07 22:27 ` Stuart D. Gathman
2010-11-09 22:15 ` Stuart D. Gathman [this message]
2010-11-10 0:21 ` Phillip Susi