From: Philipp Reisner <philipp.reisner@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] How Locking in GFS works...
Date: Tue, 5 Oct 2004 21:37:27 +0200
Message-ID: <200410052137.27492.philipp.reisner@linbit.com>
In-Reply-To: <200410041617.21258.philipp.reisner@linbit.com>
Hi!
Please also look at the nice PDF!
> > > "Oh my, this is dirty locally too and unacked. We better arbitrate now;
> > > i.e. one side wins and the other one is silently discarded."
9 Support shared disk semantics (for GFS, OCFS, etc.)
All the thoughts in this area assume that the cluster deals
with split-brain situations as discussed in item 6.
In order to offer a shared disk mode for GFS, we allow both
nodes to become primary. (This needs to be enabled with the
config statement net { allow-two-primaries; } )
Read after write dependencies
The shared state is available to clusters using protocol C
and B. It is not usable with protocol A.
To support the shared state with protocol B, upon a read
request the node has to check whether a new version of the block
is in the process of being written. (== search for it on
active_ee and done_ee. [ It is already on active_ee before the
RecvAck is sent. ] )
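The lookup described above could be sketched roughly as follows. This is
an illustration in Python, not DRBD code: active_ee and done_ee are
modelled as plain lists, and EpochEntry and find_pending_write are
hypothetical names. The text only says that the node has to check; what
it does on a hit is left open here, and preferring active_ee over
done_ee is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EpochEntry:
    """Hypothetical stand-in for a write request received from the peer."""
    sector: int
    data: bytes


def find_pending_write(sector: int,
                       active_ee: List[EpochEntry],
                       done_ee: List[EpochEntry]) -> Optional[EpochEntry]:
    """Return an in-flight write to `sector`, or None if there is none.

    Assumption: entries still on active_ee are newer than entries on
    done_ee, and later list positions are newer than earlier ones.
    """
    for queue in (active_ee, done_ee):
        for entry in reversed(queue):
            if entry.sector == sector:
                return entry
    return None


# Example: sector 8 has a write in flight, sector 24 does not.
active = [EpochEntry(8, b"new")]
done = [EpochEntry(8, b"old"), EpochEntry(16, b"x")]
hit = find_pending_write(8, active, done)
miss = find_pending_write(24, active, done)
```

A read request would run this check before going to the local disk; if
it returns an entry, the block on disk may be stale.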
Global write order
The major pitfall is the handling of concurrent writes to the
same block. (Concurrent writes to the same block should not
happen, but we have to assume that the synchronisation
methods of our upper layer [i.e. openGFS] may fail.)
Without further handling, concurrent writes to the same block
would get written on each node locally first, then sent
to the peer, and then overwrite the local version on the peer.
In other words, each node would write its local version first
and then the peer's version of the data, so the two nodes
would end up with different contents for that block.
Both nodes need to agree on _one_ order in which such
conflicting writes are carried out.
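To make the problem concrete, here is a toy simulation in Python of two
nodes writing the same block without any arbitration. It is only a
sketch: the dict-backed "disks" and the function name
run_without_arbitration are made up for illustration and have nothing to
do with DRBD's internals.

```python
from typing import Dict, Tuple


def run_without_arbitration(local_a: bytes, local_b: bytes) -> Tuple[bytes, bytes]:
    """Both nodes write block 0 concurrently; return the final contents
    of that block on node A and on node B."""
    block = 0
    disk_a: Dict[int, bytes] = {}
    disk_b: Dict[int, bytes] = {}

    # Step 1: each node writes its own version locally first.
    disk_a[block] = local_a
    disk_b[block] = local_b

    # Step 2: each node receives the peer's version and overwrites
    # what it just wrote.
    disk_a[block] = local_b
    disk_b[block] = local_a

    return disk_a[block], disk_b[block]


final_a, final_b = run_without_arbitration(b"A", b"B")
```

Node A ends up holding B's data and node B ends up holding A's data, so
the two copies of the block have silently diverged.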
Proposed Solution
We arbitrarily select one node (e.g. the node that did the first
accept() in the drbd_connect() function) and mark it with the
discard-concurrent-write-flag.
The following algorithm is performed upon reception of a
data packet:
1. Do we have a concurrent request? (i.e. do I have a request
to the same block in my transfer log?) If not -> write now.
2. Have I already got an ACK packet for the concurrent
request? (Does the request already have the RQ_DRBD_SENT bit set?)
If yes -> write the data from the data packet afterwards.
3. Do I have the "discard-concurrent-write-flag"?
If yes -> discard the data packet and send a discard notify.
If no -> write the data from the data packet afterwards.
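The three steps above can be sketched as a small decision function. This
is a Python illustration under stated assumptions, not the DRBD
implementation: Request, Action and on_data_packet are hypothetical
names, the transfer log is a plain list, and the boolean field acked
stands in for the RQ_DRBD_SENT bit.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    WRITE_NOW = "write now"
    WRITE_AFTER = "write after the local request"
    DISCARD = "discard the packet and send a discard notify"


@dataclass
class Request:
    """Hypothetical transfer-log entry for a locally issued write."""
    sector: int
    acked: bool = False  # stands in for the RQ_DRBD_SENT bit


def on_data_packet(sector: int,
                   transfer_log: List[Request],
                   discard_flag: bool) -> Action:
    """Decide what to do with a data packet for `sector`."""
    # Step 1: is there a concurrent local request to the same block?
    concurrent = next((r for r in transfer_log if r.sector == sector), None)
    if concurrent is None:
        return Action.WRITE_NOW
    # Step 2: has the peer already acknowledged our request?
    if concurrent.acked:
        return Action.WRITE_AFTER
    # Step 3: only the node holding the flag discards the peer's data.
    if discard_flag:
        return Action.DISCARD
    return Action.WRITE_AFTER


# Both nodes have an unacknowledged local write to sector 8.
log = [Request(sector=8)]
flagged_node = on_data_packet(8, log, discard_flag=True)
other_node = on_data_packet(8, log, discard_flag=False)
```

In this case the flagged node keeps its own data and drops the peer's
packet, while the other node writes the peer's data after its own, so
both nodes end up with the flagged node's version of the block.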
BTW, each time we have a concurrent write access, we print
a warning to the syslog, since this indicates that the layer
above us is broken!
[ see also GFS-mode-arbitration.pdf for illustration. ]
[-- Attachment #2: GFS-mode-options.pdf --]
[-- Type: application/pdf, Size: 9808 bytes --]