From: Philipp Reisner <philipp.reisner@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] How Locking in GFS works...
Date: Tue, 5 Oct 2004 21:37:27 +0200
Message-ID: <200410052137.27492.philipp.reisner@linbit.com>
In-Reply-To: <200410041617.21258.philipp.reisner@linbit.com>

[-- Attachment #1: Type: text/plain, Size: 2586 bytes --]


Hi!

Please also look at the nice PDF!

> > > "Oh my, this is dirty locally too and unacked. We better arbitrate now;
> > > ie one side wins and the other one is silently discarded."

9 Support shared disk semantics  ( for GFS, OCFS etc... )

    All the thoughts in this area imply that the cluster deals
    with split brain situations as discussed in item 6.

  In order to offer a shared disk mode for GFS, we allow both
  nodes to become primary. (This needs to be enabled with the
  config statement net { allow-two-primaries; } )

 Read after write dependencies

  The shared state is available to clusters using protocol C
  and B. It is not usable with protocol A.

  To support the shared state with protocol B, upon a read
  request the node has to check whether a new version of the block
  is in the process of being written. (I.e. search for it on
  active_ee and done_ee; this works because the block is on
  active_ee before the RecvAck is sent.)
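
  The read check above could be sketched roughly as follows. This is
  only an illustration, not the actual DRBD code: struct ee_sketch,
  ee_list_has_block() and read_must_wait() are hypothetical names,
  and the two lists are modelled as plain singly linked lists keyed
  by block number.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an epoch entry: one write
 * received from the peer, identified only by its block number. */
struct ee_sketch {
    unsigned long block;
    struct ee_sketch *next;
};

/* Returns true if the list contains a write to the given block. */
static bool ee_list_has_block(const struct ee_sketch *head,
                              unsigned long block)
{
    for (; head != NULL; head = head->next)
        if (head->block == block)
            return true;
    return false;
}

/* A read must not be served from the local disk while a newer
 * version of the block is still queued on active_ee or done_ee. */
static bool read_must_wait(const struct ee_sketch *active_ee,
                           const struct ee_sketch *done_ee,
                           unsigned long block)
{
    return ee_list_has_block(active_ee, block) ||
           ee_list_has_block(done_ee, block);
}
```

  What the reader does while it has to wait (sleep, retry, or serve
  the data from the pending entry) is left open here.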
  
 Global write order

  The major pitfall is the handling of concurrent writes to the
  same block. (Concurrent writes to the same blocks should not 
  happen, but we have to assume that it is possible that the
  synchronisation methods of our upper layer [i.e. openGFS] 
  may fail.)

  Without further handling, concurrent writes to the same block
  would get written on each node locally first, then sent
  to the peer, where they overwrite the peer's local version.
  In other words, each node would write its local version first
  and then the peer's version of the data, so the two nodes would
  end up with different contents for that block.
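
  To make the problem concrete, here is a tiny model of the
  unarbitrated case. It is a sketch under simplifying assumptions:
  the contested block is a single int, and struct node_sketch and
  concurrent_write_unordered() are made-up names for illustration.

```c
/* Hypothetical model: each node holds the contested block as an int. */
struct node_sketch {
    int disk;   /* current on-disk content of the contested block */
};

/* Unarbitrated concurrent write: each node writes its own data
 * first, then applies the data packet received from the peer. */
static void concurrent_write_unordered(struct node_sketch *a, int data_a,
                                       struct node_sketch *b, int data_b)
{
    a->disk = data_a;   /* local write on node A */
    b->disk = data_b;   /* local write on node B */
    a->disk = data_b;   /* A applies B's data packet */
    b->disk = data_a;   /* B applies A's data packet */
}
```

  After the call, node A holds B's data and node B holds A's data,
  i.e. the two copies of the block have been swapped instead of
  converging.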

  Both nodes need to agree to _one_ order, in which such 
  conflicting writes should be carried out.

  Proposed Solution

  We arbitrarily select one node (e.g. the node that did the first
  accept() in the drbd_connect() function) and mark it with the
  discard-concurrent-write-flag.

  The following algorithm is performed upon reception of a
  data packet:

  1. Do we have a concurrent request? (I.e. do I have a request
     to the same block in my transfer log?) If not -> write now.
  2. Have I already got an ACK packet for the concurrent
     request? (Does the request already have the RQ_DRBD_SENT
     bit set?) If yes -> write the data from the data packet
     afterwards.
  3. Do I have the "discard-concurrent-write-flag"?
     If yes -> discard the data packet and send a discard notify.
     If no -> write the data from the data packet afterwards.
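
  The three steps can be written down as a small decision function.
  Again this is only a sketch: enum recv_action, struct
  conflict_sketch and on_data_packet() are hypothetical names, the
  "acked" field stands in for the RQ_DRBD_SENT bit, and I read
  "afterwards" as "after the conflicting local request".

```c
#include <stdbool.h>

enum recv_action {
    WRITE_NOW,           /* no conflict: write the received data */
    WRITE_AFTER_LOCAL,   /* write received data after the local request */
    DISCARD_AND_NOTIFY   /* drop the packet, send a discard notify */
};

/* Hypothetical summary of what the receiver knows about a
 * conflicting local request. */
struct conflict_sketch {
    bool in_transfer_log;  /* step 1: request to the same block exists */
    bool acked;            /* step 2: models the RQ_DRBD_SENT bit */
};

static enum recv_action on_data_packet(struct conflict_sketch c,
                                       bool have_discard_flag)
{
    /* Step 1: no concurrent request, nothing to arbitrate. */
    if (!c.in_transfer_log)
        return WRITE_NOW;

    /* Step 2: the peer already acked our write, so its data packet
     * is ordered after our request. */
    if (c.acked)
        return WRITE_AFTER_LOCAL;

    /* Step 3: a real conflict; the flagged node's local write wins. */
    return have_discard_flag ? DISCARD_AND_NOTIFY : WRITE_AFTER_LOCAL;
}
```

  With this rule the node holding the flag keeps its own version and
  the other node ends up writing the flagged node's version last, so
  both sides converge on the same data.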

  BTW, each time we have a concurrent write access, we print
  a warning to the syslog, since this indicates that the layer
  above us is broken!

  [ see also GFS-mode-arbitration.pdf for illustration. ]

[-- Attachment #2: GFS-mode-options.pdf --]
[-- Type: application/pdf, Size: 9808 bytes --]
