From: Philipp Reisner <philipp.reisner@linbit.com>
To: drbd-dev@lists.linbit.com
Subject: Re: [Drbd-dev] How Locking in GFS works...
Date: Mon, 4 Oct 2004 15:26:15 +0200
Message-ID: <200410041526.15189.philipp.reisner@linbit.com>
In-Reply-To: <20041004130158.GP1542@marowsky-bree.de>
On Monday 04 October 2004 15:01, Lars Marowsky-Bree wrote:
> On 2004-10-04T14:56:21, Philipp Reisner <philipp.reisner@linbit.com> wrote:
> > This is intended as food for thought on how we should design our
> > support for shared disk file systems.
>
> I'm still not sure what kind of special support you need. The only
> guarantee you need to provide is that after a barrier all reads on all
> nodes return the same data for those blocks affected by the flush.
>
> The shared disk file system itself will take care of issuing
> appropriate barriers and flushing the OS caches.
>
> Am I missing something? ;-)
>
If everything works (esp. the locking of the shared disk FS), no.
But just consider that the locking of the shared disk FS on
top of us is broken, and that it issues a write request to
the same block number on both nodes.
Then each node would write its own copy first and the peer's
version of the data second to that block number.
=> We would have different data in this block on our
two copies. - And we would not even know about it!
What would have happened on a real shared disk?
The real shared disk would have applied the writes in some order,
and one of the writes would overwrite the other version.
(This is the basic design idea of proposed solution 1.)
(For proposed solution 2, the lock granularity of the
shared disk FS is interesting...)
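To make the failure mode concrete, here is a minimal sketch in Python. It is
only a toy model, not DRBD code: each "disk" is a dict from block number to
data, and the function names are made up for illustration.

```python
# Toy model only: a "disk" is a dict mapping block number -> data.
# All names here are hypothetical and not taken from DRBD.

def mirrored_concurrent_write(block, data_a, data_b):
    """Both nodes write the same block without any coordination.

    Each node applies its own write first and the peer's write second,
    which is the scenario described above.
    """
    disk_a, disk_b = {}, {}
    # The local write lands first on each node.
    disk_a[block] = data_a
    disk_b[block] = data_b
    # The peer's write arrives afterwards and overwrites the local one.
    disk_a[block] = data_b
    disk_b[block] = data_a
    return disk_a, disk_b


def shared_disk_concurrent_write(block, writes_in_arrival_order):
    """A real shared disk applies the writes in some order; the last wins."""
    disk = {}
    for data in writes_in_arrival_order:
        disk[block] = data
    return disk


disk_a, disk_b = mirrored_concurrent_write(7, "from-A", "from-B")
shared = shared_disk_concurrent_write(7, ["from-A", "from-B"])
print(disk_a[7], disk_b[7])  # the two mirrored copies differ
print(shared[7])             # a single, well-defined winner
```

In the mirrored case each copy ends up holding the other node's data, so the
two copies silently diverge. On the shared disk there is only one copy, so
whichever write is applied last simply wins.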
--snip from ROADMAP file--
global write order
As far as I understand the topic up to now we have two options
to establish a global write order.
Proposed Solution 1, using the order of a coordinator node:
Writes from the coordinator node are carried out, as they are
carried out on the primary node in conventional DRBD. ( Write
to disk and send to peer simultaneously. )
Writes from the other node are sent to the coordinator first,
then the coordinator inserts a small "write now" packet into
its stream of write packets.
The other node commits the write to its local IO subsystem as soon
as it gets the "write-now" packet from the coordinator.
Note: With protocol C it does not matter which node is the
coordinator from the performance viewpoint.
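The ordering idea of solution 1 can be sketched as a small Python model.
This is an illustration under assumptions, not the DRBD implementation: the
class and method names are hypothetical, the network is replaced by an
in-memory list, and it is assumed that the coordinator applies the other
node's write to its own disk at the point where it inserts the "write-now"
packet.

```python
# Toy model of "Proposed Solution 1". All names are hypothetical.

class Coordinator:
    def __init__(self):
        self.disk = {}
        self.stream = []  # ordered packets destined for the other node

    def local_write(self, block, data):
        # As on a conventional primary: write locally and send to the peer.
        self.disk[block] = data
        self.stream.append(("data", block, data))

    def write_from_peer(self, req_id, block, data):
        # Assumption: the coordinator applies the peer's write here, then
        # marks its position in the stream with a small write-now packet.
        self.disk[block] = data
        self.stream.append(("write-now", req_id))


class OtherNode:
    def __init__(self, coordinator):
        self.disk = {}
        self.coordinator = coordinator
        self.pending = {}  # writes waiting for their write-now packet
        self.next_id = 0

    def local_write(self, block, data):
        # Do not touch the local disk yet; ask the coordinator first.
        req_id = self.next_id
        self.next_id += 1
        self.pending[req_id] = (block, data)
        self.coordinator.write_from_peer(req_id, block, data)

    def drain(self):
        # Process the coordinator's stream strictly in order.
        for packet in self.coordinator.stream:
            if packet[0] == "data":
                _, block, data = packet
            else:  # "write-now": commit the matching pending write
                block, data = self.pending.pop(packet[1])
            self.disk[block] = data
        self.coordinator.stream.clear()


coord = Coordinator()
other = OtherNode(coord)
coord.local_write(7, "from-coordinator")
other.local_write(7, "from-other")   # same block, written on both nodes
coord.local_write(8, "unrelated")
other.drain()
print(coord.disk == other.disk)
```

Because the other node applies every write in the order given by the
coordinator's stream, both copies end up identical even when both nodes write
the same block.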
Proposed Solution 2, use a dedicated LRU to implement locking:
Each extent in the locking LRU can have one of these states:
  requested
  locked-by-peer
  locked-by-me
  locked-by-me-and-requested-by-peer
We allow application writes only to extents which are in a
locked-by-me* state.
New Packets:
  LockExtent
  LockExtentAck
Configuration directives: dl-extents, dl-extent-size
TODO: Need to verify with GFS that this makes sense.
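The extent states of solution 2 could be modelled roughly as follows. The
four state names and the two packet names come from the text above; the
transition table is purely an assumption made for this sketch, since the
ROADMAP entry does not spell out when an extent changes state.

```python
from enum import Enum


class ExtentState(Enum):
    REQUESTED = "requested"
    LOCKED_BY_PEER = "locked-by-peer"
    LOCKED_BY_ME = "locked-by-me"
    LOCKED_BY_ME_AND_REQUESTED_BY_PEER = "locked-by-me-and-requested-by-peer"


def may_write(state):
    """Application writes are allowed only in the locked-by-me* states."""
    return state.value.startswith("locked-by-me")


# Hypothetical transitions (an assumption, not taken from the ROADMAP):
# the key is (current state, event), the value is the next state.
TRANSITIONS = {
    (ExtentState.LOCKED_BY_PEER, "send LockExtent"):
        ExtentState.REQUESTED,
    (ExtentState.REQUESTED, "recv LockExtentAck"):
        ExtentState.LOCKED_BY_ME,
    (ExtentState.LOCKED_BY_ME, "recv LockExtent"):
        ExtentState.LOCKED_BY_ME_AND_REQUESTED_BY_PEER,
    (ExtentState.LOCKED_BY_ME_AND_REQUESTED_BY_PEER, "send LockExtentAck"):
        ExtentState.LOCKED_BY_PEER,
}


def step(state, event):
    """Return the next state; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


state = ExtentState.LOCKED_BY_PEER
for event in ("send LockExtent", "recv LockExtentAck", "recv LockExtent"):
    state = step(state, event)
    print(event, "->", state.value, "| may write:", may_write(state))
```

In this sketch a node that wants to write to an extent held by the peer first
sends LockExtent and must wait for LockExtentAck before may_write() returns
true for that extent.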
--
: Dipl-Ing Philipp Reisner Tel +43-1-8178292-50 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Schönbrunnerstr 244, 1120 Vienna, Austria http://www.linbit.com :