* Replacing DRBD use with RBD
@ 2010-05-04 23:46 Martin Fick
  2010-05-05  7:30 ` Alex Elsayed
  2010-05-05 20:00 ` Yehuda Sadeh Weinraub
  0 siblings, 2 replies; 8+ messages in thread
From: Martin Fick @ 2010-05-04 23:46 UTC
  To: ceph-devel

Hello,

I have some questions about RADOS, RBD, and the cluster monitor daemons.

1) Is there any chance that the cluster monitor protocol will be enhanced to work practically with only 2 monitor daemons?  I ask since this seems like it would allow a 2-node RBD-based device to effectively replace a DRBD-based device and yet be much more easily expandable to more nodes than DRBD.  Many HA systems (say telco racks) only have two nodes, and it seems silly to miss out on the opportunity to use RBD in those systems.

One suggestion I have for doing this would be to use some of the same techniques that heartbeat uses to determine whether a node has gone down or whether there is instead network segregation: a serial port connection, common ping nodes (such as a router)...

I suspect that if reliable 2 node operation were designed into RBD, it would eventually replace some of the uses of DRBD.


2) Is there any way of preventing two users of an RBD device from using the device concurrently?  Is there some way to create "locks" with RADOS that would die if a node dies?  If so, this would allow an RBD device to be safely mounted with a non-distributed FS such as ext3 exclusively on one of many hosts.  This would open up the use of RBD devices for Linux containers or Linux vservers, which could run on any machine in a cluster (similar to the idea of using it with kvm/qemu).

Thanks, I look forward to playing with RBD and ceph!

-Martin



      
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: Replacing DRBD use with RBD
@ 2010-05-05 20:34 Martin Fick
  2010-05-06  5:10 ` Thomas Mueller
  0 siblings, 1 reply; 8+ messages in thread
From: Martin Fick @ 2010-05-05 20:34 UTC
  To: Yehuda Sadeh Weinraub; +Cc: ceph-devel

--- On Wed, 5/5/10, Yehuda Sadeh Weinraub <yehudasa@gmail.com> wrote:
> The problem is that the ceph monitors require a quorum in
> order to decide on the cluster state. The way the system 
> works right now, a 2-way monitor setup would be less stable 
> than a system with a single monitor since it wouldn't work
> whenever any of the two monitors crashes. 

Right, that is indeed not nice. :)
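
To make the arithmetic behind that concrete, here is a minimal sketch. It assumes the quorum is a strict majority of the configured monitors; the function names are made up for illustration and are not part of ceph.

```python
def quorum_size(num_monitors: int) -> int:
    """Smallest strict majority of num_monitors (assumed quorum rule)."""
    return num_monitors // 2 + 1


def tolerated_failures(num_monitors: int) -> int:
    """How many monitors may be down while a majority is still reachable."""
    return num_monitors - quorum_size(num_monitors)


# Print monitors, required quorum, and tolerated failures for a few sizes.
for n in (1, 2, 3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

With 2 monitors the majority is 2, so neither may fail, and there are now two machines that can fail instead of one. That is why the 2-way setup ends up less stable than a single monitor.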

> A possible workaround would be to have a special case for
> 2-way mon clusters, where it'd require a single mon for
> getting a majority. I'm not sure whether this is actually 
> feasible. As usual, the devil is in the details.

Yes. One simple way is to use a ping node.  If a node can
reach the ping node, but not its peer, it should be able
to assume "lone operation" and thus effectively degrade to
a single monitor situation temporarily. I guess my question
is, "is this something that the ceph project is 
potentially willing to support for OSDs?"
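
The ping node rule described above could be sketched roughly as follows. This only illustrates the decision, under the assumption that each node can test reachability of its peer and of the ping node; `Mode` and `decide_mode` are hypothetical names, not anything that exists in ceph or heartbeat.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal two-monitor operation"
    LONE = "lone operation (acts as a single monitor)"
    STOPPED = "stop serving (this node is probably the isolated one)"


def decide_mode(peer_reachable: bool, ping_node_reachable: bool) -> Mode:
    """Decide how a monitor in a 2-node cluster should behave."""
    if peer_reachable:
        # Both monitors see each other: nothing special to do.
        return Mode.NORMAL
    if ping_node_reachable:
        # Peer is gone but the network looks fine from here:
        # assume the peer is down and degrade to a single monitor.
        return Mode.LONE
    # Neither peer nor ping node is reachable: assume we are cut off.
    return Mode.STOPPED
```

Note that this rule alone does not cover the case where both nodes can reach the ping node but not each other; both would then choose lone operation, so a second channel (such as the serial link) would still be needed to rule that out.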

I suspect that also supporting dynamic reconfiguration:
http://en.wikipedia.org/wiki/Paxos_algorithm#Cheap_Paxos
would also help a great deal to make clusters more
adaptable.


> > One suggestion I have for doing this would be to
> > use some of the same techniques that heartbeat uses to
> > determine whether a node has gone down or if instead there
> > is network segregation: a serial port connection, common
> > ping nodes (such as a router)...

> There is a heartbeat mechanism within the mon cluster, and
> it's being used for the monitors to keep track of their peer
> status. It might be a good idea to add different configurable 
> types of heartbeats.

Yes, specifically, I meant by using some of the techniques
that the heartbeat project uses:

http://www.linux-ha.org/wiki/Heartbeat

Ideally (my suggestion), they would make some of them
available in a library so that other projects like 
RADOS could use them independently without having to 
rewrite them from scratch.



> > 2) Is there any way of preventing two users of an RBD
> > device from using the device concurrently?  ...
> 
> We were just thinking about the proper solution to this
> problem ourselves. There are a few options. One is to 
> add some kinds of locking mechanism to the osd, which
> would allow doing just that. E.g., a client would take 
> a lock, do whatever it needs to do, a second client 
> would try to get the lock but will be able to hold it only
> after the first one has released it. Another option would
> be to have the clients handle the mutual exclusion 
> themselves (hence not enforced by the osd) by setting 
> flags and leases on the rbd header.

I'm curious, do you mean a scheme such as writing the
name of the node "locking" the image along with a 
timestamp regularly to the header as a heartbeat?  
Along with some lock acquisition logic?
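
To illustrate the kind of scheme I have in mind, here is a rough sketch. Everything in it is an assumption for the sake of discussion: `Header` is an in-memory stand-in for the rbd header, the names are invented, and a real implementation would need the header update to be atomic (some form of compare-and-swap) and would have to deal with clock differences between nodes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    owner: str    # name of the node "locking" the image
    stamp: float  # time of the owner's last heartbeat write


class Header:
    """In-memory stand-in for the rbd image header (hypothetical)."""

    def __init__(self) -> None:
        self.lease: Optional[Lease] = None


def try_acquire(header: Header, node: str, now: float, ttl: float) -> bool:
    """Take or renew the lease; succeed if it is free, ours, or expired."""
    lease = header.lease
    if lease is None or lease.owner == node or now - lease.stamp > ttl:
        # Writing our name and a fresh timestamp doubles as the heartbeat.
        header.lease = Lease(node, now)
        return True
    return False


def release(header: Header, node: str) -> None:
    """Drop the lease, but only if this node currently holds it."""
    if header.lease is not None and header.lease.owner == node:
        header.lease = None
```

The owner would call `try_acquire` periodically to refresh its timestamp. If the owner dies, the heartbeat stops, the lease ages past `ttl`, and another node can take over, which is the "lock that dies with the node" behaviour I asked about.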

Thanks for the replies!

-Martin



      

end of thread, other threads:[~2010-05-06  5:10 UTC | newest]

Thread overview: 8+ messages
2010-05-04 23:46 Replacing DRBD use with RBD Martin Fick
2010-05-05  7:30 ` Alex Elsayed
2010-05-05 20:02   ` Alex Elsayed
2010-05-05 20:13     ` Yehuda Sadeh Weinraub
2010-05-05 20:59     ` Martin Fick
2010-05-05 20:00 ` Yehuda Sadeh Weinraub
  -- strict thread matches above, loose matches on Subject: below --
2010-05-05 20:34 Martin Fick
2010-05-06  5:10 ` Thomas Mueller
